date:20120310

Re: [R] hierarchical clustering of large dataset

2012-03-10 Thread Hans Ekbrand

On Fri, Mar 09, 2012 at 08:26:01PM -0500, Massimo Di Stefano wrote:
 My target is to have 'groups of species' based on the similarity of their 
 environmental parameters, and build a dendrogram like [2] 
 
 [2] http://massimo-timecapsule.whoi.edu//data/img/manova_clust_matlab.png

 On Mar 9, 2012, at 7:18 PM, Peter Langfelder wrote:
 
  Well, you didn't say that column e was a label that you wanted to keep
  separate. Any other labels in the data? You may not want to use labels
  in the distance calculation.

If you want to use the results of the cluster-analysis as evidence on
similarities and differences between species, you _must_ not include
numeric variables representing labels in the matrix. Including them
would mean imposing the expected result onto the data.

First do the cluster analysis, then test the distribution of species
in clusters.
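
A minimal sketch of that workflow, under stated assumptions: the data frame `env`, its `species` label column and the environmental columns `temp`, `salinity` and `depth` are all made up for illustration, and the data are simulated. The label is kept out of the distance calculation and only compared with the clusters afterwards.

```r
# Hypothetical data: 'species' is the label, the other columns are
# invented environmental parameters (not taken from the original post).
set.seed(1)
env <- data.frame(
  species  = rep(c("sp1", "sp2", "sp3"), each = 10),
  temp     = c(rnorm(10, 5), rnorm(10, 12), rnorm(10, 20)),
  salinity = c(rnorm(10, 30), rnorm(10, 33), rnorm(10, 36)),
  depth    = c(rnorm(10, 100, 10), rnorm(10, 300, 10), rnorm(10, 600, 10))
)

# 1. Cluster on the environmental variables only; the label is left out.
x  <- scale(env[, c("temp", "salinity", "depth")])
hc <- hclust(dist(x), method = "average")
plot(hc, labels = env$species)   # dendrogram

# 2. Afterwards, compare the clusters with the species labels.
cluster <- cutree(hc, k = 3)
tab <- table(env$species, cluster)
print(tab)
```

The choice of `k = 3` and of average linkage is arbitrary here; a formal test on `tab` could follow once the clustering itself is settled.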

-- 
Hans Ekbrand (http://sociologi.cjb.net) h...@sociologi.cjb.net

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Issues in installing rgl in Mac OS 10.6.8

2012-03-10 Thread Prof Brian Ripley


Please ask about OS X on R-sig-mac.

There is something you have not installed on your OS, but it will 
probably take several rounds to find out what (and it will be not just 
Mac-specific but will depend on the exact versions of OS X (which you 
told us) and Xcode (which you did not)).


On Fri, 9 Mar 2012, A Ezhil wrote:


Dear All,

I am trying to install rgl on my mac notebook from the source file. I tried 
using: /usr/bin/R64 CMD INSTALL rgl_0.92.798.tar.gz and get the following
error message:

checking for X... no
configure: error: X11 not found but required, configure aborted.
ERROR: configuration failed for package ‘rgl’
* removing
‘/Library/Frameworks/R.framework/Versions/2.14/Resources/library/rgl’
* restoring previous
‘/Library/Frameworks/R.framework/Versions/2.14/Resources/library/rgl’

I do see a directory X11 installed under /usr, and Sys.getenv("PATH") inside R gives me: 
[1] "/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin"

Could you please help me to install the rgl package?


sessionInfo()

R version 2.14.1 (2011-12-22)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

Thanks in advance.
Kind regards,
Ezhil



--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

Re: [R] round giving different results on Windows and Mac

2012-03-10 Thread Petr Savicky

On Fri, Mar 09, 2012 at 09:34:14PM +0000, Ruth Ripley wrote:
 Dear all,
 
 I have been running some tests of my package RSiena on different 
 platforms and trying to reconcile the results.
 
 With Mac, the commands
 
 options(digits=4)
 round(1.81652, digits=4)
 
 print 1.817
 
 With Windows, the same commands print 1.816
 
 I am not bothered which answer I get, but it would be nice if they were 
 the same. A linux box agreed with the Mac.

Hi.

I obtain the same difference between Linux (1.817) and
32 bit Windows (1.816). As Duncan said, the number 1.8165
is not exactly representable and printing it to 4
significant digits may depend on the platform, since
it is a middle case.

Note that options(digits=4) means rounding to 4 significant
digits, while round(1.81652, digits=4) is rounding to 4
digits in the fractional part. Try signif(1.81652, digits=4)
to get the same type of rounding as in options(digits=4).
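
To make the distinction concrete, here is a small sketch comparing the two functions on values that are not ties, so the results should not depend on the platform. The second pair of numbers is an invented illustration.

```r
x <- 1.81652

round(x, digits = 4)    # 4 digits after the decimal point: 1.8165
signif(x, digits = 4)   # 4 significant digits:            1.817

# The distinction is easier to see on a larger number
round(123.456, digits = 1)    # 123.5
signif(123.456, digits = 2)   # 120
```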

The problem is not in round(), since

  x <- round(1.81652, digits=4)
  print(x, digits=20)
  print(x, digits=4)

yields on Linux

  [1] 1.8165000000000000036
  [1] 1.817

and on 32 bit Windows

  [1] 1.8165000000000000036
  [1] 1.816

The difference is not due to R, since R is responsible
only for the choice of the number of printed digits
and not for the digits themselves. The digits are computed
by sprintf() on the given platform. So, the difference
seems to be there.
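
A hedged way to inspect this on a given machine is to ask sprintf() for more decimal places than a double can carry. The sketch below prints whatever the local C library produces; no expected output is shown because, as described above, it may differ between platforms.

```r
x <- round(1.81652, digits = 4)

# Request many more decimal places than the 15-17 significant digits a
# double carries; the extra digits come from the platform's sprintf().
s <- sprintf("%.25f", x)
print(s)

# The same value formatted to three decimals, where the tie matters
print(sprintf("%5.3f", x))
```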

The command

  sprintf("%5.3f", 18165/10000)

yields on Linux

  [1] "1.817"

and on 32 bit Windows 

  [1] "1.816"

Thank you for the example.

Petr Savicky.


[R] Treat Variable as String and a String as variables name

2012-03-10 Thread Alaios

Dear all.
I have ten variables (let's call four of them Alpha, Beta, Gamma and
Delta).

For each variable I have to produce around 100 plots.

So far I have been copying and pasting the code below many times:

pdf(file="DC_Alpha_All.pdf", width=15) # the variable name is used as a string
plot_dc_for_multiple_kapas(Alpha, 4, c(5, 4), coloridx=c(24, 32)) # the variable itself is passed to the function
dev.off()

I could save time if I had a function that produces the required plots
for every variable. The problem, as the comments above show, is that the
variable has to be treated as a string in the first line and as an actual
variable in the second.

How can I write a loop in R that, for a list of variables (the 10
variables mentioned at the beginning), treats each entry once as a string
and once as a real variable?

Could you please help me with that?

Best Regards
Alex

Re: [R] Issues in installing rgl in Mac OS 10.6.8

2012-03-10 Thread peter dalgaard


On Mar 10, 2012, at 09:57 , Prof Brian Ripley wrote:

 Please ask about OS X on R-sig-mac .

Yep. (Or R-devel for generic developer issues, but this one is pretty OSX 
specific.)

 There is something you have not installed on your OS, but it will probably 
 need several rounds to find what (and it will be not just Mac-specific but 
 depend on the exact versions of OS X (which you told us) and Xcode (which you 
 did not)).

It is certainly non-trivial to install this particular package from source. 

Is there any reason you don't want to use the precompiled version from CRAN? I 
mean, it is all well and good that more people do source builds so that we 
don't end up with a situation where only one or two persons actually know how 
to build stuff, but it might not be the most productive route if you actually 
need to get things done...

 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com


Re: [R] Treat Variable as String and a String as variables name

2012-03-10 Thread Berend Hasselman


On 10-03-2012, at 10:39, Alaios wrote:

 Dear all,
 I have ten variables (let's call four of them Alpha, Beta, Gamma and
 Delta).
 
 For each variable I have to produce around 100 plots.
 
 So far I have been copying and pasting the code below many times:
 
 pdf(file="DC_Alpha_All.pdf", width=15) # the variable name is used as a string
 plot_dc_for_multiple_kapas(Alpha, 4, c(5, 4), coloridx=c(24, 32)) # the variable itself is passed to the function
 dev.off()
 
 I could save time if I had a function that produces the required plots
 for every variable. The problem, as the comments above show, is that the
 variable has to be treated as a string in the first line and as an actual
 variable in the second.
 
 How can I write a loop in R that, for a list of variables (the 10
 variables mentioned at the beginning), treats each entry once as a string
 and once as a real variable?

Something like this:

varlist <- LETTERS[1:10]
varlist

# create ten example variables named A, B, ..., J
for( k in 1:length(varlist) ) assign(varlist[k], runif(10))
varlist

myplot <- function(x,k) plot(x,col=k)
for( k in 1:length(varlist) ) {
    varname <- varlist[k]
    filename <- paste("DC_", varname, "_All.pdf", sep="")
    pdf(file=filename, width=15)
    myplot(get(varlist[k]),k)   # get() looks the variable up by its name
    dev.off()
}

Berend
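
An alternative sketch, not from the thread: keeping the objects in a named list gives both the name (as a string) and the value without assign() or get(). The list `vars`, the stand-in `myplot()` and the use of tempdir() are assumptions made for illustration only.

```r
# Hypothetical stand-ins for the real data objects (Alpha, Beta, ...)
vars <- list(Alpha = runif(10), Beta = runif(10),
             Gamma = runif(10), Delta = runif(10))

# Stand-in for the poster's plot_dc_for_multiple_kapas()
myplot <- function(x, k) plot(x, col = k)

outdir <- tempdir()   # write the PDFs somewhere harmless
files <- character(0)
for (k in seq_along(vars)) {
  varname  <- names(vars)[k]                       # the name as a string
  filename <- file.path(outdir, paste0("DC_", varname, "_All.pdf"))
  pdf(file = filename, width = 15)
  myplot(vars[[varname]], k)                       # the value itself
  dev.off()
  files <- c(files, filename)
}
basename(files)
```

With a list there is no risk of get() picking up an unrelated object of the same name from another environment.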


Re: [R] help please. 2 tables, which test?

2012-03-10 Thread aoife doherty

Thank you for the replies.
So what my test wants to do is this:

I have a big matrix, 30 rows (students in a class) X 50 columns (students
grades for the year).
An example of the matrix is as such:


grade1   grade2grade3 .  grade 50
student 1
student 2***
student 3
student 4***
student 5***
student 6
.
.
.
.
.
student 30***

As you can see, four students (students 2, 4, 5 and 30) have stars beside
their names. I have chosen these students based on a particular
characteristic that they all share. I then pulled these students out to make
a new table:

grade1  grade2 grade3 ... grade 50

student 2
student 4
student 5
student 30


What I want to see is, basically, whether there is any difference between the
grades this particular set of students (i.e. students 2, 4, 5 and 30) got and
those of the class as a whole.

So my null hypothesis is that there is no difference between this set of
students' grades and what you would expect from the class as a whole.

Aaral






On Sat, Mar 10, 2012 at 12:18 AM, Greg Snow 538...@gmail.com wrote:

 Just what null hypothesis are you trying to test or what question are
 you trying to answer by comparing 2 matrices of different size?

 I think you need to figure out what your real question is before
 worrying about which test might work on it.

 Trying to get your data to fit a given test rather than finding the
 appropriate test or other procedure to answer your question is like
 buying a new suit then having plastic surgery to make you fit the suit
 rather than having the tailor modify the suit to fit you.

 If you can give us more information about what your question is we
 have a better chance of actually helping you.

 On Fri, Mar 9, 2012 at 9:46 AM, aoife doherty aaral.si...@gmail.com
 wrote:
 
  Thank you. Can the chi-squared test compare two matrices that are not the
  same size, eg if matrix 1 is a 2 X 4 table, and matrix 2 is a 3 X 5
 matrix?
 
 
 
  On Fri, Mar 9, 2012 at 4:37 PM, Greg Snow 538...@gmail.com wrote:
 
  The chi-squared test is one option (and seems reasonable to me if it
  is the proportions/patterns that you want to test).  One way to do
  the test is to combine your 2 matrices into a 3 dimensional array (the
  abind package may help here) and test using the loglin function.
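
A rough sketch of that suggestion, using base array() rather than abind. The counts in `m1` and `m2` are invented, and both matrices are assumed to have the same dimensions. The model with margins (1,2) and 3 says that the row-by-column pattern is the same in both tables, so a small p-value would point to a difference in pattern.

```r
# Hypothetical counts; these are not the tables from the original post.
m1 <- matrix(c(10, 12, 15,
               20, 25, 18,
               11, 14, 16,
               22, 19, 17), nrow = 4, byrow = TRUE)
m2 <- matrix(c(14, 10, 13,
               40, 22, 35,
               12, 13, 14,
               19, 21, 27), nrow = 4, byrow = TRUE)

# Stack the two 4 x 3 matrices into a 4 x 3 x 2 array
tab <- array(c(m1, m2), dim = c(4, 3, 2))

# Fit the log-linear model [row x col][table]
fit <- loglin(tab, margin = list(c(1, 2), 3), print = FALSE)

# Likelihood-ratio test against the saturated model
p <- pchisq(fit$lrt, df = fit$df, lower.tail = FALSE)
c(lrt = fit$lrt, df = fit$df, p.value = p)
```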
 
  On Thu, Mar 8, 2012 at 5:46 AM, aaral singh aaral.si...@gmail.com
 wrote:
   Hi.Please help if someone can.
  
   Problem:
   I have 2 matrices
  
   Eg
  
   matrix 1:
  Freq  None  Some
Heavy32  5
Never8   13 8
Occas14  4
Regul 95 7
  
   matrix 2:
Freq None Some
Heavy7  1 3
Never  87 18  84
Occas  12   34
Regul917
  
  
   I want to see if matrix 1 is significantly different from matrix 2. I
   consider using a chi-squared test. Is this appropriate?
   Could anyone advise?
   Many thank you.
   Aaral Singh
  
 
 
 
  --
  Gregory (Greg) L. Snow Ph.D.
  538...@gmail.com
 
 
 



 --
 Gregory (Greg) L. Snow Ph.D.
 538...@gmail.com




Re: [R] Issues in installing rgl in Mac OS 10.6.8

2012-03-10 Thread Hans Ekbrand

On Fri, Mar 09, 2012 at 04:52:31PM -0800, A Ezhil wrote:
 Dear All,
 
 I am trying to install rgl on my mac notebook from the source file. I tried 
 using: /usr/bin/R64 CMD INSTALL rgl_0.92.798.tar.gz and get the following
 error message:
 
 checking for X... no
 configure: error: X11 not found but required, configure aborted.
 ERROR: configuration failed for package ‘rgl’
 * removing
 ‘/Library/Frameworks/R.framework/Versions/2.14/Resources/library/rgl’
 * restoring previous
 ‘/Library/Frameworks/R.framework/Versions/2.14/Resources/library/rgl’
 
 I do see a directory X11 installed under /usr, and Sys.getenv("PATH") inside R 
 gives me: [1] "/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin"
 
 Could you please help me to install the rgl package?

Not really, but I can offer a hint: I think your system has the
_runtime_ libraries for X11 (in /usr/X11), but you need _development_
libraries to compile rgl.

I have no knowledge about Mac OS, but in my system, Debian GNU/Linux,
the needed libraries to build rgl from source are:

libgl1-mesa-dev
libglu1-mesa-dev
mesa-common-dev

-- 
Hans Ekbrand (http://sociologi.cjb.net) h...@sociologi.cjb.net


Re: [R] Issues in installing rgl in Mac OS 10.6.8

2012-03-10 Thread Berend Hasselman


On 10-03-2012, at 12:49, Hans Ekbrand wrote:

 Not really, but I can offer a hint: I think your system has the
 _runtime_ libraries for X11 (in /usr/X11), but you need _development_
 libraries to compile rgl.
 
 I have no knowledge about Mac OS, but in my system, Debian GNU/Linux,
 the needed libraries to build rgl from source are:
 
 libgl1-mesa-dev
 libglu1-mesa-dev
 mesa-common-dev

That's for Linux systems, not for Mac OS X.

One of the many possibilities is that your X11 has been corrupted/trashed.
You could try reinstalling X11 (from the install disc).

Berend


[R] Window on a vector

2012-03-10 Thread Alaios

Dear all,
I have a large vector (let's call it myVector) and I want to plot its values
with the logic below:

yaxis <- myVector[1]
yaxis <- c(yaxis, mean(myVector[2:3]))
yaxis <- c(yaxis, mean(myVector[4:8]))
yaxis <- c(yaxis, mean(myVector[9:16]))
yaxis <- c(yaxis, mean(myVector[17:32]))

This has to stop when the next step, e.g.
yaxis <- c(yaxis, mean(myVector[1024:2048])), would not find the
corresponding number of elements; otherwise it will stop with an error.


How can I do something like that in R?

I would like to thank you in advance for your help

B.R
Alex

Re: [R] problem with effects : 'subscript out of bounds'

2012-03-10 Thread John Fox

Dear Nicole,

Sorry, I didn't notice the earlier messages in this thread.

Please see below.

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Nicole Marie Ford
 Sent: March-10-12 1:20 AM
 To: r-help
 Subject: Re: [R] problem with effects : 'subscript out of bounds'
 
 if that is  not specific (or not general) enough:
 
 newDV <- dat$DV  ## newDV is my DV; it is continuous.
 newDV <- as.numeric(newDV)-5
 str(newDV)
 
 
 (I had to do a great deal of coding here, so I am snipping down to the
 end)
 
 
 tmp[which(dat$v1 == "stuff" & dat$v2 == "more stuff")] <- "lots of stuff"
 tmp <- factor(tmp, levels=c("la", "la la", "fa la la"))
 dat$v3 <- tmp
 newIV <- as.factor(dat$v3)   ## newIV is my IV, a factor as you can see.
 
 n.var4 <- dat$v4 ## control
 
 n.var5 <- dat$v5  ## control  (there are others but they were coded the
                   ## same)
 
 
 n.mod1 <- lm(newDV ~ newIV + v4 + v5 + v6 + v7 + v8 + v9, data=dat)
 ### linear model.  all of these variables already are specific to the
 ### dataset which i called 'norway' so there is no need to specify in the
 ### model.
 
 summary(n.mod1)
 
  plot(effect("newIV", n.mod1), multiline=T)
 Error in plot(effect("nor.trust", n.mod1), multiline = T) :
   error in evaluating the argument 'x' in selecting a method for
 function 'plot': Error in apply(mod.matrix[, components], 1, prod) :
   subscript out of bounds

This seems very odd: The command given is

  plot(effect("newIV", n.mod1), multiline=T)

but the error message is apparently for 

  plot(effect("nor.trust", n.mod1), multiline = T)

and nor.trust isn't a variable in the model. What command did you execute?

Although it's not relevant to the error, there's no point in setting
multiline=TRUE for a model without interactions.
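
One quick check that needs only base R is to list the terms the fitted model actually contains and see whether the name passed to effect() is among them. The data below are simulated and the variable names merely mimic those in the quoted code; this is a sketch of the check, not a diagnosis of the reported error.

```r
# Hypothetical data standing in for the poster's 'dat'
set.seed(42)
dat <- data.frame(
  newDV = rnorm(60),
  newIV = factor(sample(c("la", "la la", "fa la la"), 60, replace = TRUE)),
  v4    = rnorm(60)
)
n.mod1 <- lm(newDV ~ newIV + v4, data = dat)

# Terms the model actually contains
model_terms <- attr(terms(n.mod1), "term.labels")
print(model_terms)

"newIV" %in% model_terms       # TRUE: a term of this model
"nor.trust" %in% model_terms   # FALSE: not a term of this model
```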

Best,
 John


John Fox
Senator William McMaster
  Professor of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox


 
 
 ~~this ran perfectly on my previous dataset so i am unsure of the
 issue.  thanks in advance.
 

Re: [R] Window on a vector

2012-03-10 Thread David Winsemius



On Mar 10, 2012, at 7:44 AM, Alaios wrote:


Dear all,
I have a large vector (let's call it myVector) and I want to plot its
values with the logic below:


yaxis <- myVector[1]
yaxis <- c(yaxis, mean(myVector[2:3]))
yaxis <- c(yaxis, mean(myVector[4:8]))
yaxis <- c(yaxis, mean(myVector[9:16]))
yaxis <- c(yaxis, mean(myVector[17:32]))

This has to stop when the next step, e.g.
yaxis <- c(yaxis, mean(myVector[1024:2048])), would not find the
corresponding number of elements; otherwise it will stop with an
error.



How can I do something like that in R?


This will generate two series that are somewhat like your index  
specification. I say "somewhat" because you appear to have changed the  
indexing strategy in the middle. You start with 2^0, 2^1 and 2^2 as  
you begin but then switch to 2^3+1 and 2^4+1.


n=20
cbind(2^(0:(n-1)), 2^(1:n)-1)

You can decide what to use for n with logic like:

which.max(20 <= 2^(1:10) )

Then you can use sapply or mapply.
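
A minimal sketch of the mapply step, assuming the windows 1, 2:3, 4:7, 8:15, ... given by the two series above and a non-empty vector. The data in `myVector` are invented; only windows that fit completely inside the vector are used.

```r
myVector <- seq_len(20)          # hypothetical data

# Candidate window ends 1, 3, 7, 15, ...; keep only those that fit
ends   <- 2^(1:30) - 1
n      <- sum(ends <= length(myVector))
starts <- 2^(0:(n - 1))
ends   <- ends[seq_len(n)]

# Mean of each window, computed pairwise over (start, end)
yaxis <- mapply(function(from, to) mean(myVector[from:to]), starts, ends)
yaxis
plot(yaxis, type = "b")
```

For a vector of length 20 this keeps four windows, the last one being 8:15; the remaining elements 16:20 do not fill a complete window and are left out.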



Alex
[[alternative HTML version deleted]]


Please learn to post in plain text.

--

David Winsemius, MD
West Hartford, CT


Re: [R] rgl: cylinder3d() with elliptical cross-section

2012-03-10 Thread Duncan Murdoch

My first reply to this went privately, by accident.  I've done a little 
editing to it, but mainly this is for the archives.


On 12-03-09 2:36 PM, Michael Friendly wrote:

For a paper dealing with generalized ellipsoids, I want to illustrate in
3D an ellipsoid that is unbounded
in one dimension, having the shape of an infinite cylinder along, say,
z, but whose cross-section in (x,y)
is an ellipse, say, given by the 2x2 matrix cov(x,y).

I've looked at rgl:::cylinder3d, but don't see any way to make it
accomplish this.  Does anyone have
any ideas?


rgl has no way to display curved surfaces that are unbounded.  (It has 
lines and planes that adapt to the viewport.)  So you would need to make 
a finite cylinder, and it will be up to you to choose how big to make it.


The cylinder3d() function can do that, but it's not very good at 
cylinders that are straight. (This is a little embarrassing...) It sets 
up a local coordinate system based on the curvature, but if there are no 
curves, it fails, and you have to supply your own coordinates.


So here's how I would do what you want:

library(rgl)

center <- cbind(0, 0, 1:10)  # cylinder centered on points (0, 0, z)
e2 <- cbind(1, 0, rep(0, 10)) # define the normal vectors
cyl <- cylinder3d(center, e2=e2)

# Now you have an octagonal cylinder.  Use the sides arg to cylinder3d
# if it doesn't end up smooth enough, but in most cases I've seen, 8
# is sufficient.

# Define a transformation to the x and y coordinates to give the
# elliptical shape; use it as the
# top left 2x2 matrix of a 3x3 matrix
xfrm <- matrix( c(2, 1, 0,
                  1, 3, 0,
                  0, 0, 1), 3,3, byrow=TRUE)
cyl <- transform3d(cyl, xfrm)
cyl <- addNormals(cyl)  # this makes it shade smoothly
shade3d(cyl, col="green")
decorate3d()  # show some axes for scale

Duncan Murdoch
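
A possible way to obtain the top-left 2x2 block from a covariance matrix, sketched in base R only: the symmetric square root of S maps the unit circle onto the ellipse x' S^-1 x = 1. The matrix `S` is invented, the unit level of the ellipse is an assumption, and how the resulting 3x3 matrix should be passed to rgl has not been checked here.

```r
S <- matrix(c(4, 1,
              1, 2), 2, 2, byrow = TRUE)   # hypothetical cov(x, y)

# Symmetric square root of S via its eigendecomposition
e <- eigen(S, symmetric = TRUE)
A <- e$vectors %*% diag(sqrt(e$values)) %*% t(e$vectors)

# Embed A as the top-left 2x2 block of a 3x3 matrix (z is left unchanged)
xfrm <- diag(3)
xfrm[1:2, 1:2] <- A

# Check: points of the unit circle are mapped onto the ellipse x' S^-1 x = 1
theta  <- seq(0, 2 * pi, length.out = 50)
circle <- rbind(cos(theta), sin(theta))
ell    <- A %*% circle
q <- colSums(ell * solve(S, ell))
range(q)
```

Because the square root is symmetric, the block is unaffected by whether points are treated as row or column vectors.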


Re: [R] Treat Variable as String and a String as variables name

2012-03-10 Thread Alaios

Thanks a lot, it works great :)
Alex


Re: [R] round giving different results on Windows and Mac

2012-03-10 Thread Ruth Ripley


Petr,

Many thanks for this detailed explanation. It seems that the printing is 
going to vary because it is not done by R. I will try alternative 
numbers of significant digits: I had set options(digits=4) in an attempt 
to avoid inter-platform printing differences, without really 
understanding what was causing them.


Ruth





--
Ruth M. Ripley, Email:r...@stats.ox.ac.uk
Dept. of Statistics,http://www.stats.ox.ac.uk/~ruth/
University of Oxford,   Tel:   01865 282857
1 South Parks Road, Oxford OX1 3TG, UK  Fax:   01865 272595


Re: [R] round giving different results on Windows and Mac

2012-03-10 Thread Ruth Ripley


Duncan,

Thanks for your reply: given Petr's response, it seems the problem is 
the interpretation by the printing code, not the actual representation 
of the number. Given the representation, 1.817 would be correct, unless 
at the working accuracy it is considered to be equal to 1.8165 (as 
indeed I asked it to be).


I will have to find a workaround, or live with two sets of test results.

Ruth

On 10/03/2012 00:48, Duncan Murdoch wrote:

On 12-03-09 4:34 PM, Ruth Ripley wrote:

Dear all,

I have been running some tests of my package RSiena on different
platforms and trying to reconcile the results.

With Mac, the commands

options(digits=4)
round(1.81652, digits=4)

print 1.817


The value you're printing is 1.8165, so I believe Windows gets it right
using our round-to-even rule, but I'm not surprised that there are
differences. The value 1.8165 isn't exactly representable, so it's
somewhat random whether a system chooses to represent it slightly larger
or slightly smaller.

Duncan Murdoch




With Windows, the same commands print 1.816

I am not bothered which answer I get, but it would be nice if they were
the same. A linux box agreed with the Mac.

Mac sessionInfo():
R version 2.14.2 (2012-02-29)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] RSiena_1.0.12.205

loaded via a namespace (and not attached):
[1] grid_2.14.2 lattice_0.20-0 Matrix_1.0-4 tools_2.14.2

Windows (but 2.14.1patched was the same) sessionInfo():
R version 2.15.0 alpha (2012-03-08 r58640)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_United Kingdom.1252
[2] LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

Any enlightenment would be gratefully received.

Ruth




--
Ruth M. Ripley, Email:r...@stats.ox.ac.uk
Dept. of Statistics,http://www.stats.ox.ac.uk/~ruth/
University of Oxford,   Tel:   01865 282857
1 South Parks Road, Oxford OX1 3TG, UK  Fax:   01865 272595


Re: [R] round giving different results on Windows and Mac

2012-03-10 Thread Petr Savicky

Hello Ruth:

 Many thanks for this detailed explanation. It seems that the printing is 
 going to vary because it is not done by R. I will try alternative 
 numbers of significant digits: I had set options(digits=4) in an attempt 
 to avoid inter-platform printing differences, without really 
 understanding what was causing them.

If you round the numbers by, say, signif(x, digits=4) before printing
and print with at least 4 digits precision, then the output should not
depend on the printing function, but on signif(), since in this case,
the printing function does not get middle cases.

Function signif() can also have platform dependence, but I think it
should be rare. Please send examples of platform dependencies in signif() if
you find any.

Petr.
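
A minimal sketch of that suggestion, assuming the value from earlier in the thread (the variable names are illustrative): round first with signif(), then print with at least as many digits, so the printing code never has to decide a halfway case.

```r
x <- 1.81652

# Reduce to 4 significant digits before printing.
x4 <- signif(x, digits = 4)

# Format with more than 4 significant digits; the extra digits are not
# needed, so the result should not hinge on how the printing code rounds.
out <- format(x4, digits = 7)
out
```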


Re: [R] index values of one matrix to another of a different size

2012-03-10 Thread Ben quant

Thanks for the info. Unfortunately it's a little bit slower in an
apples-to-apples test using my big data. Mine: 0.28 seconds. Yours: 0.73 seconds.
Not a big deal, but significant when I have to do this 300 to 500 times.

regards,

ben

On Fri, Mar 9, 2012 at 1:23 PM, Rui Barradas rui1...@sapo.pt wrote:

 Hello,

 I don't know if it's the fastest but it's more natural to have an index
 matrix with two columns only,
 one for each coordinate. And it's fast.

 fun <- function(valdata, inxdata){
     nr <- nrow(inxdata)
     nc <- ncol(inxdata)
     mat <- matrix(NA, nrow=nr*nc, ncol=2)
     i1 <- 1
     i2 <- nr
     for(j in 1:nc){
         mat[i1:i2, 1] <- inxdata[, j]
         mat[i1:i2, 2] <- rep(j, nr)
         i1 <- i1 + nr
         i2 <- i2 + nr
     }
     matrix(valdata[mat], ncol=nc)
 }

 fun(vals, indx)

 Rui Barradas



[R] Paste ignore arrayys

2012-03-10 Thread Alaios

Dear all,
I am using paste to create a file name:
filename <- paste("GPS_", TimeStamps, sep="")
where TimeStamps is a character vector of two elements.

The problem is that paste returns two strings instead of one, one for each
entry of the TimeStamps vector. Would it be possible to join the TimeStamps
entries with "_" into a single string? That would then create

GPS_18:00_19:00

which is what I want.

I would like to thank you in advance for your help.

B.R
Alex


Re: [R] Paste ignore arrayys

2012-03-10 Thread R. Michael Weylandt

Read ?paste and use something like

paste("GPS_", paste(TimeStamps, collapse = "_"), sep = "")

Michael
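
For concreteness, here is a small self-contained sketch, assuming TimeStamps holds the two times from the desired output (everything else is illustrative). The inner paste() collapses the vector into one string, and the outer call adds the prefix.

```r
# Hypothetical input matching the desired result in the question.
TimeStamps <- c("18:00", "19:00")

# collapse= joins the elements of one vector into a single string.
joined <- paste(TimeStamps, collapse = "_")

# sep= controls what goes between the arguments of this call.
filename <- paste("GPS_", joined, sep = "")
filename
```

Note that a colon is not a valid character in Windows file names, so the times may need cleaning (for example with gsub(":", "", TimeStamps)) if the string is used as a file name there.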


[R] max.print

2012-03-10 Thread sybil kennelly

Dear all.

I wanted to read a 20,000 row x 60 column matrix (called table) into R.

I did this:

R
table <- read.table("table", header=TRUE)
table

It prints out the start of my table (~1 rows by 7 columns) and then
this error:


 [ reached getOption("max.print") -- omitted 5465 rows ]]
There were 50 or more warnings (use warnings() to see the first 50)

I have tried:

options(max.print = Inf)

and options(max.print = 9)

but I still get the same error. I have seen many people on R-help have
this problem. However, the solution of options(max.print = Inf) does
not seem to work for me.


Any ideas?


Syb


[R] Use different panel functions with lattice

2012-03-10 Thread Balaitous

Hi,

I have a data.frame df with
names(df) = c("Var1", "Var2", "Var3", "Var4")

and I plot the data with

xyplot(Var1+Var2~Var3|Var4, data=df)

I want to use different panel functions for Var1 and Var2.
How can I do this?

Something like:

panel.mypanel = function(x, y, ...) {
  if (Var1) panel.Var1Panel(x, y, ...)
  else panel.Var2Panel(x, y, ...)
}
xyplot(Var1+Var2~Var3|Var4, data=df, panel=panel.mypanel)

(I have searched with Google, but I found nothing.)

Thanks


Re: [R] max.print

2012-03-10 Thread aoife doherty

Hey, I have a similarly sized dataset and ran into the same problem, but I
found that this command works fine to fix it:

options(max.print=100)



[R] Help on subgraphs in xyplot of lattice library

2012-03-10 Thread Chee Chen

Dear All,
I would like to ask how to do overlay plots in each subgraph of xyplot.
1. I ran simulations with m=1000, 2500, 5000, 1 as the sample sizes.
2. For each sample size m, 4 graphs are generated; each graph contains
overlaid comparisons between 4 methods.
3. Now I want to put them into a 4-by-4 plot using xyplot, i.e., 4 sample size
values, each of which has 4 plots.

I know how to do this using plot, but the spaces between the subplots are big.

I do not know how to make each subplot in xyplot an overlaid one, as it would
appear using plot.

Any help would be appreciated!

Thank you,
Chee



Re: [R] Help on subgraphs in xyplot of lattice library

2012-03-10 Thread R. Michael Weylandt

What does your data look like? dput() is your friend.

Also, it'd be helpful if you could give base graphics code for
more-or-less what you are looking for (since you can do so already) as
it's pretty hard to describe graphics without pictures.

Running example(xyplot) might help you get started as well.

Michael


Re: [R] index values of one matrix to another of a different size

2012-03-10 Thread Joshua Wiley

Hi Ben,

It seems likely that there are bigger bottlenecks in your overall
program/use---have you tried Rprof() to find where things really get
slowed down?  In any case, f2() below takes about 70% of the time of
your function on your test data, and 55-65% of the time on a bigger
example I constructed.  Rui's function benefits substantially from
byte compiling, but is still slower.  As a side benefit, f2() seems to
use less memory than your current implementation.

Cheers,

Josh

%%
## sample data ##
vals <- matrix(LETTERS[1:9], nrow = 3, ncol = 3,
  dimnames = list(c('row1','row2','row3'), c('col1','col2','col3')))

indx <- matrix(c(1,1,3,3,2,2,2,3,1,2,2,1), nrow=4, ncol=3)
storage.mode(indx) <- "integer"


## add a per-column offset to the row indices and index x as a vector;
## there is one offset per column of x, hence 0:(dx[2L] - 1L)
f <- function(x, i, di = dim(i), dx = dim(x)) {
  out <- x[c(i + matrix(0:(dx[2L] - 1L) * dx[1L], nrow = di[1L],
                        ncol = di[2L], byrow = TRUE))]
  dim(out) <- di
  return(out)
}


## Rui's version: build a two-column (row, column) index matrix
fun <- function(valdata, inxdata){
    nr <- nrow(inxdata)
    nc <- ncol(inxdata)
    mat <- matrix(NA, nrow=nr*nc, ncol=2)
    i1 <- 1
    i2 <- nr
    for(j in 1:nc){
        mat[i1:i2, 1] <- inxdata[, j]
        mat[i1:i2, 2] <- rep(j, nr)
        i1 <- i1 + nr
        i2 <- i2 + nr
    }
    matrix(valdata[mat], ncol=nc)
}

require(compiler)
f2 <- cmpfun(f)
fun2 <- cmpfun(fun)

system.time(for (i in 1:1) f(vals, indx))
system.time(for (i in 1:1) f2(vals, indx))
system.time(for (i in 1:1) fun(vals, indx))
system.time(for (i in 1:1) fun2(vals, indx))
system.time(for (i in 1:1)
  matrix(vals[cbind(c(indx),rep(1:ncol(indx),each=nrow(indx)))],nrow=nrow(indx),ncol=ncol(indx)))

## now let's make a bigger test set
set.seed(1)
vals2 <- matrix(sample(LETTERS, 10^7, TRUE), nrow = 10^4)
indx2 <- sapply(1:ncol(vals2), FUN = function(x) sample(10^4, 10^3, TRUE))

dim(vals2)
dim(indx2)

## the best contenders from round 1
gold <- matrix(vals2[cbind(c(indx2),rep(1:ncol(indx2),each=nrow(indx2)))],nrow=nrow(indx2),ncol=ncol(indx2))
test1 <- f2(vals2, indx2)
all.equal(gold, test1)

system.time(for (i in 1:20) f2(vals2, indx2))
system.time(for (i in 1:20)
  matrix(vals2[cbind(c(indx2),rep(1:ncol(indx2),each=nrow(indx2)))],nrow=nrow(indx2),ncol=ncol(indx2)))

%%
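
To make the underlying idea easier to check by eye, here is a stripped-down sketch of the two-column matrix indexing that the one-liner in the timings relies on. It reuses the small sample data from above without dimnames; the names ij and res are made up for illustration.

```r
vals <- matrix(LETTERS[1:9], nrow = 3, ncol = 3)
indx <- matrix(c(1,1,3,3,2,2,2,3,1,2,2,1), nrow = 4, ncol = 3)

# One (row, column) pair per cell of indx: the row comes from indx itself,
# the column is the column of indx that the cell sits in.
ij <- cbind(c(indx), rep(seq_len(ncol(indx)), each = nrow(indx)))

# Indexing a matrix by a two-column matrix picks one element per pair.
res <- matrix(vals[ij], nrow = nrow(indx))
res
```

Each column of res then holds the entries of the matching column of vals, picked in the order given by indx.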




-- 
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/


Re: [R] How do I do a pretty scatter plot using ggplot2?

2012-03-10 Thread Michael

Thanks Josh!

How do I make it show the 50% quantile (the median) in each bin instead of the mean?

Thanks a lot!

On Fri, Mar 9, 2012 at 9:11 PM, Joshua Wiley jwiley.ps...@gmail.com wrote:

 Hmm, smooth the chart makes me think you are trying to find the trends:


 require(ggplot2)
 ggplot(mtcars, aes(mpg, hp)) +
  geom_point() +
  stat_smooth()

 Try it out and see what you think---it adds a locally smoothed line
 that does something like trace the means (that is a very over
 simplification, but the gist of it).

 Cheers,

 Josh

 On Fri, Mar 9, 2012 at 7:00 PM, Michael comtech@gmail.com wrote:
  The origin of this problem was that a plain scatter plot with too many
  points with high dispersion generated too many points flying all over
  places.
 
  We are trying to smooth the charts a bit...
 
  Any good recommendations?
 
  Thanks a lot!
 
  On Fri, Mar 9, 2012 at 8:59 PM, Michael comtech@gmail.com wrote:
 
  Sorry for the confusion, Michael.

  I myself am trying to figure out what my boss is requesting:

  I am certain that I need to plot the quantiles of each bin. ...

  But how are the quantiles plotted? Shall I specify the 50% quantile, etc.?

  Being a diligent guy, I am trying hard to do some homework and figure it
  out myself...

  I thought there was a standard statistical procedure that everybody knows...

  Any more thoughts?

  Thanks a lot!
 
 
  On Fri, Mar 9, 2012 at 8:51 PM, R. Michael Weylandt 
  michael.weyla...@gmail.com wrote:
 
  On Fri, Mar 9, 2012 at 9:28 PM, Michael comtech@gmail.com wrote:
   Thanks a lot Mike!
  
 
  Michael if you don't mind. (Though admittedly it leads to some degree
  of confusion in a conversation like this)
 
   Could you please explain your code a bit?
 
  Which part?
 
  
   My imagination is that for each bin, I am plotting a line which is the
   quantile of the y-values in that bin?

  Oh, so you want a qqnorm()-esque line? How is that like a scatterplot?

  Yes, that's something else entirely (and not clear from your first
  post -- to my ear the quantile is a statistic tied to the [e]cdf).
  This is actually much easier in ggplot (and certainly doable in base
  as well).

  Try this:

  # Not so volatile this time
  DAT <- data.frame(x = runif(1000, 0, 20), y = rnorm(1000))
  DAT$xbin <- with(DAT, cut(x, seq(0, 20, 5)))

  library(ggplot2)
  p <- ggplot(DAT) + facet_wrap( ~ xbin) + stat_qq(aes(sample = y))

  print(p)

  If this isn't what you want, please spend some time to show an example
  of the sort of graph you desire (it can be a bit of code or a link to
  a picture or even a hand sketch hosted somewhere online).

  Out on a limb, I think you might really be thinking of something more
  like this:

  p <- ggplot(DAT) + facet_wrap( ~ xbin) +
    geom_step(aes(x = seq_along(y), y = sort(y)))

  and see this for more: http://had.co.nz/ggplot2/geom_step.html
 
  Michael Weylandt
 
  
   I ran your program but couldn't figure out the meaning of the dots in
   your plot?

   Thanks again!

   On Fri, Mar 9, 2012 at 7:07 PM, R. Michael Weylandt
   michael.weyla...@gmail.com wrote:

   That doesn't really seem to make sense to me as a graphical
   representation (transforming adjacent y values differently), but if
   you really want to do so, here's what I'd do if I understand your goal
   (the preprocessing is independent of the graphics engine):

   # Nice and volatile!
   DAT <- data.frame(x = runif(1000, 0, 20), y = rcauchy(1000)^2)

   # split y based on some x binning and assign empirical quantiles of
   # each group
   DAT$yquant <- with(DAT, ave(y, cut(x, seq(0, 20, 5)), FUN =
     function(x) ecdf(x)(x)))

   # BASE
   plot(yquant ~ x, data = DAT)

   # ggplot2
   library(ggplot2)

   p <- ggplot(DAT, aes(x = x, y = yquant)) + geom_point()
   print(p)

   Michael Weylandt

   PS -- I see Josh Wiley just responded pointing out your requirements
   #1 and #2 are incompatible: I've used 1 here.
  
   On Fri, Mar 9, 2012 at 7:37 PM, Michael comtech@gmail.com wrote:
    Hi all,

    I am trying hard to do the following and have already spent a few hours
    in vain:

    I wanted to do the scatter plot.

    But given the high dispersion on those dots, I would like to bin the
    x-axis and then, for each bin of the x-axis, plot the quantiles of the
    y-values of the data points in each bin:

    1. Uniform bin size on the x-axis;
    2. Equal number of observations in each bin;

    How to do that in R? I guess for the sake of prettiness, I'd better do
    it in ggplot2?

    Thank you!
   

Re: [R] How do I do a pretty scatter plot using ggplot2?

2012-03-10 Thread Michael

Thanks a lot!

Could you please elaborate on this one?

What I'd really do, if you had lots of data, would be to bin x into
small contiguous bins and to calculate quantiles for each of those
bins and to plot smoothers across the quantiles (using bin medians as
the x axis) 

On Fri, Mar 9, 2012 at 9:21 PM, R. Michael Weylandt 
michael.weyla...@gmail.com wrote:

 Could you just add a log scale to the y dimension?

 DAT <- data.frame(x = runif(1000, 0, 20), y = rnorm(1000))

 plot(y ~ x, data = DAT, log = "y")

 That lessens large dispersion (in some circumstances), but I'm not
 really sure what that has to do with smoothing. Do you mean
 smoothing in the technical sense (loess, splines, and friends) or in
 some graphical sense?

 Still not sure what this has to do with quantile plots: they are
 usually diagnostic tools for examining distributional shape/fit.

 Here are two (related) ideas:

 i) If you have categorical x data, boxplots:
 http://had.co.nz/ggplot2/geom_boxplot.html

 ii) If you have continuous x data, quantile envelopes:
 http://had.co.nz/ggplot2/stat_quantile.html

 # In ggplot2
 library(ggplot2)

 DAT <- data.frame(x = runif(1000, 0, 20), y = rnorm(1000))
 DAT$xbin <- with(DAT, cut(x, seq(0, 20, 2)))

 p <- ggplot(DAT, aes(x = x, y = y)) + geom_point(alpha = 0.2) +
   stat_quantile(aes(colour = ..quantile..), quantiles = seq(0.05, 0.95,
   by=0.05)) + facet_wrap(~ xbin, scales = "free")
 print(p)

 What I'd really do, if you had lots of data, would be to bin x into
 small contiguous bins and to calculate quantiles for each of those
 bins and to plot smoothers across the quantiles (using bin medians as
 the x axis) -- I'm sure that's doable in ggplot2 as well.

 Michael
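
The binning idea in the last paragraph could be sketched roughly as follows. This is only one possible reading of it, written in base graphics rather than ggplot2 to keep it self-contained; the bin width, the three quantile levels, the use of lowess() as the smoother, and all object names are assumptions made for illustration.

```r
set.seed(42)
DAT <- data.frame(x = runif(1000, 0, 20), y = rnorm(1000))

# Bin x into contiguous intervals of width 1 (uniform bin size).
DAT$xbin <- cut(DAT$x, breaks = seq(0, 20, by = 1), include.lowest = TRUE)
by_bin <- split(DAT, DAT$xbin, drop = TRUE)

# For each bin: the median of x (used as the plotting position)
# and a few quantiles of y, including the 50% quantile.
probs <- c(0.25, 0.5, 0.75)
xmed <- sapply(by_bin, function(d) median(d$x))
yq <- t(sapply(by_bin, function(d) quantile(d$y, probs = probs)))

# Raw points in the background, one smoothed line per quantile level.
plot(y ~ x, data = DAT, col = "grey70", pch = 16, cex = 0.5)
for (j in seq_along(probs)) {
  lines(lowess(xmed, yq[, j]), lwd = 2, col = j + 1)
}
```

The middle line traces the per-bin median; the outer two give a sense of the spread without plotting every point at full weight.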


[R] LME4 output

2012-03-10 Thread Zd Gibbs

I hope you all don't mind this question, but I need help interpreting the
output of a linear mixed effects model that I've been trying to learn to fit
in R. I am new to longitudinal data analysis and linear mixed effects
regression. I have fitted a model with weeks as the time predictor and score
on an employment course as my y. I modeled score with weeks (time) and several
fixed effects, sex and race. My model includes random effects. I need help
understanding what the variances mean. The output is the following:


Random effects:
 Group     Name       Variance
 EmpId     intercept  980.236
           weeks       13.562
 Residual              23.256

I really appreciate the help.  Thanks.
 
 
Zeda 

Re: [R] max.print

2012-03-10 Thread Peter Ehlers





Well, I don't know why you would want to do this to your
eyeballs, but View() would seem to be your friend and
this is probably somewhere in the archives.

You can't set max.print to anything that can't be
coerced to integer (see ?integer) and I think that
setting it to Inf is no longer legal (if ever it was).
[Perhaps options() should generate a warning.]

Peter Ehlers
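
As a rough sketch of the alternatives (the data frame here is simulated to the size mentioned in the question, and the object names tab and slice are made up): inspect the dimensions and a small slice rather than printing everything, and if max.print really does need raising, give it a finite, integer-valued number.

```r
# Simulated stand-in for the 20,000 x 60 table from the question.
tab <- as.data.frame(matrix(rnorm(20000 * 60), nrow = 20000))

# Check that everything was read in without printing it all.
dim(tab)
slice <- head(tab[, 1:7])
slice
str(tab, list.len = 5)

# Raise the print limit to a finite value, then restore the old setting.
old <- options(max.print = 2000000)
getOption("max.print")
options(old)

# In an interactive session, View(tab) opens a spreadsheet-style viewer.
```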





Re: [R] Use different panel functions with lattice

2012-03-10 Thread ilai

On Sat, Mar 10, 2012 at 9:33 AM, Balaitous balait...@mailoo.org wrote:
 Hi,

 I have a data.frame df with
 names(df) = c("Var1", "Var2", "Var3", "Var4")

 and I plot data with

 xyplot(Var1+Var2~Var3|Var4, data=df)

 I want to use different panel functions for Var1 and Var2.
 How can I do ?

You didn't specify which different panel functions you want. Is
something like this what you're looking for?

 xyplot(Var1+Var2~Var3|Var4, data=df, panel=panel.superpose,
 panel.groups=function(x , y , group.number , ...){
 panel.xyplot(x , y[group.number==1] , ...)
 panel.lines(x , y[group.number==2] , lwd=2 , col=1)
})
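
A self-contained variant of the same approach, with made-up data so it can be run as is. It branches explicitly on group.number, which panel.superpose passes to panel.groups; the data frame and the choice of points for Var1 and a line for Var2 are assumptions for illustration only.

```r
library(lattice)

# Hypothetical data with the column names from the question.
set.seed(1)
df <- data.frame(Var3 = rep(1:10, 2),
                 Var4 = rep(c("a", "b"), each = 10))
df$Var1 <- df$Var3 + rnorm(20)
df$Var2 <- 2 * df$Var3 + rnorm(20)

# With Var1 + Var2 on the left-hand side, lattice treats the two
# variables as groups; group.number is 1 for Var1 and 2 for Var2.
p <- xyplot(Var1 + Var2 ~ Var3 | Var4, data = df,
            panel = panel.superpose,
            panel.groups = function(x, y, group.number, ...) {
              if (group.number == 1) {
                panel.xyplot(x, y, ...)               # points for Var1
              } else {
                panel.lines(x, y, lwd = 2, col = 1)   # line for Var2
              }
            })
print(p)
```

Any other pair of panel functions could be swapped into the two branches.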




Re: [R] index values of one matrix to another of a different size

2012-03-10 Thread Ben quant

Very interesting. You are doing some stuff here that I have never seen.
Thank you. I will test it on my real data on Monday and let you know what I
find. That cmpfun function looks very useful!

Thanks,
Ben

On Sat, Mar 10, 2012 at 10:26 AM, Joshua Wiley jwiley.ps...@gmail.comwrote:

 Hi Ben,

 It seems likely that there are bigger bottle necks in your overall
 program/use---have you tried Rprof() to find where things really get
 slowed down?  In any case, f2() below takes about 70% of the time as
 your function in your test data, and 55-65% of the time for a bigger
 example I constructed.  Rui's function benefits substantially from
 byte compiling, but is still slower.  As a side benefit, f2() seems to
 use less memory than your current implementation.

 Cheers,

 Josh

 %%
 ##sample data ##
 vals - matrix(LETTERS[1:9], nrow = 3, ncol = 3,
  dimnames = list(c('row1','row2','row3'), c('col1','col2','col3')))

 indx - matrix(c(1,1,3,3,2,2,2,3,1,2,2,1), nrow=4, ncol=3)
 storage.mode(indx) - integer


 f - function(x, i, di = dim(i), dx = dim(x)) {
  out - x[c(i + matrix(0:(dx[1L] - 1L) * dx[1L], nrow = di[1L], ncol
 = di[2L], TRUE))]
  dim(out) - di
  return(out)
 }


 fun - function(valdata, inxdata){
nr - nrow(inxdata)
nc - ncol(inxdata)
mat - matrix(NA, nrow=nr*nc, ncol=2)
i1 - 1
i2 - nr
for(j in 1:nc){
mat[i1:i2, 1] - inxdata[, j]
mat[i1:i2, 2] - rep(j, nr)
i1 - i1 + nr
i2 - i2 + nr
}
matrix(valdata[mat], ncol=nc)
 }

 require(compiler)
 f2 - cmpfun(f)
 fun2 - cmpfun(fun)

 system.time(for (i in 1:1) f(vals, indx))
 system.time(for (i in 1:1) f2(vals, indx))
 system.time(for (i in 1:1) fun(vals, indx))
 system.time(for (i in 1:1) fun2(vals, indx))
 system.time(for (i in 1:1)

 matrix(vals[cbind(c(indx),rep(1:ncol(indx),each=nrow(indx)))],nrow=nrow(indx),ncol=ncol(indx)))

 ## now let's make a bigger test set
 set.seed(1)
 vals2 - matrix(sample(LETTERS, 10^7, TRUE), nrow = 10^4)
 indx2 - sapply(1:ncol(vals2), FUN = function(x) sample(10^4, 10^3, TRUE))

 dim(vals2)
 dim(indx2)

 ## the best contenders from round 1
 gold -
 matrix(vals2[cbind(c(indx2),rep(1:ncol(indx2),each=nrow(indx2)))],nrow=nrow(indx2),ncol=ncol(indx2))
 test1 - f2(vals2, indx2)
 all.equal(gold, test1)

 system.time(for (i in 1:20) f2(vals2, indx2))
 system.time(for (i in 1:20)

 matrix(vals2[cbind(c(indx2),rep(1:ncol(indx2),each=nrow(indx2)))],nrow=nrow(indx2),ncol=ncol(indx2)))

 %%

 On Sat, Mar 10, 2012 at 7:48 AM, Ben quant ccqu...@gmail.com wrote:
  Thanks for the info. Unfortunately it's a little bit slower after one
  apples-to-apples test using my big data. Mine: 0.28 seconds. Yours: 0.73
  seconds. Not a big deal, but significant when I have to do this 300 to
  500 times.
 
  regards,
 
  ben
 
  On Fri, Mar 9, 2012 at 1:23 PM, Rui Barradas rui1...@sapo.pt wrote:
 
  Hello,
 
  I don't know if it's the fastest but it's more natural to have an index
  matrix with two columns only,
  one for each coordinate. And it's fast.
 
  fun <- function(valdata, inxdata){
     nr <- nrow(inxdata)
     nc <- ncol(inxdata)
     mat <- matrix(NA, nrow=nr*nc, ncol=2)
     i1 <- 1
     i2 <- nr
     for(j in 1:nc){
        mat[i1:i2, 1] <- inxdata[, j]
        mat[i1:i2, 2] <- rep(j, nr)
        i1 <- i1 + nr
        i2 <- i2 + nr
     }
     matrix(valdata[mat], ncol=nc)
  }
 
  fun(vals, indx)
 
  Rui Barradas
 
 



 --
 Joshua Wiley
 Ph.D. Student, Health Psychology
 Programmer Analyst II, Statistical Consulting Group
 University of California, Los Angeles
 https://joshuawiley.com/



Re: [R] index values of one matrix to another of a different size

2012-03-10 Thread Joshua Wiley

On Sat, Mar 10, 2012 at 12:11 PM, Ben quant ccqu...@gmail.com wrote:
 Very interesting. You are doing some stuff here that I have never seen.

and that I would not typically do or recommend (e.g., fussing with
storage mode or manually setting the dimensions of an object), but
that can be faster, trading the flexibility of higher-level functions
for lower-level, more direct control.

 Thank you. I will test it on my real data on Monday and let you know what I
 find. That cmpfun function looks very useful!

It can reduce the overhead of repeated function calls.  I find the
biggest speedups when it is used with some sort of loop.  Then again,
many loops can be avoided entirely, which often yields even larger
performance gains.
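
To make the comparison concrete, here is a minimal sketch (the function names are made up for illustration) that times an explicit loop, its byte-compiled copy, and a vectorised equivalent. Note that R 3.4.0 and later byte-compile functions automatically, so on a recent installation the first two timings may be nearly identical.

```r
library(compiler)

# An explicit R-level loop: the kind of code where byte compiling tends to help.
loop_sum_sq <- function(x) {
  total <- 0
  for (v in x) total <- total + v^2
  total
}

# Byte-compiled copy of the same function.
loop_sum_sq_c <- cmpfun(loop_sum_sq)

# Vectorised version: no R-level loop at all.
vec_sum_sq <- function(x) sum(x^2)

x <- as.numeric(1:1000)
res_loop <- loop_sum_sq(x)
res_cmp  <- loop_sum_sq_c(x)
res_vec  <- vec_sum_sq(x)

# Timings depend on the machine and R version, so none are quoted here.
system.time(for (k in 1:200) loop_sum_sq(x))
system.time(for (k in 1:200) loop_sum_sq_c(x))
system.time(for (k in 1:200) vec_sum_sq(x))
```

All three functions return the same value; only the cost of getting there differs.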


 Thanks,

You're welcome.  You might also look at the data.table package by
Matthew Dowle.  It does some *very* fast indexing and subsetting, and
if those operations are a serious slowdown for you, you would likely
benefit substantially from using it.  One final comment, since you are
creating the matrix of indices: if you can create it in such a way
that it already holds the vector position rather than row/column form,
you could eliminate the need for my f2() function altogether, as you
could use it to directly index your data and then just add dimensions
back afterward.
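
A small sketch of that last idea, reusing the toy vals/indx data from earlier in the thread (the name lin is just for illustration). R stores matrices column by column, so the vector position of element (r, j) is r + (j - 1) * nrow(vals).

```r
vals <- matrix(LETTERS[1:9], nrow = 3, ncol = 3)
indx <- matrix(c(1L, 1L, 3L, 3L, 2L, 2L, 2L, 3L, 1L, 2L, 2L, 1L),
               nrow = 4, ncol = 3)

# Vector (column-major) position of each requested element:
# row index plus (column - 1) * number of rows in vals.
lin <- indx + (col(indx) - 1L) * nrow(vals)

# as.vector() matters: a two-column numeric matrix used as an index would be
# read as (row, column) pairs rather than as vector positions.
out <- vals[as.vector(lin)]
dim(out) <- dim(indx)
out
```

If the positions in lin can be generated directly when the index is first built, the arithmetic step disappears and only the last two lines remain.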

Cheers,

Josh


[R] How to fit a line through the Mountain crest, i.e., through the highest density of points - in a loess-like fashion.

2012-03-10 Thread Emmanuel Levy

Hi,

I'm trying to normalize data by fitting a line through the highest density
of points (in a 2D plot).
In other words, if you visualize the data as a density plot, the fit I'm
trying to achieve is the line that goes through the crest of the mountain.

This is similar to, yet different from, what LOESS does. I've been using
loess before, but it does not do exactly that, as it takes all points into
account. Although points farther from the fit have a smaller weight, they
result in the fit being a bit off the crest.

Do you know a package, or maybe even an option in loess, that would allow me
to achieve this?
Any advice or idea appreciated.

Emmanuel


[R] Generating abnormal returns in R

2012-03-10 Thread drsenne

Hello

This is my first post on this forum and I hope someone can help me out.
I have a datafile (weeklyR) with returns of +- 100 companies. 
I acquired this computing the following code:

library(tseries)
tickers = c("GSPC", "BP", "TOT", "ENI.MI", "VOW.BE", "CS.PA",
"DAI.DE", "ALV.DE", "EOAN.DE", "CA.PA", "G.MI", "DE", "EXR.MI",
"MUV2.BE", "UG.PA", "PRU.L", "VOD.L", "DPB.BE", "REP.MC", "RWE.BE",
"AGN.AS", "FTE.PA", "EAD", "LGEN.L", "CNP.PA", "ULVR.L", "TKA.BE",
"RIO.L", "NOK", "SGO.PA", "RNO.PA", "VIE.PA", "BAYN.DE", "SAN.PA",
"DG.PA", "SSE.L", "GSK.L", "EN.PA", "LYB", "MLSNP.PA", "IBE.MC",
"EURS.PA", "AH.AS", "VIV.PA", "TIT.MI", "VOLV-B.ST", "ABI.BR",
"LHA.DE", "OML.L", "CNA.L", "CON.DE", "PHG", "AZN.L", "SBRY.L",
"BA.L", "BT-A.L", "AF.PA", "430021.VI", "SL.L", "ERIC-A.ST", "CDI.PA",
"AAL.L", "ALO.PA", "DELB.BR", "HOT.BE", "GAS.MC", "SU.PA", "OR.PA",
"FNC.MI", "MRW.L", "MAP.MC", "ML.PA", "IMT.L", "EBK.DE", "PP.PA",
"ACN", "BTI", "CRG.IR", "CPG.L", "BN.PA", "NG.L", "T7L.BE", "HEIA.AS",
"ACS.MC", "LG.PA", "STAN.L", "ALU.PA", "FRE.MU", "SW.PA", "WOS.L",
"AKZA.AS", "HEN.MU")
for( series in tickers ){
  print(series)
  close <- get.hist.quote(instrument=series, retclass="zoo", quote="AdjClose",
                          compression="d", start="2000-1-1", end="2011-12-31",
                          quiet=TRUE)
  if(series==tickers[1]){ pricedata = close }else{ pricedata = merge(
    pricedata , close ) }
}
colnames(pricedata) = tickers
# Avoid missing values caused by a trade halt for that stock
pricedata = na.approx(pricedata)
weeklyR = diff(log(pricedata))
time(weeklyR) = as.Date(time(weeklyR))
print(weeklyR)
save(weeklyR , file = "weeklyR.Rdata")
write.zoo(weeklyR, file="weeklyR.csv", quote=T, sep=",", na = "NA", dec = ".",
          row.names = F, col.names = T)

Now I need to build a market model in R so I can generate abnormal returns
for these stocks. As the market index I would like to use the GSPC. I also
need to consider abnormal returns calculated over a sixty-trading-day window.
Can this be done in R? Is it difficult to write this code?

Any help would be much appreciated!

thanks

drsenne


--
View this message in context: 
http://r.789695.n4.nabble.com/Generating-abnormal-returns-in-R-tp4462541p4462541.html
Sent from the R help mailing list archive at Nabble.com.


[R] function input as variable name (deparse/quote/paste) ??

2012-03-10 Thread casperyc

Hi all

Say I have a function:

myname=function(dat,x=5,y=6){
res<-x+y-dat
}

for various input such as

myname(dat1)
myname(dat2)
myname(dat3)
myname(dat4)
myname(dat5)

how should I modify the 'res' line, to have new informative variable name
correspondingly, such as

dat1.res
dat2.res
dat3.res
dat4.res
dat5.res

stored in the workspace.

This is only an example of a complex function I have written.

Thanks in advance!

Casper




--
View this message in context: 
http://r.789695.n4.nabble.com/function-input-as-variable-name-deparse-quote-paste-tp4462841p4462841.html
Sent from the R help mailing list archive at Nabble.com.


[R] Draw values from multiple data sets as inputs to a Monte-Carlo function; then apply across entire matrix

2012-03-10 Thread Diann J Prosser

Hi all,

I am trying to implement a Monte-Carlo simulation for each cell in a 
spatial matrix (using the mc2d package).
I have figured out how to conduct the simulation using data from a single 
location (where I manually input distribution parameters into the R code), 
but am having trouble (a) adjusting the code to pull input variables from 
my various data sets and then (b) applying the entire process across each 
of the cells of the matrices.

I have been doing a lot of reading about loops (a big no-no?), apply, and 
ddply, but can not quite figure it out. 

Here is the situation:

Data: 
I have (for simplicity) 3 spatial raster data sets (each 4848 x 4053 
cells) as ASCII files:
-Poultry density (mean value in each cell) 
-Poultry density (standard deviation in each cell)
-Wild bird density (single estimate in each cell)
I read them into R using read.table. The data look correct:
Pmn <- read.table("D:/Data/PoultryMeans.txt")
Psd <- read.table("D:/Data/PoultryStDev.txt")
Wde <- read.table("D:/Data/WildBirdDensity.txt")

The Model:
In the Monte-Carlo simulation, Poultry and Wild birds have different 
distributions (normal and triangle, respectively).
Below are the 2 lines of code that use the mcstoc function to draw the 
samples.
The literal values (3.5, 0.108, and 47) are ones that I would like to draw 
from the data tables I read in above. 
For example, 3.5 would be cell (i,j) in the Poultry MEAN density table; 
0.108 would be cell (i,j) in the Poultry STDEV table; and 47 the single 
estimate for cell (i,j) of the Wild bird density table.

Poultry <- mcstoc(rnorm, type="U", 3.5, 0.108, rtrunc=TRUE, linf=0)
Wild <- mcstoc(rtriang, type="U", min=0, mode=47, max=75)
Risk <- Poultry * Wild    # this is the risk function the MC is applied to

Questions:
1) How can I edit the Poultry and Wild variables above to read the data 
values directly from the 3 input tables (i.e., replacing 3.5, 0.108, and 
47 with some variable name for the data table and using a loop?? Or 
somehow use apply or ddply?)
2) Have the entire process be run for every cell in the 4848 x 4053 
matrix?

Thank you for any help you can provide to get me moving forward!
Diann



[R] PCA in predefined Groups??

2012-03-10 Thread SHAFI

Hi

This has a simple answer but it has been eluding me nonetheless. 

I have been trying to build a PCA plot from scratch with the ability to plot
predefined groups in different colors.  I can plot the PCA, but I want it to
show the predefined groups (samples) using the top 100 expressed genes. I have
three groups. Can anybody help me, keeping in mind that I am just a beginner
in R?

Thanks in advance

--
View this message in context: 
http://r.789695.n4.nabble.com/PCA-in-predefined-Groups-tp4462536p4462536.html
Sent from the R help mailing list archive at Nabble.com.


[R] Finding the mean.

2012-03-10 Thread elliot.welch

Using functions, how would I go about doing this question?

(I already have a mean defined for a function of x.)

Write a function called MyMean2. This function has two arguments, x and 
nonzero, where nonzero has the default value TRUE. This function should return 
the 

(Previous defined mean of x) if nonzero=FALSE

(Previous defined mean of x) for all x's0 if nonzero=TRUE

Much appreciated.

elliot.we...@virgin.net

Re: [R] Help on subgraphs in xyplot of lattice library

2012-03-10 Thread Chee Chen

Hi, Michael,
Thank you for your help!
In its simplified form,  the data frame looks like:

idx  true_value  mean  diff_mean1  diff_mean2  diff_mean3  std  diff_std1  diff_std2  diff_std3  samplesize
1                                                                                                 1000
2                                                                                                 1000
3                                                                                                 1000
4                                                                                                 1000
5                                                                                                 1000
1                                                                                                 5000
2                                                                                                 5000
3                                                                                                 5000
4                                                                                                 5000
5                                                                                                 5000

I would like the plot to be:
row1 has 4 subplots for samplesize 1000;
row2  has 4 subplots for samplesize 5000;

in each row:
 the 1st subplot is true_value against mean;
 the 2nd is an overlay plot for idx against diff_mean1, idx against diff_mean2, 
idx against diff_mean3; 
the 3rd is true_value against std;
the 4th is an overlay plot for idx against diff_std1, idx against diff_std2, 
idx against diff_std3.

I have looked at sample xyplot codes, but still did not know how to realize 
this.

Thanks again!
Chee







From: R. Michael Weylandt 
Sent: Saturday, March 10, 2012 12:20 PM
To: Chee Chen 
Cc: R-ORG 
Subject: Re: [R] Help on subgraphs in xyplot of lattice library


What does your data look like? dput() is your friend.

Also, it'd be helpful if you could give base graphics code for
more-or-less what you are looking for (since you can do so already) as
it's pretty hard to describe graphics without pictures.

Running example(xyplot) might help you get started as well.

Michael

On Sat, Mar 10, 2012 at 12:04 PM, Chee Chen chee.c...@yahoo.com wrote:
 Dear All,
 I would like to ask a question on how to do overlay plots in each subgraph of 
 xyplot.
 1.  I did simulations for m=1000, 2500, 5000, 1, as the sample sizes.
 2. for each sample size value m,  4 graphs are generated; each graph contains 
 overlayed comparisons between 4 methods,
 3.  now I want put them into a 4-by-4  plot by xyplot, i.e.,  4 sample size 
 values, each of which has 4 plots.

 I know how to do this using plot, but the spaces between subplots are big.

 I do not know how to make each subplot in xyplot an overlayed one as it would 
 appear using plot.

 Any help would be appreciated!

 Thank you,
 Chee



[R] too many open devices

2012-03-10 Thread harold kincaid

I am getting too many open devices after 60 graphs. The archived
comments on this problem were too sketchy to be helpful. Any ideas?
Thanks Harold


Re: [R] PCA in predefined Groups??

2012-03-10 Thread chuck.01

Without taking away all the fun of trial and error, and exploration in R... I
will direct you to this website which I found invaluable when I first began
to use R. 

One way would be to use:
plot(Yourdata, type="n")
and then three text() or points() statements to plot the groups, each in a
different color.
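
A minimal sketch of that approach, assuming a samples-by-genes expression matrix. The data here are simulated, and the names expr, group and cols are placeholders for your own objects.

```r
set.seed(42)

# Hypothetical expression matrix: 12 samples (rows) x 500 genes (columns).
expr <- matrix(rnorm(12 * 500, mean = 8), nrow = 12,
               dimnames = list(paste0("S", 1:12), paste0("gene", 1:500)))
group <- factor(rep(c("A", "B", "C"), each = 4))  # three predefined groups

# Keep the 100 genes with the highest mean expression.
top <- order(colMeans(expr), decreasing = TRUE)[1:100]

# PCA on the selected genes; the first two score columns are plotted.
pca <- prcomp(expr[, top], scale. = TRUE)
scores <- pca$x[, 1:2]

cols <- c(A = "red", B = "blue", C = "darkgreen")

# Empty plot first, then one points() call per group.
plot(scores, type = "n", xlab = "PC1", ylab = "PC2")
for (g in levels(group)) {
  points(scores[group == g, , drop = FALSE], col = cols[g], pch = 19)
}
legend("topright", legend = levels(group), col = cols, pch = 19)
```

"Top 100 expressed" is interpreted here as highest mean expression; swap in another ranking (for example variance) if that is what you mean.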

Good luck!


SHAFI wrote
 
 Hi
 
 This has a simple answer but it has been eluding me nonetheless. 
 
 I have been trying to build a PCA plot from scratch with the ability to
 plot predefined groups in different colors.  I can plot PCA but I want it
 to plot with predefined groups(samples) with top 100 expressed genes. I
 have three groups. Can any body help me keeping in mind that the user is
 just beginner in R.
 
 Thanks in advance
 


--
View this message in context: 
http://r.789695.n4.nabble.com/PCA-in-predefined-Groups-tp4462536p4462765.html
Sent from the R help mailing list archive at Nabble.com.


[R] applying a function in list of indexed elements of a vector:

2012-03-10 Thread aldi


Hi,

I have a vector
Y1 <- c(8, 11, 7, 5, 6, 3, 6, 3, 3)
and an index
iy <- c(c(1, 2), c(1, 2), c(1, 2, 3, 4), c(2, 3, 5), c(4), c(5, 6, 7), c(7, 8, 9))


how can I produce the mean, or the sum of the elements specified in the 
index iy from the vector Y1?


expecting something like this for the sum:
Y2
19 19 31 24 5 15 12

I thought lapply function may perform this, but does not work:
Y2 <- lapply(Y1[iy], sum)

Any suggestion?
TIA,

Aldi
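
One possible approach, sketched under the assumption that the index can be stored as a list: c() flattens the nested vectors into a single vector, which loses the grouping, whereas list() keeps one index vector per element. The names Y2_sum and Y2_mean are just for illustration.

```r
Y1 <- c(8, 11, 7, 5, 6, 3, 6, 3, 3)

# list() keeps each index vector separate; c() would merge them into one.
iy <- list(c(1, 2), c(1, 2), c(1, 2, 3, 4), c(2, 3, 5), 4,
           c(5, 6, 7), c(7, 8, 9))

# Apply the summary to the elements of Y1 selected by each index vector.
Y2_sum  <- sapply(iy, function(i) sum(Y1[i]))
Y2_mean <- sapply(iy, function(i) mean(Y1[i]))

Y2_sum
# 19 19 31 24  5 15 12
```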


Re: [R] too many open devices

2012-03-10 Thread jim holtman

It would help if you showed us how you were plotting.  Are you calling
'dev.off()' after creating an output file? The comments on this
problem were too sketchy to be helpful.
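
For illustration, a sketch of the pattern in question, assuming the graphs are written to files in a loop (the file names and the use of pdf() are made up for the example). R only allows a limited number of devices to be open at once, so each device opened inside the loop is closed before the next one is created.

```r
out_dir <- tempfile("plots_")
dir.create(out_dir)

for (k in 1:70) {
  # Open a file device, draw one graph, then close the device again.
  pdf(file.path(out_dir, sprintf("plot_%02d.pdf", k)))
  plot(rnorm(20), main = paste("Graph", k))
  dev.off()  # without this, open devices pile up until R refuses to open more
}

n_open  <- length(dev.list())   # devices still open after the loop
n_files <- length(list.files(out_dir, pattern = "\\.pdf$"))
```

If devices were left open by an earlier run, graphics.off() closes all of them at once.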

On Sat, Mar 10, 2012 at 3:21 PM, harold kincaid
kincaidharold...@gmail.com wrote:
 I am getting too many open devices after 60 graphs. The archived
 comments on this problem were too sketchy to be helpful. Any ideas?
 Thanks Harold




-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.


Re: [R] Finding the mean.

2012-03-10 Thread jim holtman

if (nonzero) mean(x[x0]) else mean(x)
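
Wrapped into a complete function, that one-liner might look like the sketch below. Two assumptions are made here: the comparison is taken to be x > 0, and base mean() stands in for the previously defined mean, which was not posted.

```r
# Sketch only: mean() is a stand-in for the asker's own mean function,
# and the condition x > 0 is an assumption about the intended filter.
MyMean2 <- function(x, nonzero = TRUE) {
  if (nonzero) mean(x[x > 0]) else mean(x)
}

r_default <- MyMean2(c(0, 2, 4))                   # mean of 2 and 4
r_all     <- MyMean2(c(0, 2, 4), nonzero = FALSE)  # mean of 0, 2 and 4
```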

On Sat, Mar 10, 2012 at 2:47 PM,  elliot.we...@virgin.net wrote:
 Using functions how would I go about do this question?

 (I already have a mean defined for a function of x.)

 Write a function called MyMean2. This function has two arguments, x and 
 nonzero, where nonzero has the default value TRUE. This function should 
 return the

 (Previous defined mean of x) if nonzero=FALSE

 (Previous defined mean of x) for all x's0 if nonzero=TRUE

 Much appreciated.

 elliot.we...@virgin.net



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.


[R] Help with confidence intervals for gam model using mgcv

2012-03-10 Thread Anthony Staines


Hi,

I would be very grateful for advice on getting confidence 
intervals for the ordinary (non smoothed) parameter 
estimates from a gam.


Motivation
I am studying hospital outcomes in a large data set. The 
outcomes of interest to me are all binary variables. The one 
in the example here, Dead30d, is death within 30 days of 
admission. Sexf is gender (M or F), Age is age in years at 
the start of the admission. The standard glm is a logistic 
regression :-


glmDead.AS <- glm(Dead30d~Sexf+Age, 
data=HIPE,family=binomial(link="logit"))


The corresponding GAM, with a smooth for age, is :-

gamDead.AS <- gam(Dead30d~Sexf+s(Age), 
data=HIPE,family=binomial(link="logit"))


For my work, age is a nuisance. We already know exactly the 
effect of age (which has an odd shape). I have no interest 
in this parameter, nor in CIs for it. The GAM fits notably 
better than the GLM. The substantive interest is in the 
effects of the other variables, Sexf, and many more.


For the GLM, the confidence intervals are simple matter of 
confint(glmDead.AS). For my discipline CI's are required, 
and the profile CI's that confint produces are ideal.


There doesn't seem to be an analogous function for mgcv. The 
advice most commonly given is to use predict.gam with 
se.fit=TRUE. This does not seem to produce CI's for the 
non-smoothed parameters, which is what I need to calculate. 
CIs for the smooth, which are the focus of interest in many 
other cases are not of interest to me.


Any suggestions? Am I missing something very obvious?

Best wishes,
Anthony Staines
--
Anthony Staines, Professor of Health Systems,
School of Nursing and Human Sciences, DCU, Dublin 9,Ireland.
Tel:- +353 1 700 7807. Mobile:- +353 86 606 9713
http://astaines.eu/
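
One possible starting point, sketched on simulated data because the HIPE data set is not available here: summary() of a fitted gam has a p.table component holding the estimates and standard errors of the parametric terms, from which Wald-type intervals can be formed. These are not the profile-likelihood intervals that confint() gives for a glm, and the names sim, fit and ci are placeholders.

```r
library(mgcv)

set.seed(1)
n <- 500
# Simulated stand-in for the hospital data: age, sex and a binary outcome.
sim <- data.frame(
  Age  = runif(n, 20, 90),
  Sexf = factor(sample(c("F", "M"), n, replace = TRUE))
)
sim$Dead30d <- rbinom(n, 1,
                      plogis(-4 + 0.5 * (sim$Sexf == "M") + 0.04 * sim$Age))

fit <- gam(Dead30d ~ Sexf + s(Age), data = sim,
           family = binomial(link = "logit"))

# Parametric (non-smooth) coefficients with their standard errors.
pt <- summary(fit)$p.table

# Wald 95% intervals on the log-odds scale.
z  <- qnorm(0.975)
ci <- cbind(estimate = pt[, "Estimate"],
            lower    = pt[, "Estimate"] - z * pt[, "Std. Error"],
            upper    = pt[, "Estimate"] + z * pt[, "Std. Error"])

exp(ci)  # the same intervals expressed as odds ratios
```

Whether Wald intervals are acceptable in place of profile intervals is a judgment call for the discipline; with a large data set the two are usually close.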


Re: [R] function input as variable name (deparse/quote/paste) ??

2012-03-10 Thread Hans Ekbrand

On Sat, Mar 10, 2012 at 01:29:16PM -0800, casperyc wrote:
 Hi all
 
 Say I have a function:
 
 myname=function(dat,x=5,y=6){
 res<-x+y-dat
 }
 
 for various input such as
 
 myname(dat1)
 myname(dat2)
 myname(dat3)
 myname(dat4)
 myname(dat5)
 
 how should I modify the 'res' line, to have new informative variable name
 correspondingly, such as
 
 dat1.res
 dat2.res
 dat3.res
 dat4.res
 dat5.res
 
 stored in the workspace.

Why not keep the information of input values in a list, or vector?
What is gained by storing that info in the variable _name_ ? Your
function could return a list with both the result and the input value.

While you did say that this was part of something complex, I suspect
your post might be a case of "being overly specific and not stating
your real goal".
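
A sketch of what returning a list could look like. The element names name, input and result are only illustrative; the formula is the toy one from the original post.

```r
myname <- function(dat, x = 5, y = 6) {
  list(
    name   = deparse(substitute(dat)),  # the name the caller used, as text
    input  = dat,                       # the input value itself
    result = x + y - dat                # the computed result
  )
}

dat1 <- 1:3
out <- myname(dat1)
out$name    # "dat1"
out$result  # 10 9 8
```

The information that would have gone into the variable name travels with the result instead, so nothing has to be created in the workspace by the function.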

-- 
Hans Ekbrand (http://sociologi.cjb.net) h...@sociologi.cjb.net


Re: [R] function input as variable name (deparse/quote/paste) ??

2012-03-10 Thread Thomas Lumley

On Sun, Mar 11, 2012 at 10:29 AM, casperyc caspe...@hotmail.co.uk wrote:
 Hi all

 Say I have a function:

 myname=function(dat,x=5,y=6){
     res<-x+y-dat
 }

 for various input such as

 myname(dat1)
 myname(dat2)
 myname(dat3)
 myname(dat4)
 myname(dat5)

 how should I modify the 'res' line, to have new informative variable name
 correspondingly, such as

 dat1.res
 dat2.res
 dat3.res
 dat4.res
 dat5.res

You *can* do it with

myname=function(dat,x=5,y=6){
  name<-paste(deparse(substitute(dat)),"res",sep=".")
  assign(name, x+y-dat, parent.frame(), inherits=TRUE)
 }

but I would be very surprised if this is actually the best way to do
whatever complex thing you are really doing.

It's very unusual for assignments into the global workspace to be a
useful R programming technique.
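
As a sketch of one common alternative (an assumption about what might suit the real problem, not something stated in the original post), the inputs can be kept in a named list and the results collected in a second list with the same names:

```r
# Plain version of the function: it just returns its result.
myname <- function(dat, x = 5, y = 6) x + y - dat

# Inputs held together in one named list instead of separate variables.
inputs <- list(dat1 = 1, dat2 = 2, dat3 = 3)

# lapply() keeps the names, so results$dat2 plays the role of dat2.res.
results <- lapply(inputs, myname)

results$dat2  # 9
```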

   -thomas

-- 
Thomas Lumley
Professor of Biostatistics
University of Auckland


Re: [R] Use different panel functions with lattice

2012-03-10 Thread ilai

Inline

On Sat, Mar 10, 2012 at 1:47 PM, Balaitous balait...@mailoo.org wrote:
 On Saturday 10 March 2012 at 12:25 -0700, ilai wrote:
 On Sat, Mar 10, 2012 at 9:33 AM, Balaitous balait...@mailoo.org wrote:

 Var1 and Var2 are 2 two different observed variables (with different scales)

You might want to consider scales=list(y=list(relation='free')) in ?xyplot

 Var3 is the time
 Var4 is the point of observation

 I have also a Var5 for groups, but I just want groups for the Var1.

snip


 But I don't know how to make the test
  if(Varx)

 in the function panel.mypanel, because I need

 Var1 -> panel.superpose (It's OK)
 Var2 -> panel.lines (I don't want groups for this)

 (And I will have others variables with other panel functions to use)


Since outer=T (i.e. Var1 and Var2 are in different panels), at the
beginning of the panel or panel.groups function, try

if(packet.number() %in% 1:3) {
panel.rect(x,y,groups,...)  # or whatever for panels 1:3
}
else{
panel.rect(x,y,groups,col=constant,...) # or some other stuff for panels 4:6
}

Hope that works better.
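
A self-contained sketch of the packet.number() idea on made-up data (the data frame, the two sites and the seen vector are all illustrative). Which packet numbers belong to Var1 and which to Var2 depends on how lattice orders the packets, so it is worth checking on your own plot before relying on 1:2 versus 3:4.

```r
library(lattice)

set.seed(1)
df <- data.frame(
  Var3 = rep(1:10, times = 2),                   # time
  Var4 = rep(c("siteA", "siteB"), each = 10)     # point of observation
)
df$Var1 <- rnorm(20)
df$Var2 <- rnorm(20, mean = 100, sd = 10)

seen <- integer(0)  # records which packets the panel function visited

p <- xyplot(Var1 + Var2 ~ Var3 | Var4, data = df, outer = TRUE,
            scales = list(y = list(relation = "free")),
            panel = function(x, y, ...) {
              pn <- packet.number()
              seen <<- c(seen, pn)
              if (pn %in% 1:2) {
                panel.xyplot(x, y, type = "p", ...)    # points in these panels
              } else {
                panel.lines(x, y, col = "black", ...)  # lines in the others
              }
            })

pdf(NULL)   # draw to a null PDF device so the panel function actually runs
print(p)
dev.off()

sort(unique(seen))
```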



 
  Something like :
 
  panel.mypanel = function(x, y, ...) {
   if (Var1) panel.Var1Panel(x, y, ...)
   else panel.Var2Panel(x, y, ...)
  }
  xyplot(Var1+Var2~Var3|Var4, data=df, panel=panel.mypanel)
 
  (I have search with google, but I found nothing)
 
  Thanks
 



[R] resume on error

2012-03-10 Thread Alaios

Dear all,
I would like to ask how I can catch an error in R and then have it resume.

For example, I have a large for loop, and I know that for a small number of 
iterations inside that loop there will be errors. How can I ask R to ignore 
the error in that case and return to the loop?

I would like to thank you in advance for your help

B.R
Ale
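
One common way to do this is tryCatch(). The sketch below uses a made-up input list in which the second element triggers an error; the loop logs the failure, stores NA for that iteration and carries on.

```r
inputs  <- list(4, "oops", 9)            # the second element will fail in sqrt()
results <- vector("list", length(inputs))

for (k in seq_along(inputs)) {
  results[[k]] <- tryCatch(
    sqrt(inputs[[k]]),
    error = function(e) {
      # Report the failure and return a placeholder instead of stopping.
      message("iteration ", k, " failed: ", conditionMessage(e))
      NA
    }
  )
}

results
```

try(..., silent = TRUE) is a shorter alternative when no custom handling is needed; its return value can be tested with inherits(res, "try-error").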


[R] odd error with rJava

2012-03-10 Thread Erin Hodgess

Hello!

I'm using R-2.14.2 on a Windows 7 64 bit machine and I did the following:



 install.packages("rJava", depen=TRUE)
--- Please select a CRAN mirror for use in this session ---
trying URL 
'http://cran.sixsigmaonline.org/bin/windows/contrib/2.14/rJava_0.9-3.zip'
Content type 'application/zip' length 745867 bytes (728 Kb)
opened URL
downloaded 728 Kb

package ‘rJava’ successfully unpacked and MD5 sums checked

The downloaded packages are in
C:\Users\erin\AppData\Local\Temp\RtmpgVpcnT\downloaded_packages
 library(OpenStreetMap)
Loading required package: rJava
Error : .onLoad failed in loadNamespace() for 'rJava', details:
  call: inDL(x, as.logical(local), as.logical(now), ...)
  error: unable to load shared object
'c:/R64/R-2.14.2/library/rJava/libs/x64/rJava.dll':
  LoadLibrary failure:  %1 is not a valid Win32 application.

Error: package ‘rJava’ could not be loaded
 library(rgdal)
Loading required package: sp
Geospatial Data Abstraction Library extensions to R successfully loaded
Loaded GDAL runtime: GDAL 1.9.0, released 2011/12/29
Path to GDAL shared files: c:/R64/R-2.14.2/library/rgdal/gdal
Loaded PROJ.4 runtime: Rel. 4.7.1, 23 September 2009, [PJ_VERSION: 470]
Path to PROJ.4 shared files: c:/R64/R-2.14.2/library/rgdal/proj


What am I doing wrong, please?

Thanks,
Erin

-- 
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: erinm.hodg...@gmail.com


Re: [R] odd error with rJava

2012-03-10 Thread Joshua Wiley

Hi Erin,

You need to make sure that rJava both installs correctly and can load.
The R package system is quite robust, so off the top of my head, I
would guess you need to set up Java properly on the machine.  See the
rJava package for what it requires.
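
As a starting point for checking the Java setup, a small diagnostic sketch using only base R. It rests on an assumption: the "%1 is not a valid Win32 application" message often points to a 32-bit/64-bit mismatch between R and the Java installation it finds, so the snippet prints what R sees.

```r
r_arch       <- R.version$arch          # e.g. "x86_64" for 64-bit R
java_home    <- Sys.getenv("JAVA_HOME") # empty string if the variable is not set
java_on_path <- Sys.which("java")       # empty string if java is not on the PATH

cat("R architecture:", r_arch, "\n")
cat("JAVA_HOME     :", if (nzchar(java_home)) java_home else "(not set)", "\n")
cat("java on PATH  :", if (nzchar(java_on_path)) java_on_path else "(not found)", "\n")
```

If 64-bit R is picking up a 32-bit Java (or none at all), installing a Java of the matching architecture would be the first thing to try.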

Cheers,

Josh

On Sat, Mar 10, 2012 at 3:19 PM, Erin Hodgess erinm.hodg...@gmail.com wrote:
 Hello!

 I'm using R-2.14.2 on a Windows 7 64 bit machine and I did the following:



 > install.packages("rJava",depen=TRUE)



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/


[R] How to improve the robustness of loess? - example included.

2012-03-10 Thread Emmanuel Levy

Hi,

I posted a message earlier entitled "How to fit a line through the
Mountain crest ..."

I figured loess is probably the best way, but it seems that the
problem is the robustness of the fit. Below I paste an example to
illustrate the problem:

tmp = rnorm(2000)
X.background = 5 + tmp; Y.background = 5 + (10*tmp + rnorm(2000))
X.specific = 3.5 + 3*runif(1000); Y.specific = 5 + 120*runif(1000)
X = c(X.background, X.specific); Y = c(Y.background, Y.specific)
MINx = range(X)[1]; MAXx = range(X)[2]

my.loess = loess(Y ~ X, data.frame(X = X, Y = Y),
                 family="symmetric", degree=2, span=0.1)
lo.pred = predict(my.loess, data.frame(X = seq(MINx, MAXx, length=100)),
                  se=TRUE)
plot(seq(MINx, MAXx, length=100), lo.pred$fit, lwd=2, col=2, type="l")
points(X, Y, col=grey(abs(my.loess$res)/max(abs(my.loess$res))))

As you will see, the red line does not follow the background signal.
However, when decreasing the specific signal to 500 points it
becomes perfect.

I'm sure there is a way to tune the fitting so that it works but I
can't figure out how. Importantly, *I cannot increase the span*
because in reality the relationship I'm looking at is more complex so
I need a small  span value to allow for a close fit.

I foresee that changing the weighting is the way to go, but I do not
really understand how the "weight" option is used (I tried to change
it and nothing happened), and the embedded tricubic weighting
does not seem to be changeable.

So any idea would be very welcome.

Emmanuel


[R] Reading text files from other languages

2012-03-10 Thread Julio Sergio

I'm trying to read a data file that contains characters from the Spanish 
language:

> Station <- read.fwf("LosDatos.txt",widths=c(7,7,25,8,8,5),header=FALSE,
+ skip=3,n=separ[1]-4)

Then the R interpreter issues the following message:

  Error en substring(x, first, last) : 
invalid multibyte string at '<d1>A, S.'
  Calls: read.fwf -> cat -> sapply -> lapply -> FUN -> substring

I know that the message is because there is a Ñ before the text "A, S.".

Is there a way to tell R that the text file is UTF-8 encoded?

Thanks,

--Sergio.


Re: [R] Reading text files from other languages

2012-03-10 Thread Joshua Wiley

Hi Julio,

If you look at the documentation for

?read.fwf

you will see '...' further arguments to be passed to 'read.table'

and if you look at

?read.table

you will see there is an argument called, 'encoding', so, yes.  Just
specify the encoding.
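
As a hedged illustration: the byte d1 in the error message is Ñ in Latin-1, which suggests the file may in fact be Latin-1 rather than UTF-8. In current versions of R, read.fwf and read.table also accept a fileEncoding argument, which re-encodes the file as it is read, whereas encoding only marks the strings as being in a given encoding. The sketch below writes a tiny Latin-1 file (invented contents, not the poster's data) and reads it back:

```r
# Create a two-line fixed-width file encoded as Latin-1 (made-up example data).
tmp <- tempfile(fileext = ".txt")
con <- file(tmp, open = "w", encoding = "latin1")
writeLines(c("0001   ESPA\u00d1A, S.A.",
             "0002   PE\u00d1A, S.    "), con)
close(con)

# Declare the file's encoding so R converts it to the session encoding.
stations <- read.fwf(tmp, widths = c(7, 12), header = FALSE,
                     fileEncoding = "latin1",
                     stringsAsFactors = FALSE, strip.white = TRUE)
print(stations)
unlink(tmp)
```

For a file that really is UTF-8, the same call with fileEncoding = "UTF-8" should apply.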

Cheers,

Josh


On Sat, Mar 10, 2012 at 3:41 PM, Julio Sergio julioser...@gmail.com wrote:



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/


Re: [R] resume on error

2012-03-10 Thread R. Michael Weylandt

? try or ? tryCatch
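
A minimal sketch of that pattern, with a made-up function risky_step() standing in for whatever occasionally fails inside the loop: tryCatch() intercepts the error, reports it, and the loop moves on to the next iteration.

```r
# Hypothetical stand-in for a computation that fails for some inputs.
risky_step <- function(i) {
  if (i %% 4 == 0) stop("bad input: ", i)
  sqrt(i)
}

results <- rep(NA_real_, 10)
for (i in 1:10) {
  results[i] <- tryCatch(
    risky_step(i),
    error = function(e) {
      message("iteration ", i, " skipped: ", conditionMessage(e))
      NA_real_            # value stored when the step fails
    }
  )
}
results
```

try(risky_step(i), silent = TRUE) works along the same lines; it returns an object of class "try-error" on failure, which can be tested with inherits().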

Michael

On Sat, Mar 10, 2012 at 6:08 PM, Alaios ala...@yahoo.com wrote:
 Dear all,
 I would like to ask how I can catch an error in R and then have it
 resume.

 For example, I have a large for loop and I know that for a small number of
 iterations inside that loop there will be errors. How can I ask R in that
 case just to ignore the error and return to the loop?

 I would like to thank you in advance for your help

 B.R
 Ale



Re: [R] Generating abnormal returns in R

2012-03-10 Thread R. Michael Weylandt

Well, it's not hard to write the code for it, but if you know the
secret way to accurately model abnormal returns, you'll be a far
richer man than I quite soon.

Less snidely, one needs to say quite a bit more about a distribution
to specify it than "not Gaussian".

Michael

On Sat, Mar 10, 2012 at 12:46 PM, drsenne dr_se...@pandora.be wrote:
 Hello

 This is my first post on this forum and I hope someone can help me out.
 I have a datafile (weeklyR) with returns of +- 100 companies.
 I acquired this computing the following code:

 library(tseries);
 tickers  = c("GSPC", "BP", "TOT", "ENI.MI", "VOW.BE", "CS.PA",
 "DAI.DE", "ALV.DE", "EOAN.DE", "CA.PA", "G.MI", "DE", "EXR.MI",
 "MUV2.BE", "UG.PA", "PRU.L", "VOD.L", "DPB.BE", "REP.MC", "RWE.BE",
 "AGN.AS", "FTE.PA", "EAD", "LGEN.L", "CNP.PA", "ULVR.L", "TKA.BE",
 "RIO.L", "NOK", "SGO.PA", "RNO.PA", "VIE.PA", "BAYN.DE", "SAN.PA",
 "DG.PA", "SSE.L", "GSK.L", "EN.PA", "LYB", "MLSNP.PA", "IBE.MC",
 "EURS.PA", "AH.AS", "VIV.PA", "TIT.MI", "VOLV-B.ST", "ABI.BR",
 "LHA.DE", "OML.L", "CNA.L", "CON.DE", "PHG", "AZN.L", "SBRY.L",
 "BA.L", "BT-A.L", "AF.PA", "430021.VI", "SL.L", "ERIC-A.ST", "CDI.PA",
 "AAL.L", "ALO.PA", "DELB.BR", "HOT.BE", "GAS.MC", "SU.PA", "OR.PA",
 "FNC.MI", "MRW.L", "MAP.MC", "ML.PA", "IMT.L", "EBK.DE", "PP.PA",
 "ACN", "BTI", "CRG.IR", "CPG.L", "BN.PA", "NG.L", "T7L.BE", "HEIA.AS",
 "ACS.MC", "LG.PA", "STAN.L", "ALU.PA", "FRE.MU", "SW.PA", "WOS.L",
 "AKZA.AS", "HEN.MU")
 for( series in tickers ){
 print(series)
 close <- get.hist.quote(instrument=series, retclass="zoo", quote="AdjClose",
 compression="d", start="2000-1-1", end="2011-12-31", quiet=TRUE)
 if(series==tickers[1]){ pricedata = close }else{ pricedata = merge(
 pricedata , close ) }
 }
 colnames(pricedata) = tickers
 # Avoid a missing because of trade halt for that stock
 pricedata = na.approx(pricedata)
 weeklyR = diff(log(pricedata))
 time(weeklyR) = as.Date(time(weeklyR))
 print(weeklyR)
 save(weeklyR , file = "weeklyR.Rdata")
 write.zoo(weeklyR, file="weeklyR.csv", quote=T, sep=",", na = "NA", dec = ".",
 row.names = F, col.names = T)

 Now I need to make a market model in R so i can generate abnormal returns
 from these stocks. As market index I would like to use the GSPC. I also need
 to consider abnormal returns calculated over a sixty-trading-day window.
 Can this be done in R? Is it difficult to write this code?

 Any help would be much appreciated!

 thanks

 drsenne


 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Generating-abnormal-returns-in-R-tp4462541p4462541.html
 Sent from the R help mailing list archive at Nabble.com.


Re: [R] applying a function in list of indexed elements of a vector:

2012-03-10 Thread R. Michael Weylandt

Your code for iy doesn't work as provided; I'll assume you meant this instead:

iy <- list(c(1, 2), c(1, 2), c(1, 2, 3, 4), c(2, 3, 5), c(4), c(5, 6,
7), c(7, 8, 9))

Then

sapply(iy, function(x) sum(Y1[x]))
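
Put together as a self-contained sketch (same data as in the question; the names Y2_sum and Y2_mean are only for illustration), the approach gives the sums the poster expected, and swapping sum for mean gives the means:

```r
Y1 <- c(8, 11, 7, 5, 6, 3, 6, 3, 3)
iy <- list(c(1, 2), c(1, 2), c(1, 2, 3, 4), c(2, 3, 5), c(4),
           c(5, 6, 7), c(7, 8, 9))

# Sum (or mean) of the elements of Y1 picked out by each index vector.
Y2_sum  <- sapply(iy, function(i) sum(Y1[i]))
Y2_mean <- sapply(iy, function(i) mean(Y1[i]))

Y2_sum   # 19 19 31 24  5 15 12
```

The key point is that iy has to be a list: c(c(1, 2), c(1, 2), ...) flattens everything into one long vector, so the grouping is lost.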

Michael

On Sat, Mar 10, 2012 at 5:01 PM, aldi a...@dsgmail.wustl.edu wrote:
 Hi,

 I have a vector
 Y1 <- c(8, 11,  7,  5,  6,  3,  6, 3,  3)
 and an index
 iy <- c(c(1, 2),c(1 2), c(1, 2, 3, 4), c(2, 3, 5), c(4), c(5, 6, 7), c(7, 8,
 9))

 how can I produce the mean, or the sum of the elements specified in the
 index iy from the vector Y1?

 expecting something like this for the sum:
 Y2
 19 19 31 24 5 15 12

 I thought lapply function may perform this, but does not work:
 Y2 <- lapply(Y1[iy],sum)

 Any suggestion?
 TIA,

 Aldi


Re: [R] Generating abnormal returns in R

2012-03-10 Thread Mark Leeds

Hi Michael: "abnormal returns" is a term used in finance to describe the
residual return after estimating a return model (either CAPM or APT or
whatever), so one needs to build a return model (CAPM is the easiest) and
then just calculate the residuals. These are termed the residual returns
and can be negative or positive.
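
A rough sketch of that recipe on simulated data (the return series, the 0.0002 intercept and the beta of 1.2 are invented for illustration; a real analysis would use the downloaded returns, with GSPC as the market column): regress the stock's returns on the market's returns over a window and take the residuals as abnormal returns.

```r
set.seed(42)
n <- 60                                   # sixty-trading-day window
market <- rnorm(n, mean = 0.0003, sd = 0.01)
stock  <- 0.0002 + 1.2 * market + rnorm(n, sd = 0.005)

# Market model: stock_t = alpha + beta * market_t + e_t
fit <- lm(stock ~ market)

abnormal   <- residuals(fit)              # abnormal (residual) returns
cumulative <- cumsum(abnormal)            # cumulative abnormal return

coef(fit)
head(abnormal)
```

In an event study the model is usually estimated on one window and the abnormal returns computed on a later one, as actual return minus predict(fit, newdata); the in-sample residuals here are only the simplest version.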

drsenne: you should send that to R-Sig-Finance or look around on the net.
It's an interesting exercise, but you need to understand R pretty well,
install quantmod, and be able to get the prices for all the stocks and run
regression models. I don't know where an R example of it is, but Eric Zivot
has a nice example in his S+Finmetrics book. If you can get your hands on
that, it will show all the details. But if you send your question to
R-Sig-Finance, I bet someone over there will know where a good R example
lies.


Mark

P.S.: also check out the website of Systematic Investor. I don't know if he
does exactly the above, but he does a lot of related things and provides
R code.









On Sat, Mar 10, 2012 at 7:03 PM, R. Michael Weylandt 
michael.weyla...@gmail.com wrote:


Re: [R] Reading text files from other languages

2012-03-10 Thread Julio Sergio

Joshua Wiley jwiley.psych at gmail.com writes:

 
 
 

Thanks Joshua!

Best regards,

--Sergio.


Re: [R] How to fit a line through the Mountain crest, i.e., through the highest density of points - in a loess-like fashion.

2012-03-10 Thread David Winsemius



On Mar 10, 2012, at 3:55 PM, Emmanuel Levy wrote:

 Hi,

 I'm trying to normalize data by fitting a line through the highest density
 of points (in a 2D plot).
 In other words, if you visualize the data as a density plot, the fit I'm
 trying to achieve is the line that goes through the crest of the mountain.

Are you familiar with the kde2d or bkde2D functions in various
packages? If you then collected the max density for each X and Y you
might want to see whether that 2-d function would follow a
sufficiently regular path that would represent the projection of the
ridge on the z=0 plane.
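
A minimal sketch of that idea, assuming kde2d() from the MASS package and simulated data shaped like the example posted in the related "robustness of loess" thread: estimate the 2-D density on a grid and, for each grid value of x, keep the y at which the density is highest. The grid size of 50 and the default bandwidths are arbitrary choices.

```r
library(MASS)   # provides kde2d()

set.seed(1)
t <- rnorm(2000)
x <- c(5 + t, 3.5 + 3 * runif(1000))                  # background + specific signal
y <- c(5 + 10 * t + rnorm(2000), 5 + 120 * runif(1000))

dens <- kde2d(x, y, n = 50)   # dens$z[i, j] is the density at (dens$x[i], dens$y[j])

# For every x grid point, the y grid point with maximal density: the "crest".
ridge <- data.frame(x = dens$x,
                    y = dens$y[apply(dens$z, 1, which.max)])
head(ridge)
```

The resulting path can be jumpy where the data are sparse, so it may still need smoothing before it is used for normalization.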




 This is similar yet different to what LOESS does.

Do you want a curve or a line?

 I've been using loess before, but it does not do exactly that, as it
 takes into account all points.
 Although points farther from the fit have a smaller weight, they result in
 the fit being a bit off the crest.

 Do you know a package or maybe even an option in loess that would allow me
 to achieve this?

I don't. I happen to have a dataset where I could test it. But you are
likely to get better responses if you provide a test case.

 Any advice or idea appreciated.

 Emmanuel

 [[alternative HTML version deleted]]

Plain text is preferred.

--

David Winsemius, MD
West Hartford, CT


Re: [R] How to fit a line through the Mountain crest, i.e., through the highest density of points - in a loess-like fashion.

2012-03-10 Thread Emmanuel Levy

Hi,

Thanks a lot for your reply - I posted a second message where I
provide a dummy example, entitled
"How to improve the robustness of loess? - example included."

I need to fit a curve which makes it a bit difficult to work with kde2d only.

I'm actually trying to use kde2d in combination with loess - basically
I give the output density of kde2d as weights in the loess function.
It seems to give nice results :)

In my second post I wrote that the "weight" option did not work, but
that's because I was writing "weigth" - not sure why I did not get an
error message.

I'll post the lines of code as a reply to the second post.

All the best,

Emmanuel




On 10 March 2012 19:46, David Winsemius dwinsem...@comcast.net wrote:


Re: [R] How to improve the robustness of loess? - example included.

2012-03-10 Thread Emmanuel Levy

Ok so this seems to work :)


library(MASS)  # provides kde2d()

tmp=rnorm(2000)
X.background = 5+tmp
Y.background = 5+ (10*tmp+rnorm(2000))
X.specific = 3.5+3*runif(3000)
Y.specific = 5+120*runif(3000)

X = c(X.background, X.specific)
Y = c(Y.background, Y.specific)

MINx=range(X)[1]
MAXx=range(X)[2]
MINy=range(Y)[1]
MAXy=range(Y)[2]

  ## estimates the density for each datapoint
nBins=50
my.lims= c(range(X,na.rm=TRUE),range(Y,na.rm=TRUE))

z1 = kde2d(X,Y,n=nBins, lims=my.lims, h= c(
(my.lims[2]-my.lims[1])/(nBins/4) ,  (my.lims[4]-my.lims[3])/(nBins/4)
) )
X.cut = cut(X, seq(z1$x[1], z1$x[nBins],len=(nBins+1) ))
Y.cut = cut(Y, seq(z1$y[1], z1$y[nBins],len=(nBins+1) ))
xy.cuts = data.frame(X.cut,Y.cut, ord=1:(length(X.cut)) )
density = data.frame( X=rep(factor(levels(X.cut)),rep(nBins) ),
Y=rep(factor(levels(Y.cut)), rep(nBins,nBins) ) , Z= as.vector(z1$z))

xy.density = merge( xy.cuts, density, by=c(1,2), sort=FALSE, all.x=TRUE)
xy.density = xy.density[order(x=xy.density$ord),]

### Now uses the density as a weight
my.loess = loess(Y ~ X, data.frame( X = X, Y = Y),
family=symmetric, degree=2, span=0.1, weights= xy.density$Z^3)
lo.pred = predict(my.loess, data.frame(X = seq(MINx, MAXx,
length=100)), se=TRUE)
plot( seq(MINx, MAXx, length=100), lo.pred$fit, lwd=2,col=2, l)
#, ylim=c(0, max(tmp$fit, na.rm=TRUE) ) , col=dark grey)
points(X,Y, pch=., col= grey(abs(my.loess$res)/max(abs(my.loess$res))) )


On 10 March 2012 18:30, Emmanuel Levy emmanuel.l...@gmail.com wrote:

Re: [R] Help on subgraphs in xyplot of lattice library

2012-03-10 Thread R. Michael Weylandt

That's not useful sample data -- like I said, dput() some sample data
and send it. I can try to figure out how to plot what you're asking,
but there is literally no data in what you sent.

Don't copy and paste the output of a print command -- use dput(). (You'll
understand why when you see it.)

And like I also said, if you could give a sketch as to how you would
do this in base graphics, it will be much easier for us to help you
translate into lattice graphics. I only ask because you said you could
do so.
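
For illustration only, here is what dput() does with a small invented data frame (the column names echo some of those in the question; the values are made up). The printed structure(...) text can be pasted into an email and turned back into the same object by whoever reads it.

```r
d <- data.frame(idx = 1:3,
                true_value = c(0.5, 1.5, 2.5),
                samplesize = c(1000L, 1000L, 5000L))

dput(d)   # prints R code that recreates d

# Round trip: capture the dput() text and evaluate it again.
txt <- capture.output(dput(d))
d2  <- eval(parse(text = txt))
all.equal(d, d2)
```

Unlike the output of print(), the dput() text keeps column types and row names, which is why it is preferred for posting sample data.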

Michael

Also please send plain text if you know how.

On Sat, Mar 10, 2012 at 2:42 PM, Chee Chen chee.c...@yahoo.com wrote:
 Hi, Michael,
 Thank you for your help!
 In its simplified form, the data frame looks like:

 idx true_value mean diff_mean1 diff_mean2 diff_mean3 std diff_std1 diff_std2 diff_std3 samplesize
 1                                                                                      1000
 2                                                                                      1000
 3                                                                                      1000
 4                                                                                      1000
 5                                                                                      1000
 1                                                                                      5000
 2                                                                                      5000
 3                                                                                      5000
 4                                                                                      5000
 5                                                                                      5000

 I would like the plot to be:
 row1 has 4 subplots for samplesize 1000;
 row2  has 4 subplots for samplesize 5000;

 in each row:
  the 1st subplot is true_value against mean;
  the 2nd is an overlay plot for idx against diff_mean1, idx against
 diff_mean2, idx against diff_mean3;
 the 3rd is true_value against std;
 the 4th is an overlay plot for idx against diff_std1, idx against diff_std2,
 idx against diff_std3.

 I have looked at sample xyplot codes, but still did not know how to realize
 this.

 Thanks again!
 Chee






 From: R. Michael Weylandt
 Sent: Saturday, March 10, 2012 12:20 PM
 To: Chee Chen
 Cc: R-ORG
 Subject: Re: [R] Help on subgraphs in xyplot of lattice library

 What does your data look like? dput() is your friend.

 Also, it'd be helpful if you could give base graphics code for
 more-or-less what you are looking for (since you can do so already) as
 it's pretty hard to describe graphics without pictures.

 Running example(xyplot) might help you get started as well.

 Michael

 On Sat, Mar 10, 2012 at 12:04 PM, Chee Chen chee.c...@yahoo.com wrote:
 Dear All,
 I would like to ask a question on how to do overlay plots in each subgraph
 of xyplot.
 1.  I did simulations for m=1000, 2500, 5000, 1, as the sample sizes.
 2. for each sample size value m,  4 graphs are generated; each graph
 contains overlayed comparisons between 4 methods,
 3.  now I want put them into a 4-by-4  plot by xyplot, i.e.,  4 sample
 size values, each of which has 4 plots.

 I know how to do this using plot, but the spaces between subplots are big.

 I do not know how to make each subplot in xyplot an overlayed one as it
 would appear using plot.

 Any help would be appreciated!

 Thank you,
 Chee




Re: [R] rpanel / list error

2012-03-10 Thread R. Michael Weylandt

Your immediate problem seems to be that you use "sum" as a variable
name when it is also a function name. You also have scoping issues
that result from how you're using with() -- if you don't return an
object, it gets thrown away after the with() function is done (part of
the functional paradigm) -- I've started to clean this up a little,
but it now bumps up against the fact you don't return things in the
rpanel bits -- I don't really use that package much but hopefully this
gets you going in the right way:

library(rpanel)

main <- function(panel) {
  SUM <- with(panel, {
    LAST = 1100
    START = 0
    INDX = 0    # Starting Conditions
    revenue = 0
    minStock = panel$minStock
    maxStock = 100
    inventory = 100
    order_costs = 0
    storage_costs = 0
    orderlevel = k
    SUM = list(ninventory = inventory,
               order_costs = 0,
               storage_costs = 0,
               revenue = 0,
               index = INDX)
    # initial list containing values

    while (SUM$index < LAST && inventory > 0) {
      SUM$order_costs = SUM$order_costs + order_costs
      SUM$storage_costs = SUM$storage_costs + storage_costs
      SUM$ninventory = SUM$ninventory + inventory

      SUM$index = SUM$index + 1
    }
    SUM    # returned as the value of with()
  })
  print(SUM)
  sis = list(Time = SUM$index,
             StorageCosts = SUM$storage_costs,
             OrderCosts = SUM$order_costs,
             fInventory = SUM$ninventory)
  print(sis)
  return(sis)
}

panel <- rp.control(title="Stochastic Case")
rp.button(panel, action=main, title="Calculate")
rp.slider(panel, k, from=10, to=90, resolution=10, showvalue=TRUE,
          title="Select Order Size", initval=70)
rp.slider(panel, minStock, from=10, to=90, resolution=10, initval=50,
          title="Minimum Stock Level", showvalue=TRUE)


Note also that index is a function so you need to be smart in how
you use that name.
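
The scoping point about with() can be seen in isolation. In this small sketch a plain list stands in for the rpanel object (the names order_total and leaked are made up): anything assigned inside with() disappears when it returns, so the value has to come back as the result of the last expression.

```r
panel <- list(k = 70, minStock = 50)   # stand-in for an rpanel object

# Assignments made inside with() live in a temporary environment ...
with(panel, {
  order_total <- k + minStock
})
leaked <- exists("order_total", inherits = FALSE)
leaked        # FALSE: the variable is gone once with() returns

# ... so return the value as the last expression and assign the result.
order_total <- with(panel, {
  k + minStock
})
order_total   # 120
```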

Michael

On Fri, Mar 9, 2012 at 6:59 AM, jism7690 james.jism.ca...@gmail.com wrote:
 Hi Michael,

 Thank you for your reply. I have uploaded the minimum, I have left out the
 formulas for calculating the amounts as they are not important to the loop.
 Basically I have a while loop running that adds to the list of values and
 then outside this loop I have a list called sis, this is the list that is
 causing the error. I would like this list to return the values with panel,
 before I used rpanel it was returning values perfectly.

 Thanks

 main <- function(panel)
 {
 with(panel,{

        LAST = 1100
        START = 0
        index = 0                            # Starting Conditions
        revenue = 0
        minStock = panel$minStock
        maxStock = 100
        inventory = 100
        order_costs = 0
        storage_costs = 0
        orderlevel = panel$k
        sum = list(ninventory=inventory,order_costs=0,storage_costs=0,revenue = 0)
 # initial list containing values

        while(index < LAST && inventory > 0) {

        sum$order_costs = sum$order_costs + order_costs
        sum$storage_costs = sum$storage_costs + storage_costs
        sum$ninventory = sum$ninvenotry + inventory

        index = index + 1

 }
 })
        sis = list(Time = index,StorageCosts=sum$storage_costs,OrderCosts=
 sum$order_cost,fInventory = sum$ninventory)
        return(sis)
  }


 panel <- rp.control(title="Stochastic Case", size=panel.size)
 rp.button(panel,action=main,title="Calculate",pos=pos.go.button)
 rp.slider(panel,k,from=10,to=90,resolution=10,showvalue=TRUE,
 title="Select Order Size",pos=pos.order.slider,initval=70)
 rp.slider(panel,minStock,from=10,to=90,resolution=10,pos=pos.minstock.slider,
 initval=50,title="Minimum Stock Level",showvalue=TRUE)



 --
 View this message in context: 
 http://r.789695.n4.nabble.com/rpanel-list-error-tp4457308p4459254.html
 Sent from the R help mailing list archive at Nabble.com.


Re: [R] (Fisher) Randomization Test for Matched Pairs: Permutation Data Setup Based on Signs

2012-03-10 Thread R. Michael Weylandt

In general, I *think* this is a hard problem (it sounds knapsack-ish)
but since you are on small enough data sets, that's probably not so
important: if I understand you right, this little function will help
you.

plusminus - function(n){
t(as.matrix(do.call(expand.grid, rep(list(c(-1,1)), n
}
plusminus(3)
plusminus(5)

If you multiply the output of this function by your data set you will
have rows corresponding to all possible sign choices: e.g.,

plusminus(3) * c(1,2,3)

Then you can colSums() using only the positive elements:

x - plusminus(3) * c(1,2,3)
x[x  0] - 0

colSums(x)

To wrap this all in one function: I'd do something like this:

test.statistic <- function(v){
m <- t(as.matrix(do.call(expand.grid, rep(list(c(-1, 1)), length(v)))))
x <- m * v
x[x < 0] <- 0
out <- rbind(m * v, colSums(x))
rownames(out)[length(rownames(out))] <- "Sum of Positive Elements"
out
}

X <- test.statistic(c(-16, -4, -7, -3, -5, +1, -10))
X[,1:10]

Hopefully that helps (I'm a little fuzzy on your overall goal -- so
that second bit might be a red herring)
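
For completeness, here is a rough sketch of how the enumerated sums could be
turned into an exact one-sided p-value. It assumes the quantity of interest is
the proportion of the 2^n sign assignments whose T is no larger than the
observed T. The names d, signs, T.all, T.obs and p.lower are made up for this
illustration and do not come from the original post.

```r
# Sketch: exact one-sided p-value for the matched-pairs randomization test.
# `d` holds the observed differences from the question.
d <- c(-16, -4, -7, -3, -5, 1, -10)

# All 2^n sign patterns, one per column (same construction as plusminus()).
signs <- t(as.matrix(do.call(expand.grid, rep(list(c(-1, 1)), length(d)))))

# Apply each pattern to |d| and sum the positive entries of each column.
perm  <- signs * abs(d)
T.all <- colSums(pmax(perm, 0))

# Observed statistic: sum of the positive differences.
T.obs <- sum(d[d > 0])

# Proportion of sign patterns with T at least as small as the observed T.
p.lower <- mean(T.all <= T.obs)
p.lower
```

With the seven differences from the question, only two of the 128 assignments
(all signs negative, or only the |1| positive) give T <= 1, so p.lower is
2/128.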

Michael


On Fri, Mar 9, 2012 at 12:49 AM, Ghandalf mool...@hotmail.com wrote:
 Hi,

 I am currently attempting to write a small program for a randomization test
 (based on rank/combination) for matched pairs. If you will please allow me
 to introduce you to some background information regarding the test prior to
 my question at hand, or you may skip down to the bold portion for my issue.

 There are two sample sizes; the data, as I am sure you guessed, is matched
 into pairs and each pair's difference is denoted by Di.

 The test statistic = *T* = Sum(Di) (only for those Di > 0).

 The issue I am having is based on the method required to use in R to setup
 the data into the proper structure. I am to consider the absolute value of
 Di, without regard to their sign. There are 2^n ways of assigning + or -
 signs to the set of absolute differences obtained, where n = the number of
 Dis. That is, we can assign + signs to all n of the |Di|, or we might assign
 + to |D1| but - signs to |D2| to |Dn|, and so forth.

  So, for example, if I have *D1=-16, D2=-4, D3=-7, D4=-3, D5=-5, D6=+1, and
 D7=-10 and n=7. *
 I need to consider the 2^7 ways of assigning signs that result in the lowest
 sum of the positive absolute difference. To exemplify further, we have
 *
 -16, -4, -7, -3, -5, -1, -10            T = 0
 -16, -4, -7, -3, -5, +1, -10           T = 1
 -16, -4, -7, +3, -5, -1, -10           T = 3
 -16, -4, -7, +3, -5, +1, -10          T = 4 *
 ... and so on.

 So, if you are willing to help me, I am having trouble on setting up my data
 as illustrated above./ How do I create (a code for) the 2^n lines of data
 required with all the possible combinations of + and - in order to calculate
 the positive values in each line (the test statistic, T)?/ I have tried to
 use combn(d=data set, n=7) with a data set, d, consisting of both the
 positive and negative sign of the respective value, to no avail.

 I apologize if this is lengthy, I was not sure how to ask the aforementioned
 question without incorrectly portraying my thoughts. If any clarification is
 required then I will be more than willing to oblige with any further
 explanation. I have searched for possible solutions, but alas, came out
 empty handed.

 Thank you.


[R] Which non-parametric regression would allow fitting this type of data? (example given).

2012-03-10 Thread Emmanuel Levy

Hi,

I'm wondering which function would allow fitting this type of data:

tmp=rnorm(2000)
X.1 = 5+tmp
Y.1 = 5+ (5*tmp+rnorm(2000))
tmp=rnorm(100)
X.2 = 9+tmp
Y.2 = 40+ (1.5*tmp+rnorm(100))
X.3 = 7+ 0.5*runif(500)
Y.3 = 15+20*runif(500)
X = c(X.1,X.2,X.3)
Y = c(Y.1,Y.2,Y.3)
   plot(X,Y)

The problem with loess is that the distances used for the goodness of fit are
measured along the Y-axis. Here, however, they would need to be
measured along the normals to the fitted curve. Is there a function
that provides this option?

A simple trick in that case consists in swapping X and Y, but I'm
wondering if there is a more general solution?

Thanks for your input,

Emmanuel


Re: [R] function input as variable name (deparse/quote/paste) ??

2012-03-10 Thread casperyc

Sorry if I wasn't stating what I really wanted or it was a bit confusing.

Basically, there are MANY datasets to run using the same function.

I have written a function that analyzes a dataset and returns a LIST of useful
output in the variable 'res' (in the workspace).

I also created another script run.r such as

myname(dat1)
myname(dat2)
myname(dat3)
myname(dat4)
myname(dat5) 

For now, the output 'res' (the list) in the main workspace is overwritten
each time.

I want each result to have a different suffix to tell them apart, so I can have a
look later, after the batch has run.

Thanks.

casper


Re: [R] Which non-parametric regression would allow fitting this type of data? (example given).

2012-03-10 Thread Bert Gunter

Thanks for the example.

Have you tried fitting a principal curve via either the princurve or
pcurve packages?  I think this might work for what you want, but no
guarantees.

Note that loess, splines, etc. are all fitting y|x, that is, a
nonparametric regression of y on x. That is not what you say you want,
so these approaches are unlikely to work.
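
To see the distinction on a case where the answer is easy to check, the sketch
below compares an ordinary y|x fit with an orthogonal fit on the first
(straight-line) cluster of the example only. The orthogonal line is taken from
the first principal component via prcomp(), which is a linear stand-in and not
a principal curve. The seed and the names slope.ols and slope.orth are
assumptions made for illustration.

```r
# Sketch: vertical (y|x) versus orthogonal fitting on a straight-line toy case.
set.seed(1)                      # arbitrary seed, for reproducibility only
tmp <- rnorm(2000)
X <- 5 + tmp
Y <- 5 + (5 * tmp + rnorm(2000))

# Ordinary regression of Y on X minimises vertical distances.
slope.ols <- unname(coef(lm(Y ~ X))[2])

# The first principal component minimises perpendicular distances.
pc1 <- prcomp(cbind(X, Y))$rotation[, 1]
slope.orth <- unname(pc1["Y"] / pc1["X"])

c(ols = slope.ols, orthogonal = slope.orth)
```

The orthogonal slope is never flatter than the y|x slope, since it always lies
between the slope from regressing Y on X and the one implied by regressing X
on Y.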


-- Bert

On Sat, Mar 10, 2012 at 6:20 PM, Emmanuel Levy emmanuel.l...@gmail.com wrote:
 Hi,

 I'm wondering which function would allow fitting this type of data:

    tmp=rnorm(2000)
    X.1 = 5+tmp
    Y.1 = 5+ (5*tmp+rnorm(2000))
    tmp=rnorm(100)
    X.2 = 9+tmp
    Y.2 = 40+ (1.5*tmp+rnorm(100))
    X.3 = 7+ 0.5*runif(500)
    Y.3 = 15+20*runif(500)
    X = c(X.1,X.2,X.3)
    Y = c(Y.1,Y.2,Y.3)
   plot(X,Y)

 The problem with loess is that the distances used for the goodness of fit are
 measured along the Y-axis. Here, however, they would need to be
 measured along the normals to the fitted curve. Is there a function
 that provides this option?

 A simple trick in that case consists in swapping X and Y, but I'm
 wondering if there is a more general solution?

 Thanks for your input,

 Emmanuel




-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm


Re: [R] function input as variable name (deparse/quote/paste) ??

2012-03-10 Thread Berend Hasselman


On 11-03-2012, at 01:01, casperyc wrote:

 Sorry if I wasn't stating what I really wanted or it was a bit confusing.
 
 Basically, there are MANY datasets to run using the same function.
 
 I have written a function that analyzes a dataset and returns a LIST of useful
 output in the variable 'res' (in the workspace).
 

Your function uses return?
Probably not.

 I also created another script run.r such as
 
 myname(dat1)
 myname(dat2)
 myname(dat3)
 myname(dat4)
 myname(dat5) 
 
 For now, the output 'res' (the list) in the main workspace is overwritten
 each time.
 
 I want each result to have a different suffix to tell them apart, so I can have a
 look later, after the batch has run.

Well, if that is the case then there is a better way than doing global 
assignments in a function.

Make sure myfunction returns the list of results with return() and don't do 
global assignment with <<-

for (k in 1:5) {
    dataname <- paste("data", k, sep="")
    resname  <- paste("res", k, sep="")
    assign(resname, myfunction(get(dataname)))
}
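
An alternative that avoids creating many numbered objects is to collect the
results in one named list. This is only a sketch: data1, data2 and myfunction
below are small hypothetical stand-ins for the real data sets and analysis
function.

```r
# Hypothetical stand-ins for the poster's data sets and analysis function.
data1 <- c(1, 2, 3)
data2 <- c(4, 5, 6)
myfunction <- function(x) {
    return(list(mean = mean(x), n = length(x)))
}

# Look the data sets up by name and apply the function to each one.
datanames <- paste("data", 1:2, sep = "")
results   <- lapply(mget(datanames), myfunction)

# Label each result so it can be inspected after the batch has run.
names(results) <- paste("res", 1:2, sep = "")

results$res2$mean
```

Each result is then available as results$res1, results$res2 and so on, and the
whole batch can be saved or looped over as a single object.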


Berend


Re: [R] too many open devices

2012-03-10 Thread Patrick Connolly

On Sat, 10-Mar-2012 at 02:21PM -0600, harold kincaid wrote:

| I am getting too many open devices after 60 graphs. The archived
| comments on this problem were too sketchy to be helpful. Any ideas?

With minimal information, my guess might not be correct, but I suspect
you're plotting to a Windows device and a new one is opened for each
of your plots.  That would be some clutter on your screen.

You'd make life simpler if you used a pdf device that uses a new page
for each of your plots which can be hundreds of pages if you like.

Check out the help for pdf(), making sure you don't forget the
dev.off() part.
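
A minimal sketch of that approach, assuming the 60 graphs are produced in a
loop. The file name and the random data are placeholders.

```r
# Sketch: send many plots to one multi-page PDF instead of opening new windows.
outfile <- file.path(tempdir(), "plots.pdf")   # placeholder file name

pdf(outfile)                      # open a single pdf device
for (i in 1:60) {
    # each high-level plot call starts a new page in the same file
    plot(rnorm(20), main = paste("Graph", i))
}
dev.off()                         # close the device so the file is complete
```

Only one device is open at any time, so the limit on open devices is never
reached however many pages are written.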


HTH
-- 
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.   
   ___Patrick Connolly   
 {~._.~}   Great minds discuss ideas
 _( Y )_ Average minds discuss events 
(:_~*~_:)  Small minds discuss people  
 (_)-(_)  . Eleanor Roosevelt
  
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.

