Re: [R] manova: R vs SAS...need some clarification

2008-08-13 Thread Prof Brian Ripley

Please see the footer of this message: we need to know what you did.

Also, SAS may have made some assumptions for you without telling you (for 
example used a numerically ill-conditioned covariance matrix), and we 
don't know what you did in SAS, either.


On Tue, 12 Aug 2008, Pedro Mardones wrote:


Dear all;
Working with a 'fat' data set (700 variables / 50 samples) and trying
to run a manova test on it (I'm aware that it's not the best option
for this kind of data set), I got the error in the summary.manova
function about the rank of the residuals (rank < # variables). OK. The
thing that I don't understand is why I don't get the same type of
error in SAS. There seems to be no problem with rank deficiency and
the fit statistics in SAS (no negative DF or something like that...).
I'm sure there must be some differences in the way the manova test is
calculated but I don't know what they are, so I'll appreciate any
comments...
Thanks
PM

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



Re: [R] Update Package on CRAN

2008-08-13 Thread Uwe Ligges



stephen sefick wrote:

To update a package on CRAN I just update all of the version
information stuff etc. and then upload it to the ftp site?
Stephen Sefick



Yes, just build the package and submit it to CRAN as before, but with an 
increased version number.


Uwe Ligges



Re: [R] Installing R in Ubuntu

2008-08-13 Thread Senthil Kumar M
On Tue, Aug 12, 2008 at 9:24 PM, Shreyasee Pradhan
[EMAIL PROTECTED] wrote:
> Hi,
>
> I am running Ubuntu on my Windows OS through VMware.
> I am trying to install R in Ubuntu, but am not getting anywhere with the
> commands that are on the site.
> Can anyone please tell me how to install it, stepwise, with the commands to
> be used?
> As I am new to Ubuntu as well, I do not know the commands very well.
<snipped>

Hi,

What commands did you try? What worked and what didn't? Which site
did you refer to?

Please read the posting guidelines here:
http://www.r-project.org/posting-guide.html

In the Ubuntu command line, try:

sudo aptitude install r-base

And for a list of R packages that you can install from the Ubuntu repositories:

aptitude search r- | grep "[^A-Za-z0-9] r-"

Install them like this:

sudo aptitude install r-cran-<package-name>

HTH,

Senthil

-/

You see, but you do not observe. The distinction is clear.
Sir Arthur Conan Doyle in, The Memoirs of Sherlock Holmes



Re: [R] aligned memory allocation in C

2008-08-13 Thread Prof Brian Ripley

On Tue, 12 Aug 2008, Jeffrey Horner wrote:


> Christophe Dutang1 wrote:
>
>> Hi,
>>
>> I'm currently porting to R the SF Mersenne Twister algorithm of Matsumoto and
>> Saito. To get the full power of their code, I want to use their function
>> fill_array32, which needs aligned memory. That is to say, I need to use the C
>> function memalign on Windows, posix_memalign on Linux and classic malloc on
>> Mac OS. 'Writing R Extensions' recommends using the R_alloc function
>> to allocate memory in C.
>>
>> Does R_alloc return a pointer to aligned memory?
>> If not, how can I do this?
>> Probably not, because R crashes when I successively call R_alloc and
>> fill_array32 (cf. below) on my MacBook with R 2.7.1.
>
> You can still do this. Just take the address returned from R_alloc and test
> for alignment. If it's not aligned, then just use an aligned address beyond
> the one returned.


We haven't been told what the desired alignment is (and those functions 
need to be told).  On 32-bit Mac OS X, R_alloc is definitely aligned on 
4-byte boundaries (on 64-bit OSes it is usually 8-byte aligned).


> (But then the question is, which direction beyond the one returned? How does
> one test for that?)


Addresses always go upwards.  So if you want 64-byte alignment you need to 
allocate a block at least 64 bytes longer than required, and go up to the 
nearest multiple of 64.
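
The rounding-up step is plain integer arithmetic. A minimal sketch in R, treating the address as an ordinary number; the helper name align_up and the example address are hypothetical, and in C the same computation would be applied to the pointer value converted to an integer type:

```r
# Hypothetical helper: round an address up to the next multiple of `alignment`.
align_up <- function(addr, alignment = 64) {
  ((addr + alignment - 1) %/% alignment) * alignment
}

addr    <- 1000004          # made-up address, only 4-byte aligned
aligned <- align_up(addr)   # first 64-byte boundary at or above addr
padding <- aligned - addr   # bytes skipped; always smaller than the alignment
```

Because the padding is always smaller than the alignment, over-allocating by 64 bytes leaves enough room for the shifted block.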


BTW, this is clearly an R-devel question -- see the posting guide.




> Jeff
>
>> Thanks in advance
>>
>> Kind regards
>>
>> Christophe
>>
>> PS: http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/SFMT/howto-compile.html
>> provides an example of memalign.
>>
>> PPS: Mac OS report
>>
>> [removed]

--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



Re: [R] Installing R in Ubuntu

2008-08-13 Thread Shreyasee Pradhan
Hi,

Thanks for that.

The way I tried is as follows:
1) Downloaded the r-base package
2) Went to the directory where the r-base package was downloaded, from the
command line
3) Entered the command
   sudo apt-get install r-base
But I got the error that it "Couldn't find r-base command".

I don't understand where I went wrong.
I will definitely try the following commands.

Thanks,
Shreyasee



Re: [R] Installing R in Ubuntu

2008-08-13 Thread poppyer

Shreyasee Pradhan [EMAIL PROTECTED] writes:

> Hi,
>
> Thanks for that.
>
> The way I tried is as follows:
> 1) Downloaded the r-base package
> 2) Went to the directory where the r-base package was downloaded, from the
> command line
> 3) Entered the command
>    sudo apt-get install r-base
> But I got the error that it "Couldn't find r-base command".
>
> I don't understand where I went wrong.
> I will definitely try the following commands.

I think you probably don't have the universe repository in your
/etc/apt/sources.list. All R-related packages are in universe.

Try googling something like "sources.list generator" if you are new to
apt's sources.list.


Cheers

poppyer



Re: [R] sqlQuery with date attribute

2008-08-13 Thread Samuel Bächler

Hi Many

> GetReturn <- function(code, date)
> {
>   db <- "C:/Test.mdb"
>   channel <- odbcConnectAccess(db)
>   ssql <- paste("select * from tblCalendarDate Where CalendarID =", code,
>                 "and DateRebal =", date)
>
>   print(ssql)  # so that I can see what ssql contains
>   mydata <- sqlQuery(channel, ssql)
>   mydata
> }

[snip]

> This is the content of my table tblCalendarDate:
> CalendarID  DateRebal
> 1           29/09/2006
> 1           10/10/2006
> 1           20/10/2006
> 1           31/10/2006
> 1           10/11/2006
> 1           20/11/2006
>
> Actually, the channel is open but the query on the table did not
> run correctly; here is the
> result of the function when I run GetReturn(1, "2007-03-01") for example:

I think something goes wrong with the formatting of the date. In the
table tblCalendarDate you have it like *29/09/2006*, but in your function
you have it as *2007-03-01*. Dig deeper by experimenting with the date
format. You can experiment in Access itself to see what kind of dates
Access accepts.


s.



Re: [R] sqlQuery with date attribute

2008-08-13 Thread Abderrazzak MANY

Thank you for your answer.
Actually, I've tried with this function, where I added the # symbol
around the date:


GetReturn <- function(code, date)
{
  db <- "C:/Test.mdb"
  channel <- odbcConnectAccess(db)
  ssql <- paste("select * from tblCalendarDate Where CalendarID =", code,
                "and DateRebal= #", date, "#")
  print(ssql)  # so that I can see what ssql contains
  mydata <- sqlQuery(channel, ssql)
  mydata
}
GetReturn(1, "2007-01-10")

And it works when I simply run the command GetReturn(1, "2007-03-01")
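
The query text can be checked without opening a database connection. A minimal sketch, assuming the date arrives as a "yyyy-mm-dd" string and that Access reads a `#...#` date literal in month/day/year order; the helper name build_query is hypothetical and not part of RODBC:

```r
# Hypothetical helper: build only the SQL text, so it can be inspected
# before it is handed to sqlQuery().
build_query <- function(code, date) {
  d <- as.Date(date)                        # parses "2007-03-01" style input
  access_date <- format(d, "#%m/%d/%Y#")    # assumed Access literal, month first
  paste0("select * from tblCalendarDate where CalendarID = ", code,
         " and DateRebal = ", access_date)
}

ssql <- build_query(1, "2007-03-01")
print(ssql)
```

Going through as.Date() first means a malformed date fails in R rather than silently producing a query that matches no rows.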




Re: [R] summary.manova rank deficiency error + data

2008-08-13 Thread Peter Dalgaard

Pedro Mardones wrote:

Dear R-users;

Previously I posted a question about the problem of rank deficiency in
summary.manova. As somebody suggested, I'm attaching a small part of
the data set.

#***

test <-

structure(.Data = list(structure(.Data = c(rep(1,3),rep(2,18),rep(3,10)),
levels = c("1", "2", "3"),
class = "factor")

,c(0.181829,0.090159,0.115824,0.112804,0.134650,0.249136,0.163144,0.122012,0.157554,0.126283,
0.105344,0.125125,0.126232,0.084317,0.092836,0.108546,0.159165,0.121620,0.142326,0.122770,
0.117480,0.153762,0.156551,0.185058,0.161651,0.182331,0.139531,0.188101,0.103196,0.116877,0.113733)

,c(0.181445,0.090254,0.115840,0.112863,0.134610,0.249003,0.163116,0.122135,0.157206,0.126129,
0.105302,0.124917,0.126243,0.084455,0.092818,0.108458,0.158769,0.121244,0.141981,0.122595,
0.117556,0.153507,0.156308,0.184644,0.161421,0.181999,0.139376,0.187708,0.103126,0.116615,0.113746)

,c(0.181058,0.090426,0.115926,0.113022,0.134632,0.248845,0.163140,0.122331,0.156871,0.126023,
0.105335,0.124757,0.126325,0.084690,0.092885,0.108455,0.158386,0.120913,0.141676,0.122492,
0.117707,0.153293,0.156095,0.184242,0.161214,0.181670,0.139271,0.187318,0.103129,0.116421,0.113826)

,c(0.180692,0.090704,0.116110,0.113319,0.134745,0.248678,0.163256,0.122637,0.156581,0.125998,
0.105479,0.124686,0.126514,0.085066,0.093088,0.108587,0.158040,0.120674,0.141446,0.122488,
0.117972,0.153150,0.155954,0.183885,0.161063,0.181383,0.139251,0.186956,0.103232,0.116351,0.114001)

,c(0.180353,0.091088,0.116392,0.113753,0.134965,0.248520,0.163475,0.123046,0.156354,0.126067,
0.105726,0.124713,0.126821,0.085584,0.093432,0.108858,0.157742,0.120533,0.141309,0.122595,
0.118340,0.153088,0.155897,0.183582,0.160975,0.181143,0.139314,0.186636,0.103449,0.116415,0.114275)
)
,names = c("GROUP", "Y1", "Y2", "Y3", "Y4", "Y5")
,row.names = seq(1:31)
,class = "data.frame"
)

summary(manova(cbind(Y1,Y2,Y3,Y4,Y5)~GROUP, test), test = "Wilks")

#Error in summary.manova(manova(cbind(Y1, Y2, Y3, Y4, Y5) ~ GROUP, test),  :
#  residuals have rank 3 < 5

#***

What I don't understand is why SAS returns no errors using PROC GLM
for the same data set. Is it because PROC GLM doesn't take into account
problems of rank deficiency? So, should I trust manova instead of the PROC
GLM output? I know it can be a touchy question but I would like to
receive some insights.
Thanks
PM

  

What you have here is extremely correlated data:

> (V <- estVar(lm(cbind(Y1,Y2,Y3,Y4,Y5)~GROUP, test)))
            Y1          Y2          Y3          Y4          Y5
Y1 0.001262567 0.001259177 0.001254746 0.001249106 0.001242385
Y2 0.001259177 0.001255814 0.001251416 0.001245812 0.001239132
Y3 0.001254746 0.001251416 0.001247055 0.001241494 0.001234861
Y4 0.001249106 0.001245812 0.001241494 0.001235983 0.001229405
Y5 0.001242385 0.001239132 0.001234861 0.001229405 0.001222889
> eigen(V)
$values
[1] 6.224077e-03 2.313066e-07 3.499837e-10 4.259125e-12 1.334146e-12

$vectors
          [,1]        [,2]       [,3]       [,4]       [,5]
[1,] 0.4503756  0.61213579  0.5204920 -0.3485941  0.1732681
[2,] 0.4491807  0.32333236 -0.1873653  0.5929444 -0.5540795
[3,] 0.4476157  0.01442094 -0.5498688  0.1272921  0.6934503
[4,] 0.4456201 -0.31202109 -0.3198606 -0.6557557 -0.4144143
[5,] 0.4432397 -0.65052351  0.5378809  0.2840428  0.1017918

Notice the more than 9 orders of magnitude between the eigenvalues.
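
The same effect can be reproduced on synthetic data. A minimal sketch, assuming five responses that are all noisy copies of one underlying signal; the data and the noise scale are made up for illustration and are not taken from the data set above:

```r
set.seed(1)
n      <- 31
signal <- rnorm(n)

# Five columns that differ from the shared signal only by tiny noise.
Y <- sapply(1:5, function(k) signal + 1e-6 * rnorm(n))
colnames(Y) <- paste0("Y", 1:5)

V  <- cov(Y)
ev <- eigen(V, symmetric = TRUE)$values

# Orders of magnitude between the largest and smallest eigenvalue.
spread <- log10(max(ev) / min(abs(ev)))

# qr() applies a relative tolerance (1e-7 by default), so the numerical
# rank it reports can be lower than the number of columns.
numerical_rank <- qr(V)$rank
```

As far as I can tell, a tolerance-based rank check of this kind is what lies behind the "residuals have rank" error, so a matrix can be full rank in exact arithmetic and still be rejected.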

I think that what is happening is that what SAS calls MANOVA is actually 
looking at within-row contrasts, which effectively removes the largest 
eigenvalue. In R, the equivalent would be


> anova(lm(cbind(Y1,Y2,Y3,Y4,Y5)~GROUP, test), X=~1, test = "Wilks")
Analysis of Variance Table


Contrasts orthogonal to
~1

            Df Wilks approx F num Df den Df Pr(>F)
(Intercept)  1 0.037  164.873      4     25 <2e-16 ***
GROUP        2 0.701    1.215      8     50 0.3098
Residuals   28
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

or (this could be computationally more precise, but in fact it gives the
same result)


> anova(lm(cbind(Y2,Y3,Y4,Y5)-Y1~GROUP, test), test = "Wilks")


--
  O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
 c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907



Re: [R] Senging commands to the GUI in Windows through a script

2008-08-13 Thread tolga . i . uzuner
Cool, many thanks Henrik. 
Tolga



Henrik Bengtsson [EMAIL PROTECTED] 
Sent by: [EMAIL PROTECTED]
13/08/2008 02:03

To
[EMAIL PROTECTED]
cc
Prof Brian Ripley [EMAIL PROTECTED], r-help@r-project.org
Subject
Re: [R] Senging commands to the GUI in Windows through a script






With AutoIt [http://www.autoitscript.com/] you can setup scripts that
send keyboard and mouse events, wait for windows to open and more.  It
is quite powerful.

/Henrik

On Tue, Aug 12, 2008 at 4:51 AM,  [EMAIL PROTECTED] wrote:
 OK thanks, Tolga



 Prof Brian Ripley [EMAIL PROTECTED]
 12/08/2008 12:46

 To
 [EMAIL PROTECTED]
 cc
 r-help@r-project.org
 Subject
 Re: [R] Senging commands to the GUI in Windows through a script






 On Tue, 12 Aug 2008, [EMAIL PROTECTED] wrote:

 > Dear R Users,
 >
 > How can I send commands to the R GUI from within an R script in Microsoft
 > Windows? I am trying to get the windows within the R GUI to tile after I
 > draw a graph.

 Not directly (it's possible you can via COM, but there is no R function to
 do so).

 > Thanks in advance,
 > Tolga

 --
 Brian D. Ripley,  [EMAIL PROTECTED]
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 University of Oxford, Tel:  +44 1865 272861 (self)
 1 South Parks Road, +44 1865 272866 (PA)
 Oxford OX1 3TG, UKFax:  +44 1865 272595



 Generally, this communication is for informational purposes only
 and it is not intended as an offer or solicitation for the purchase
 or sale of any financial instrument or as an official confirmation
 of any transaction. In the event you are receiving the offering
 materials attached below related to your interest in hedge funds or
 private equity, this communication may be intended as an offer or
 solicitation for the purchase or sale of such fund(s).  All market
 prices, data and other information are not warranted as to
 completeness or accuracy and are subject to change without notice.
 Any comments or statements made herein do not necessarily reflect
 those of JPMorgan Chase & Co., its subsidiaries and affiliates.

 This transmission may contain information that is privileged,
 confidential, legally privileged, and/or exempt from disclosure
 under applicable law. If you are not the intended recipient, you
 are hereby notified that any disclosure, copying, distribution, or
 use of the information contained herein (including any reliance
 thereon) is STRICTLY PROHIBITED. Although this transmission and any
 attachments are believed to be free of any virus or other defect
 that might affect any computer system into which it is received and
 opened, it is the responsibility of the recipient to ensure that it
 is virus free and no responsibility is accepted by JPMorgan Chase &
 Co., its subsidiaries and affiliates, as applicable, for any loss
 or damage arising in any way from its use. If you received this
 transmission in error, please immediately contact the sender and
 destroy the material in its entirety, whether in electronic or hard
 copy format. Thank you.
 Please refer to http://www.jpmorgan.com/pages/disclosures for
 disclosures relating to UK legal entities.
[[alternative HTML version deleted]]






Re: [R] Installing R in Ubuntu

2008-08-13 Thread Paul Hiemstra

Hi,

If you download a package to your hard drive for installation, you need to
use the dpkg command, like this:


1) Download the package (foo.deb)
2) Go to the directory
3) sudo dpkg -i foo.deb

But I would advise against this, because it is better to use repositories
so that R gets updated automatically. The standard Ubuntu repositories have
old versions of R; see http://cran.r-project.org/bin/linux/ubuntu/ for a
description of how to add the CRAN repositories for the latest version
of R. You can also install a lot of R packages from this repository;
doing so also ensures that they are automatically updated.


cheers and hth,

Paul

  



--
Drs. Paul Hiemstra
Department of Physical Geography
Faculty of Geosciences
University of Utrecht
Heidelberglaan 2
P.O. Box 80.115
3508 TC Utrecht
Phone:  +31302535773
Fax:+31302531145
http://intamap.geo.uu.nl/~paul



[R] mob(party) formula question

2008-08-13 Thread Birgitle

I am trying to use mob() with my data.frame ('data.frame': 288 obs. of 81
variables; factors, numerics and ordered factors).
My response is a binary variable, so I should model it with a logistic
regression (family=binomial).

I read in the MOB vignette that I could use a formula like this if I would
like to have only partitioning variables apart from the response:

Test.mob <- mob(Resp~1|Var1+Var2+..., data=dataframe, model=glinearModel,
family=binomial())

but this gives me back an error message:

Error in `[.data.frame`(x, r, vars, drop = drop) :
  undefined columns selected

But Var1, Var2 and Resp are in my dataframe. Why do I get this error?

I am also wondering how I can find out which variables I should use for
partitioning and which for modelling?

There are correlations between some variables in my dataframe. Would it be a
possibility to use always one variable of the correlated variable-pairs for
partitioning and one for modelling?

I would be very happy if somebody could give me some hints or answers to my
questions.

Many thanks in advance.

B.



-
The art of living is more like wrestling than dancing.
(Marcus Aurelius)
-- 
View this message in context: 
http://www.nabble.com/mob%28party%29-formula-question-tp18959898p18959898.html
Sent from the R help mailing list archive at Nabble.com.



[R] Calculating an appropriate error from the NNET package for a continuous target

2008-08-13 Thread b c
I have been using the nnet package and have successfully run neural networks
on both continuous and binary targets. I managed to search the internet and
found out how to capture the error resulting from a binary model with no
problems. My problem now is that I am trying to find out how to calculate an
appropriate error when modelling a continuous target. As a statistician I
immediately think of the RMSE; is it possible to calculate such a statistic
from the model? This would require knowledge of the degrees of freedom in
the model (if there is an equivalent in a neural network!?). Ideally I would
like the proportion of the RMSE to total error. For example, one criterion of
model fit may be an error rate no worse than 10%. It is this sort of statistic
that I am desperate to calculate. Any help greatly appreciated.
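
For reference, the plain RMSE needs only the observed and predicted values, not the degrees of freedom. A minimal sketch with made-up numbers; the helper name rmse is hypothetical, the predictions would in practice come from predict() on the fitted network, and dividing by the range of the target is just one possible way to express the error as a proportion:

```r
# Hypothetical helper: root mean squared error of predictions.
rmse <- function(observed, predicted) {
  sqrt(mean((observed - predicted)^2))
}

observed  <- c(10, 12, 15, 11, 14)   # made-up continuous target
predicted <- c(11, 12, 14, 10, 15)   # made-up model predictions

err <- rmse(observed, predicted)
# One possible relative measure: RMSE as a fraction of the target's range.
rel_err <- err / (max(observed) - min(observed))
```

Computing this on held-out data rather than on the training data gives a more honest figure, since a network can fit its training set very closely.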

[[alternative HTML version deleted]]



Re: [R] aligned memory allocation in C

2008-08-13 Thread Christophe Dutang
Yes, it seems a good idea but your two questions are also good  
questions!


On 13 Aug 2008, at 04:52, Jeffrey Horner wrote:

> Christophe Dutang1 wrote:
>
>> [snip]
>
> You can still do this. Just take the address returned from R_alloc
> and test for alignment. If it's not aligned, then just use an aligned
> address beyond the one returned.
>
> (But then the question is, which direction beyond the one returned?
> How does one test for that?)
>
> Jeff
>
>> PPS: Mac OS report
Thread 0 Crashed:
0   libSystem.B.dylib   0x9341bb9e __kill + 10
1   libSystem.B.dylib   0x93492ec2 raise + 26
2   libSystem.B.dylib   0x934a247f abort + 73
3   randtoolbox.so      0x15e65f1d 0x15e5d000 + 36637
4   randtoolbox.so      0x15e614ef fill_array32 + 4038
5   randtoolbox.so      0x15e6513d SFmersennetwister + 335
6   randtoolbox.so      0x15e652c6 doSFMersenneTwister + 255
7   libR.dylib          0x00367a52 do_dotcall + 1394
8   libR.dylib          0x0038b5a2 Rf_eval + 1754
9   libR.dylib          0x0038f9a2 do_set + 592
10  libR.dylib          0x0038b366 Rf_eval + 1182
11  libR.dylib          0x0038b366 Rf_eval + 1182
12  libR.dylib          0x0038c140 do_begin + 58
13  libR.dylib          0x0038b366 Rf_eval + 1182
14  libR.dylib          0x0038b366 Rf_eval + 1182
15  libR.dylib          0x0038c140 do_begin + 58
16  libR.dylib          0x0038b366 Rf_eval + 1182
17  libR.dylib          0x0038d9a6 Rf_applyClosure + 663
18  libR.dylib          0x0038b25d Rf_eval + 917
19  org.R-project.R     0x000189c3 run_REngineRmainloop + 569 (Rinit.m:442)
20  org.R-project.R     0x0001142a -[REngine runREPL] + 260 (REngine.m:181)
21  org.R-project.R     0x2e91 main + 795 (main.m:126)
22  org.R-project.R     0x2b5a _start + 216
23  org.R-project.R     0x2a81 start + 41



--
http://biostat.mc.vanderbilt.edu/JeffreyHorner




Re: [R] dixon test

2008-08-13 Thread giov

Hi,
Thank you very much for your useful help =). Just a question... I don't know
what the distribution of my data is (normal, t, etc.). So, how can I set
the type parameter? Is there a type value to use in the case of a
distribution-free statistical test?

Thank you so much!


Fernando Marmolejo-Ramos wrote:
 
 Hi giov,
 
 About the Dixon test: I just ran a simple test with a sample of 40 and I
 got:
 
 Error in dixon.test(x) : Sample size must be in range 3-30
 
 So it seems that most of the tests in the outliers package are designed
 for small samples. See also the R News article published in May 2006 (vol
 6/2) called "Processing data for outliers" by Lukasz Komsta (the developer
 of the package).
 
 However there is in that package a function called scores which works
 for big samples. You can also see the p-values and z scores for the
 observations you have and determine which values are considered outliers.
 
 Try this simple syntax:
 
 library(outliers)
 library(gamlss.dist)
 
 # this produces an exponential+Gaussian distribution (which usually has
 # heaps of outliers!)
 x <- rexGAUS(100, 2000, 3000, 5000)
 
 # this confirms that Dixon only works for samples between 3 and 30
 # (the call stops with an error for n = 100)
 dixon.test(x)
 
 # just to see what the data set looks like and visually confirm the
 # outliers
 boxplot(x, notch = TRUE)
 
 # sort the scores in ascending order
 sort(x)
 
 # returns the probability of each score (using z scores) being an outlier,
 # in order
 sort(scores(x, type = "z", prob = 1))
 
 # determines which scores are considered outliers with 95% confidence
 sort(scores(x, prob = 0.95))
 
 The author notes the following about the prob argument:
 
 prob  If set, the corresponding p-values instead of scores are given.
 If value is set to 1, p-value are returned. Otherwise, a logical vector is
 formed, indicating which values are exceeding specified probability. In
 z and mad types, there is also possibility to set this value to zero,
 and then scores are confirmed to (n-1)/sqrt(n) value, according to
 Shiffler (1998). The iqr type does not support probabilities, but lim
 value can be specified. 
 
 The Shiffler reference is not quite as it appears in the help. It
 is this one:
 
 Shiffler, R.E. (1988). Maximum Z scores and outliers. Am. Stat. 42, 1,
 79-80. 
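
As a small illustration of the (n-1)/sqrt(n) bound mentioned in the help text, here is a base-R sketch. It does not use the outliers package, and the sample and the variable names are made up for illustration. It shows that no z score in a sample of size n can exceed (n-1)/sqrt(n) in absolute value, however extreme the observation is:

```r
set.seed(123)
n <- 10
# nine ordinary values plus one absurdly extreme one (illustrative data)
x <- c(rnorm(n - 1), 1e6)

# ordinary z scores, using the sample mean and sample standard deviation
z <- (x - mean(x)) / sd(x)

# Shiffler's upper bound on any |z| in a sample of size n
bound <- (n - 1) / sqrt(n)

print(max(abs(z)))
print(bound)
```

The largest |z| stays just under the bound, which is why fixed z-score cut-offs such as 3 can never be reached in very small samples.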
 
 I hope this helps,
 
 Fernando
 
 

-- 
View this message in context: 
http://www.nabble.com/dixon-test-tp18940260p18960162.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] fPortfolio constraints, maxsumW

2008-08-13 Thread Yohan Chalabi
 JPB == John P. Burkett [EMAIL PROTECTED]
 on Tue, 12 Aug 2008 10:46:28 -0400

   JPB Running R version 2.6.1 under Gentoo Linux and using the fPortfolio 
   JPB package, I am having trouble specifying a sector constraint. One of the 
   JPB constraints to be imposed is that assets 1 and 2 together account for no 
   JPB more than 13.63% of the portfolio.  My attempt at coding that 
   JPB constraint, "maxsumW[1:2Assets]=13.63", fails.  The relevant section of 
   JPB my code file and the resulting error message are pasted below. 
   JPB Suggestions about how to correct my coding would be most welcome.
   JPB 
   JPB *** Code begins here ***
   JPB Data = as.timeSeries(Jdata)
   JPB Spec = portfolioSpec()
   JPB setNFrontierPoints(Spec) = 150
   JPB Spec
   JPB Constraint = c("minW[1:nAssets]=0", "maxsumW[1:2Assets]=13.63")
   JPB frontier = portfolioFrontier(Data, Spec, Constraint)
   JPB *** Error message begins here ***
   JPB Error in parse(text = constraints[i]) :
   JPB   unexpected symbol in "maxsumW[1:2Assets"
   JPB *** Error message ends here ***
   JPB 
   JPB -John



Hi John,

you should use 0.1363 instead of 13.63...

hope this helps,
yohan
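
The error message quoted above comes from parse(text = ...), so the constraint string has to be valid R syntax as well as being on the right scale. Here is a minimal base-R sketch of that check. It does not call fPortfolio, the helper name parses_ok is made up for illustration, and the corrected string assumes the intended index range was 1:2:

```r
# TRUE if the string can be parsed as R code, FALSE otherwise
# (parses_ok is a hypothetical helper, not part of any package)
parses_ok <- function(s) {
  !inherits(try(parse(text = s), silent = TRUE), "try-error")
}

# "2Assets" is not a valid R name, so the parser stops here
print(parses_ok("maxsumW[1:2Assets]=13.63"))

# a plain index range with the weight given as a fraction parses cleanly
print(parses_ok("maxsumW[1:2]=0.1363"))
```

Only the second string parses, which matches the "unexpected symbol" message in the original post.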

-- 
PhD student
Swiss Federal Institute of Technology
Zurich

www.ethz.ch



[R] problems with packages tseries and robustbase

2008-08-13 Thread tolga . i . uzuner
Dear R Users,

Is there a known problem with downloading packages robustbase and tseries 
from the UK CRAN website ?

Thanks in advance,
Tolga

=
R version 2.7.1 (2008-06-23)
Copyright (C) 2008 The R Foundation for Statistical Computing
ISBN 3-900051-07-0

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> utils:::menuInstallPkgs()
--- Please select a CRAN mirror for use in this session ---
trying URL 'http://cran.uk.r-project.org/bin/windows/contrib/2.7/robustbase_0.2-8.zip'
Error in download.file(url, destfile, method, mode = "wb", ...) : 
  cannot open URL 'http://cran.uk.r-project.org/bin/windows/contrib/2.7/robustbase_0.2-8.zip'
In addition: Warning message:
In download.file(url, destfile, method, mode = "wb", ...) :
  cannot open: HTTP status was '404 Not Found'
Warning in download.packages(p0, destdir = tmpd, available = available,  :
  download of package 'robustbase' failed
> utils:::menuInstallPkgs()
trying URL 'http://cran.uk.r-project.org/bin/windows/contrib/2.7/tseries_0.10-15.zip'
Error in download.file(url, destfile, method, mode = "wb", ...) : 
  cannot open URL 'http://cran.uk.r-project.org/bin/windows/contrib/2.7/tseries_0.10-15.zip'
In addition: Warning message:
In download.file(url, destfile, method, mode = "wb", ...) :
  cannot open: HTTP status was '404 Not Found'
Warning in download.packages(p0, destdir = tmpd, available = available,  :
  download of package 'tseries' failed
=




[R] Help me: nls and try function

2008-08-13 Thread jarod_v6
Dear All,
I have these problems:

1) How can I use the function try with an nls model:

try(nls(...))

2) I have 100 columns of data and I want to prepare 99 files, each pairing
the first column with one of the others

Time A1 A2 A3 A4 AN.

I want to have 99 files with 
a) Time and A1
b) Time and A2
n) Time and AN

thanks for any help
M



Re: [R] problems with *downloading* packages tseries and robustbase

2008-08-13 Thread Prof Brian Ripley
It works from my home ISP (Virgin Media) and from .ox.ac.uk, so I think 
the problem is local to you, perhaps your DNS server.  Ask your IT support 
for erm ... support.

Please do try to use an accurate subject line (see the posting guide).

(Why don't people just try a different mirror or some IP/TCP debugging 
tools instead of asking here about problems with mirrors?  At most a 
handful of readers can do anything about this.)


On Wed, 13 Aug 2008, [EMAIL PROTECTED] wrote:


Dear R Users,

Is there a known problem with downloading packages robustbase and tseries
from the UK CRAN website ?

Thanks in advance,
Tolga

=
R version 2.7.1 (2008-06-23)
Copyright (C) 2008 The R Foundation for Statistical Computing
ISBN 3-900051-07-0

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

 Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.


> utils:::menuInstallPkgs()
--- Please select a CRAN mirror for use in this session ---
trying URL 'http://cran.uk.r-project.org/bin/windows/contrib/2.7/robustbase_0.2-8.zip'
Error in download.file(url, destfile, method, mode = "wb", ...) :
  cannot open URL 'http://cran.uk.r-project.org/bin/windows/contrib/2.7/robustbase_0.2-8.zip'
In addition: Warning message:
In download.file(url, destfile, method, mode = "wb", ...) :
  cannot open: HTTP status was '404 Not Found'
Warning in download.packages(p0, destdir = tmpd, available = available,  :
  download of package 'robustbase' failed
> utils:::menuInstallPkgs()
trying URL 'http://cran.uk.r-project.org/bin/windows/contrib/2.7/tseries_0.10-15.zip'
Error in download.file(url, destfile, method, mode = "wb", ...) :
  cannot open URL 'http://cran.uk.r-project.org/bin/windows/contrib/2.7/tseries_0.10-15.zip'
In addition: Warning message:
In download.file(url, destfile, method, mode = "wb", ...) :
  cannot open: HTTP status was '404 Not Found'
Warning in download.packages(p0, destdir = tmpd, available = available,  :
  download of package 'tseries' failed


--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] problems with *downloading* packages tseries and robustbase

2008-08-13 Thread tolga . i . uzuner
Hi,

It may be a problem with the mirror and my location. However, I should 
have added that the following packages had no problems installing from 
that mirror and location, which is why I suspected it was something more.

RBloomberg
strucchange
car
lmtest
nlme
corrgram
RODBC
MSBVAR
xtable
vars
tseries
PerformanceAnalytics
fArma

In addition to tseries and robustbase, I am also having problems with 
numDeriv.  Strange.

Nothing dire, as one can just download the zip files locally and install, 
which is what I did. 

I thought I would bring it to the list's attention in case it is something 
package specific, and if so, the fix would benefit others. If there is 
another list to which one should report suspected package/mirror issues, 
please let me know and I can use that in the future.

The commands used are below.

Regards,
Tolga

install.packages("RDCOMClient", repos = "http://www.omegahat.org/R")
install.packages("RBloomberg", repos = "http://cran.uk.r-project.org")
install.packages("strucchange", repos = "http://cran.uk.r-project.org")
install.packages("car", repos = "http://cran.uk.r-project.org")
install.packages("lmtest", repos = "http://cran.uk.r-project.org")
install.packages("nlme", repos = "http://cran.uk.r-project.org")
install.packages("corrgram", repos = "http://cran.uk.r-project.org")
install.packages("RODBC", repos = "http://cran.uk.r-project.org")
install.packages("MSBVAR", repos = "http://cran.uk.r-project.org")
install.packages("xtable", repos = "http://cran.uk.r-project.org")
install.packages("vars", repos = "http://cran.uk.r-project.org")
install.packages("tseries", repos = "http://cran.uk.r-project.org")
install.packages("PerformanceAnalytics", repos = "http://cran.uk.r-project.org")
install.packages("fArma", repos = "http://cran.uk.r-project.org")
install.packages("numDeriv", repos = "http://cran.uk.r-project.org")
install.packages("nortest", repos = "http://cran.uk.r-project.org")
install.packages("chron", repos = "http://cran.uk.r-project.org")






Re: [R] Help me: nls and try function

2008-08-13 Thread jim holtman
For the first question, you have provided the answer --
try(nls(...)).  Was there something else you wanted?
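
In case the question was how to carry on after a failed fit, here is a minimal sketch using simulated data. The model, the data and the starting values are all made up for illustration. Each nls() call is wrapped in try(), and inherits(fit, "try-error") is then used to see which fits failed:

```r
set.seed(1)
# simulated exponential growth data (illustrative only)
dat <- data.frame(x = seq(0, 5, length.out = 30))
dat$y <- 2 * exp(0.5 * dat$x) + rnorm(30, sd = 0.05)

# one sensible set of starting values and one that makes the model overflow
starts <- list(good = list(a = 1, b = 0.3),
               bad  = list(a = 1, b = 1000))

# try() returns the fitted model on success and a "try-error" object on failure,
# so the loop keeps going instead of stopping at the first error
fits <- lapply(starts, function(s)
  try(nls(y ~ a * exp(b * x), data = dat, start = s), silent = TRUE))

failed <- sapply(fits, inherits, what = "try-error")
print(failed)

# use only the fits that worked
print(coef(fits$good))
```

The same pattern works inside a for loop over many data columns: test each result with inherits() and skip the ones that failed.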

For part 2, this should work:

for (i in names(myData)[-1]) {  # skip first column with Time
    write.table(myData[, c("Time", i)], file = i)
}
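
A self-contained version of the same loop, as a sketch: the data frame, the output directory under tempdir() and the ".txt" file names are assumptions made for illustration, not something from the original question.

```r
# small stand-in for the real data: Time plus three data columns
myData <- data.frame(Time = 1:5, A1 = rnorm(5), A2 = rnorm(5), A3 = rnorm(5))

# write into a scratch directory so nothing lands in the working directory
out_dir <- file.path(tempdir(), "per_column")
dir.create(out_dir, showWarnings = FALSE)

for (i in names(myData)[-1]) {  # skip first column with Time
  write.table(myData[, c("Time", i)],
              file = file.path(out_dir, paste0(i, ".txt")),
              row.names = FALSE, quote = FALSE)
}

print(list.files(out_dir))
```

Each file then holds two columns, Time and one of the A columns, and can be read back with read.table(..., header = TRUE).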


 1) How can use the function try in nls model:

 try(nls(...))

 2) I have 100 colun with data and I want ro prepare 99 file with the first 
 colun with the others

 Time A1 A2 A3 A4 AN.

 I want to have 99 files with
 a)Time and A1
 b)Time and A2
 n) Time AN

 thanks for any help
 M





-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] Location of HTML help files [on Firefox 3]

2008-08-13 Thread Prof Brian Ripley
I've added a couple of workarounds to this in R-patched and hence the 
upcoming R 2.7.2.


1) There is a new menu item on Rgui to go directly to SearchEngine.html.

2) help.start() has a new argument searchEngine=TRUE to do the same.

R 2.7.2 is 12 days away, so it would be appreciated if Firefox 3 users 
would do some testing (especially on platforms other than Windows, which 
is all I have tested with Firefox 3).  Hopefully these changes will appear 
in R-patched in tonight's tarball and Windows binary build (from the CRAN 
master).



On Thu, 31 Jul 2008, Keith Ponting wrote:


On Wed, Jul 16, 2008 at 7:27 PM, Jan Smit smit1 at un.org wrote:

I am using R 2.7.1 under Windows XP, installed at C:/R/R-2.7.1.

The location of the HTML SearchEngine is
file:///C:/R/R-2.7.1/doc/html/search/SearchEngine.html. Now, when I type a
phrase, say reshape, in the search text field, the Search Results page
suggests that the location of the reshape HTML help file is
file:///C:/R/library/stats/html/reshape.html, while in reality it is
file:///C:/R/R-2.7.1/library/stats/html/reshape.html.

Is there an easy way in which I can fix this?


I too had this problem with Firefox 3.0 and 3.0.1 (but not with 3.0 RC1
by the way).
A work-around which works for me is to go directly to the search engine
page (enter URI
file:///C:/Program%20Files%20(x86)/R/R-2.7.1/doc/html/search/SearchEngine.html
on my Windows Vista installation), rather than going there by
following the link on the R documentation page
(file:///C:/Program%20Files%20(x86)/R/R-2.7.1/doc/html/index.html)

Keith Ponting
Aurix Ltd, Malvern WR14 3SZ  UK




--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] dixon test

2008-08-13 Thread S Ellison


 giov [EMAIL PROTECTED] 13/08/2008 10:59:32 

 just a question...I don't know
what is the distribution of my data (normal, T, etc...). So, how can I
set the type parameter? 

You must assume an underlying distribution or you can't do an outlier
test.

Outliers are just unusually extreme data points. They can only be
considered 'unusual' if there is some basis - a distribution assumption
- for deciding what is 'usual'.  The assumed underlying distribution
describes what is expected to be 'usual'. 

With no distribution assumption, there is no basis for considering any
data point unusual, so the idea of an outlier really has no meaning. 
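
A small numerical sketch of this point, using only base R. The value 4 and the choice of a t distribution with 2 degrees of freedom are arbitrary illustrations, not part of the discussion above. The same observation is wildly improbable under one assumed distribution and unremarkable under another:

```r
# an observation 4 units from the centre, on a standardised scale
obs <- 4

# two-sided tail probability if the data are assumed normal
p_normal <- 2 * pnorm(-abs(obs))

# the same tail probability under a heavy-tailed t distribution (2 df)
p_t2 <- 2 * pt(-abs(obs), df = 2)

print(c(normal = p_normal, t2 = p_t2))
```

Under the normal assumption the point looks like an outlier; under the heavy-tailed assumption it is the kind of value one expects to see now and then.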

Steve E







Re: [R] Threshold vector error correction models

2008-08-13 Thread Matthieu Stigler

Hello

I worked on threshold cointegration for my master's thesis and wrote R code 
for it. This code will be published in a future release of the tsDyn package, 
when I have time to finish it (there is a difference between using 
your own code and making it available on R... I didn't realize there is 
so much more work). You can download the current code at 
http://code.google.com/p/tsdyn/source/checkout and compile it yourself 
(I hope you are on Linux, otherwise it will take some time).


The code currently available includes:

In good shape:
- OLS grid-search estimator for TVECM and TVAR, with methods (notably a 
nice LaTeX export)


Not yet wrapped in proper methods:
- Hansen and Seo test of linear against threshold cointegration, and their 
estimator (MLE)
- Seo test: no cointegration against threshold cointegration
- Hansen linearity test for TAR, and its extension to the multivariate case 
by Lo and Zivot

- simple Wald test for the coefficients
- functions to simulate and bootstrap TAR and TVAR

So have a look at the (provisional) documentation pages, explore the code, and 
let me know if you need more information (I can also answer in German).
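
To give an idea of what an OLS grid search over the threshold means, here is a minimal base-R sketch for a univariate two-regime TAR(1). This is not the tsDyn code; the simulated series, the 15% trimming and all names are assumptions made for illustration.

```r
set.seed(42)
n <- 500
y <- numeric(n)
# simulate a two-regime AR(1) with true threshold 0 (illustrative data)
for (t in 2:n) {
  phi <- if (y[t - 1] <= 0) 0.8 else -0.5
  y[t] <- phi * y[t - 1] + rnorm(1)
}
lagy <- y[-n]   # y[t-1]
resp <- y[-1]   # y[t]

# sum of squared residuals when a separate OLS line is fitted in each regime
ssr_at <- function(th) {
  low <- lagy <= th
  sum(resid(lm(resp[low] ~ lagy[low]))^2) +
    sum(resid(lm(resp[!low] ~ lagy[!low]))^2)
}

# candidate thresholds: observed values, trimming 15% at each end
cand <- sort(lagy)
cand <- cand[round(0.15 * length(cand)):round(0.85 * length(cand))]

ssr <- sapply(cand, ssr_at)
th_hat <- cand[which.min(ssr)]          # grid-search estimate of the threshold
ssr_linear <- sum(resid(lm(resp ~ lagy))^2)

print(th_hat)
print(c(threshold = min(ssr), linear = ssr_linear))
```

The threshold model can never fit worse than the single linear AR(1), since the linear model is the special case with identical coefficients in both regimes. Testing whether the improvement is significant is what the linearity tests listed above are for.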


Matthieu





Message: 7
Date: Tue, 12 Aug 2008 11:08:27 + (GMT)
From: Werner Wernersen [EMAIL PROTECTED]
Subject: [R] Threshold vector error correction models
To: [EMAIL PROTECTED]
Message-ID: [EMAIL PROTECTED]
Content-Type: text/plain; charset=iso-8859-1

Hi,

is anyone aware of estimation functions for threshold vector error correction / 
threshold cointegration models?
I didn't find anything for R using RSeek or Google.

Thanks a lot for any pointers,
  Werner




[R] rgl/compiz problem

2008-08-13 Thread Barry Rowlingson
I have just encountered the problem with rgl where plot3d figures
don't interact with the mouse. My plots zoom in and out with the mouse
wheel but the mouse buttons do nothing. I can't rotate the plot.

This has been mentioned and discussed here and in other lists before,
and the solution is to turn off Ubuntu's fancy graphics.  Back in
March, Ben Bolker said:


unfortunately rgl and compiz/etc. both try to use
the same OpenGL interface, so you can't use both at
the same time.


This has echoes of when TCP/IP was in its infancy back in the days of
DOS, and only one program could access the network interface at a time
(until TCP/IP software got its act together). Is OpenGL really in the
same position now? Or is Compiz being greedy in some sense? Surely
two OpenGL applications can run at the same time? Or is it because rgl
is running 'within' another OpenGL window already, so there's some
nesting problem going on?

 Google Earth works fine, and I think that uses OpenGL. Anyone had any
ideas since March?

I'm on Ubuntu 8.04 and R 2.7.1

Barry



Re: [R] mob(party) formula question (example)

2008-08-13 Thread Birgitle

Here is an example that produces the same error:

Read in the following as textfile (save as DFExample.txt):

1 2 3 4 7 8 9 10 12 13 14 15 16 17 18 19 21 22 23 25 27 28 29 30 31 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80
AX 1 1 0 0 1 0 0 1 0 0 0 0 0 0 0 1 0 1 0 0 1 1 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 1 0 0 1 0 1 25 5 9 1 8.5 2.5 3 5 2 2 3 3 1 1 1 2 1 2
BX 1 1 0 0 1 0 0 1 NA NA NA 0 0 0 0 1 0 0 1 0 0 1 0 NA NA NA NA NA NA NA NA 0 0 0 1 0 NA NA NA NA NA NA NA 0 0 0 0 1 1 0 0 0 1 1 NA NA 6 1 3.252.255 5 2 2 3 3 1 1 1 1 1 1
CX 1 1 0 0 1 0 0 1 1 0 0 0 1 0 1 0 0 1 0 1 0 0 0 0 1 1 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 15 3.5 6 1 5.5 5.5 5 5 2 2 1 2 1 1 1 1 2 2
DX 1 1 0 0 1 0 0 1 0 0 0 0 0 0 0 1 0 0 1 0 0 1 0 0 1 0 1 0 0 0 1 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 1 0 1 0 1 0 1 0 50 17.57.5 2.5 8.5 5 5 5 2 2 2 3 1 1 1 1 3 3
EX 1 0 1 0 1 0 0 1 NA NA NA 0 0 0 1 1 0 1 1 0 1 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 1 0 0 0 1 0 NA NA 14.530 13 2.5 3 3 1 1 4 4 1 1 1 1 1 1
FX 1 0 1 0 1 0 0 1 0 0 0 0 0 0 0 1 0 0 1 0 1 0 0 0 1 1 1 0 0 1 1 0 0 0 1 1 1 0 0 0 1 0 0 0 0 0 1 0 1 0 1 1 0 0 165 25 11.5 15 12 6.5 5 5 1 1 3 3 1 1 1 1 4 5
GX 1 0 1 0 1 0 0 1 0 0 1 0 0 0 0 1 1 1 0 0 1 0 0 0 0 0 1 0 0 1 1 0 1 0 0 1 0 0 0 0 1 0 0 0 1 0 0 0 0 0 1 0 1 0 40 20 14.5 9.5 11 10 3 3 1 1 1 3 1 1 3 4 1 3
HX 1 1 0 0 1 0 0 1 0 0 0 0 0 0 0 1 0 1 0 0 1 0 0 NA NA NA NA NA NA NA NA NA NA NA NA NA 1

Re: [R] rgl/compiz problem

2008-08-13 Thread Simon Blomberg
Two days ago I installed compiz on my Debian laptop. It plays fine with the 
OpenGL games that I also have on that computer. (My son plays the games in the 
brief interludes between my intense R hacking sessions. I, of course, have no 
time for such frivolity. The production cycle is sleep - eat - R - eat - R 
- eat - sleep.) I can rotate and zoom using the touchpad, when both rgl and 
compiz are running.

Not sure if any of this is of help,

Simon.

Simon Blomberg, BSc (Hons), PhD, MAppStat. 
Lecturer and Consultant Statistician 
Faculty of Biological and Chemical Sciences 
The University of Queensland 
St. Lucia Queensland 4072 
Australia 
T: +61 7 3365 2506 
email: S.Blomberg1_at_uq.edu.au
http://www.uq.edu.au/~uqsblomb/

Policies:
1.  I will NOT analyse your data for you.
2.  Your deadline is your problem.

The combination of some data and an aching desire for 
an answer does not ensure that a reasonable answer can 
be extracted from a given body of data. - John Tukey.






Re: [R] mob(party) formula question

2008-08-13 Thread Achim Zeileis

On Wed, 13 Aug 2008, Birgitle wrote:


I am trying to use mob() with my data.frame ('data.frame': 288 obs. of 81
variables; factors, numerics and ordered factors).
My response is a binary variable, so I should model it with a logistic
regression (family=binomial).

I read in the MOB vignette that I could use a formula like this if I would
like to have only partitioning variables apart from the response.

Test.mob <- mob(Resp~1|Var1+Var2+, data=dataframe, model=glinearModel,
family=binomial())


This works for me. Considering an example that is easily reproducible: 
classifying just two (out of three) species in the iris data.


iris2 <- iris[-(1:50),]
iris2$Species <- factor(iris2$Species)
mb <- mob(Species ~ 1 | Petal.Length + Petal.Width + Sepal.Length +
   Sepal.Width, data = iris2, model = glinearModel, family = binomial())

and this runs fine, just selecting a single split

R> mb
1) Petal.Width <= 1.7; criterion = 1, statistic = 81.818
   2)*  weights = 54
Terminal node model
Binomial GLM with coefficients:
(Intercept)
  -2.282

1) Petal.Width > 1.7
   3)*  weights = 46
Terminal node model
Binomial GLM with coefficients:
(Intercept)
   3.807
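
The intercepts printed in the terminal nodes are on the logit scale. As a base-R sketch that does not need the party package (the subset below simply mimics the first terminal node, and the object names are made up for illustration), an intercept-only binomial GLM on the observations with Petal.Width <= 1.7 gives the same coefficient, and applying the inverse link turns it into the fitted class proportion:

```r
iris2 <- iris[-(1:50), ]
iris2$Species <- factor(iris2$Species)

# observations that fall into the left terminal node of the tree above
left <- subset(iris2, Petal.Width <= 1.7)

# intercept-only logistic regression, as fitted in each terminal node
fm <- glm(Species ~ 1, data = left, family = binomial())

# inverse link: logit scale -> probability of the second level (virginica)
p <- fm$family$linkinv(coef(fm))

print(coef(fm))
print(p)
print(mean(left$Species == "virginica"))
```

The fitted probability equals the observed share of virginica in that node, which is the quantity a bar plot per terminal node would display.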


but this gives me back an error-message:

Error in `[.data.frame`(x, r, vars, drop = drop) :
 undefined columns selected

But Var1, Var2 and Resp are in my dataframe. Why do I get this error?


More importantly, when do you get this error? My guess is that this is 
during plotting, right?


If so, then the problem is that the plot() method for mob object by 
default calls node_bivplot() in each terminal node which is designed for 
generating partial regressor plots. In this situation this does not make 
sense because you don't have regressors in the terminal nodes.


We haven't got a panel function for the type of model you are looking at 
but I've just hacked a simple one that should be sufficient for your 
purposes. It is essentially like node_barplot() but exploits the binomial 
model. It is attached below. With this you can do

   plot(mb, terminal_panel = myplot, tnex = 2)


I am also wondering how I can find out which variables I should use for
partitioning and which for modelling?


For the variables for which a linear specification makes sense (at least 
in each component) then you should include them for modeling. And those 
variables for which it is not clear a priori what a useful parametric 
specification would be should be used as partitioning variables.



There are correlations between some variables in my dataframe. Would it be a
possibility to use always one variable of the correlated variable-pairs for
partitioning and one for modelling?


You can do that, but you could also do other combinations. That probably 
depends on your application.


hth,
Z

myplot <- function(ctreeobj,
                   col = "black",
                   fill = NULL,
                   beside = NULL,
                   ymax = NULL,
                   ylines = NULL,
                   widths = 1,
                   gap = NULL,
                   reverse = NULL,
                   id = TRUE)
{
  getMaxPred <- function(x) {
    mp <- max(x$prediction)
    mpl <- ifelse(x$terminal, 0, getMaxPred(x$left))
    mpr <- ifelse(x$terminal, 0, getMaxPred(x$right))
    return(max(c(mp, mpl, mpr)))
  }

  y <- response(ctreeobj)[[1]]

  if(is.factor(y) || class(y) == "was_ordered") {
    ylevels <- levels(y)
    if(is.null(beside)) beside <- if(length(ylevels) < 3) FALSE else TRUE
    if(is.null(ymax)) ymax <- if(beside) 1.1 else 1
    if(is.null(gap)) gap <- if(beside) 0.1 else 0
  } else {
    if(is.null(beside)) beside <- FALSE
    if(is.null(ymax)) ymax <- getMaxPred([EMAIL PROTECTED]) * 1.1
    ylevels <- seq(along = [EMAIL PROTECTED])
    if(length(ylevels) < 2) ylevels <- ""
    if(is.null(gap)) gap <- 1
  }
  if(is.null(reverse)) reverse <- !beside
  if(is.null(fill)) fill <- gray.colors(length(ylevels))
  if(is.null(ylines)) ylines <- if(beside) c(3, 2) else c(1.5, 2.5)

  ### panel function for barplots in nodes
  rval <- function(node) {

    ## parameter setup
    fm <- node$model
    pred <- fm$family$linkinv(coef(fm))
    if(reverse) {
      pred <- rev(pred)
      ylevels <- rev(ylevels)
    }
    np <- length(pred)
    nc <- if(beside) np else 1

    fill <- rep(fill, length.out = np)
    widths <- rep(widths, length.out = nc)
    col <- rep(col, length.out = nc)
    ylines <- rep(ylines, length.out = 2)

    gap <- gap * sum(widths)
    yscale <- c(0, ymax)
    xscale <- c(0, sum(widths) + (nc+1)*gap)

    top_vp <- viewport(layout = grid.layout(nrow = 2, ncol = 3,
                       widths = unit(c(ylines[1], 1, ylines[2]),
                                     c("lines", "null", "lines")),
                       heights = unit(c(1, 1), c("lines", "null"))),
                       width = unit(1, "npc"),
  

Re: [R] aligned memory allocation in C

2008-08-13 Thread Luke Tierney

On Wed, 13 Aug 2008, Christophe Dutang1 wrote:


Hi,

I'm currently porting the SF Mersenne Twister algorithm of Matsumoto and Saito 
to R. To get the full power of their code, I want to use their function 
fill_array32, which needs aligned memory. That is to say, I need to use the C 
function memalign on Windows, posix_memalign on Linux and classic malloc on 
Mac OS. 'Writing R Extensions' recommends using the R_alloc function to 
allocate memory in C.


Does R_alloc return a pointer to aligned memory?
If not, how can I do this?
Probably not, because R crashes when I successively call R_alloc and 
fill_array32 (cf. below) on my MacBook with R 2.7.1.


R_alloc's alignment will be appropriate for holding any data type. It
will be offset from a value returned by malloc by a multiple of 8
bytes.

My recollection, which may be wrong, is that on both Intel and PPC
unaligned access to all basic data types is permitted but may be
inefficient (in particular on Intel), so the reason for your crash is
probably elsewhere.

Best,

luke




Thanks in advance

Kind regards

Christophe


PS : http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/SFMT/howto-compile.html 
provides an example of memalign.


PPS : mac os report

Thread 0 Crashed:
0   libSystem.B.dylib   0x9341bb9e __kill + 10
1   libSystem.B.dylib   0x93492ec2 raise + 26
2   libSystem.B.dylib   0x934a247f abort + 73
3   randtoolbox.so  0x15e65f1d 0x15e5d000 + 36637
4   randtoolbox.so  0x15e614ef fill_array32 + 4038
5   randtoolbox.so  0x15e6513d SFmersennetwister + 335
6   randtoolbox.so  0x15e652c6 doSFMersenneTwister + 255
7   libR.dylib  0x00367a52 do_dotcall + 1394
8   libR.dylib  0x0038b5a2 Rf_eval + 1754
9   libR.dylib  0x0038f9a2 do_set + 592
10  libR.dylib  0x0038b366 Rf_eval + 1182
11  libR.dylib  0x0038b366 Rf_eval + 1182
12  libR.dylib  0x0038c140 do_begin + 58
13  libR.dylib  0x0038b366 Rf_eval + 1182
14  libR.dylib  0x0038b366 Rf_eval + 1182
15  libR.dylib  0x0038c140 do_begin + 58
16  libR.dylib  0x0038b366 Rf_eval + 1182
17  libR.dylib  0x0038d9a6 Rf_applyClosure + 663
18  libR.dylib  0x0038b25d Rf_eval + 917
19  org.R-project.R     0x000189c3 run_REngineRmainloop + 569 (Rinit.m:442)
20  org.R-project.R     0x0001142a -[REngine runREPL] + 260 (REngine.m:181)
21  org.R-project.R 0x2e91 main + 795 (main.m:126)
22  org.R-project.R 0x2b5a _start + 216
23  org.R-project.R 0x2a81 start + 41



--
Luke Tierney
Chair, Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa  Phone: 319-335-3386
Department of Statistics andFax:   319-335-3017
   Actuarial Science
241 Schaeffer Hall  email:  [EMAIL PROTECTED]
Iowa City, IA 52242 WWW:  http://www.stat.uiowa.edu



[R] R freezes on Xeon multiprocessor and Win 64-bit

2008-08-13 Thread Valerio Orioli

Hi everybody,

When I perform a stepAIC on a glm.nb object, using a database of more than 
10,000 records and about 50 independent variables, on a 64-bit 
workstation with two Intel Xeon 3.20 GHz processors (with the 
HyperThreading option disabled in the BIOS), using 4 of the 7 GB of 
available RAM, under Windows XP Professional 64-bit, the system always 
freezes and it is no longer possible to use the keyboard or mouse. In contrast, 
the same analysis always works on older 32-bit computers 
(Pentium IV with 1 GB RAM) with Windows XP Professional 32-bit, although 
it takes about 10 hours.

Does anyone have a suggestion for solving this problem?

Thank you in advance,

Valerio Orioli


Valerio Orioli
Biodiversity Conservation Unit - 
http://www.disat.unimib.it/Biodiversity/index.htm

Department of Environmental and Landscape Sciences
University of Milano-Bicocca
Piazza della Scienza 1
I-20126 - Milano
ITALY

Phone: +39.02.6448.2918
E-mail: [EMAIL PROTECTED]



[R] merging data sets to match data to date

2008-08-13 Thread rcoder

Hi everyone,

I want to extract data from a data set according to dates specified in a
vector. I have created a blank matrix with row names (dates) that I want to
extract from the full data set. I have then performed a merge to try to output
rows corresponding to common dates to a results matrix, but the operation
did not fill the results matrix. Could anyone offer any advice to assist
with this operation?

Thanks,

rcoder


[R] Combination of two barcharts and one xyplot

2008-08-13 Thread ravi
Hi Rhelpers,
I would like to have some help with a plot which is beyond my capabilities. 
This plot that I am seeking involves an overlay of two different barcharts and 
one xyplot.
The code that I have used is the following :
#save(df1,file="M:\\KBR\\df1.RData")
load(file="M:\\KBR\\df1.RData")
# df1$Year.ord created to obtain the right order i.e. 2015M < 2015K
df1$Year.ord<-ordered(df1$Year,levels=c('2003','2005','2007','2009','2011','2013','2015M','2015K'))
# Use reshape package to melt the data frame
library(reshape)
df1m<-melt(df1,id=c("Year","Year.ord"))
library(lattice)
attach(df1m)
barchart(value~Year.ord|variable,scales=list(y="free",x=list(rot=90)),xlab="Year",
 ylab="No. of Tests *1000",col="blue")
This plot works just fine. But I want to go beyond this. My first data frame 
(df1) is :
Year,KI,G48,AvCell,HB,Htens,Impact,Struct,Tens,Year.ord
1,2003,15.53,0.3,0.24,37.45,0.76,1.16,3.02,34.05,2003
2,2005,15.64,0.29,0.33,34.64,1.12,1.78,4.2,32.88,2005
3,2007,16.18,0.49,0.59,30.32,1.63,4.23,6.67,30.06,2007
4,2009,17.09,0.67,0.91,29.47,2.27,6.76,9.68,29.25,2009
5,2011,22.39,0.93,1.24,38.03,3.11,9.17,13.18,37.84,2011
6,2013,33.83,1.29,1.87,58.37,4.43,14.06,19.41,57.6,2013
7,2015M,44.91,1.83,2.71,75.54,6.28,20.57,27.51,74.5,2015M
8,2015K,52.22,2.14,3.15,87.71,7.34,23.88,31.98,86.57,2015K
My second data frame (L1) is :
Year,KIL,G48L,AvCellL,HBL,HtensL,ImpactL,StructL,TensL
1,2009,20,1,1,30,2,10,10,35
2,2011,24,1,1.5,35,3,12,13,38
3,2013,30,1,2,40,4,14,16,45
What I want, in each panel of the lattice barchart, is to plot histograms of 
the relevant variable (KI, G48 etc) in one colour for the years 2003 to 2007, 
and in another colour for the other years. On top of this, I want to have a 
line plot in each panel with the limits for different years given in the second 
data frame L1 (as bold lines). 
I would like to have information on the following points :
1. How can I get a combination of these plots in every panel (two histograms 
and one line plot)? Is it possible?
2. Is it easier to do this with ggplot?
3. I would like to know how I can present the legend also.
Will appreciate any help that I can get.

Thanking You,
Ravi




Re: [R] merging data sets to match data to date

2008-08-13 Thread Erik Iverson

rcoder wrote:

Hi everyone,

I want to extract data from a data set according to dates specified in a
vector. I have created a blank matrix with row names (dates) that I want to
extract from the full data set. I have then performed a merge to try to output
rows corresponding to common dates to a results matrix, but the operation
did not fill the results matrix. Could anyone offer any advice to assist
with this operation?


Yes, follow the posting guide and provide commented, minimal, 
self-contained, reproducible code of your problem.




Re: [R] merging data sets to match data to date

2008-08-13 Thread Henrique Dallazuanna
Try this:

x <- data.frame(Dates = seq(as.Date('2008-01-01'),
  as.Date('2008-01-31'), by = 'days'),
   Values = sample(31))

subset(x, Dates %in% as.Date(c('2008-01-05', '2008-01-20')))
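
Since the original objects were not posted, the following is only a sketch of how the merge route could look. The names full and wanted are made up for illustration, and the key assumption is that both date columns are of class Date (a merge on row names stored as character against Date values would be one plausible reason for an empty result).

```r
# Hypothetical full data set: one value per day in January 2008
full <- data.frame(Dates = seq(as.Date("2008-01-01"), as.Date("2008-01-31"),
                               by = "days"),
                   Values = 1:31)

# Hypothetical dates to extract; the last one is not in 'full'
wanted <- data.frame(Dates = as.Date(c("2008-01-05", "2008-01-20",
                                       "2008-02-15")))

# Inner join: only dates present in both data frames are returned
common <- merge(wanted, full, by = "Dates")

# Left join: every wanted date is kept, unmatched ones get NA
all_wanted <- merge(wanted, full, by = "Dates", all.x = TRUE)
```

With all.x = TRUE the result has one row per requested date, which is closer to the "pre-built results matrix" idea in the question.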


-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O



Re: [R] merging data sets to match data to date

2008-08-13 Thread stephen sefick
Have a look at zoo and merge.zoo and read their help files. Use chron to
generate a series of dates corresponding to the ones you want, and then
merge away.
This should get you started.

Stephen Sefick



-- 
Let's not spend our time and resources thinking about things that are
so little or so large that all they really do for us is puff us up and
make us feel like gods. We are mammals, and have not exhausted the
annoying little problems of being mammals.

-K. Mullis



Re: [R] Combination of two barcharts and one xyplot

2008-08-13 Thread stephen sefick
not reproducible



-- 
Let's not spend our time and resources thinking about things that are
so little or so large that all they really do for us is puff us up and
make us feel like gods. We are mammals, and have not exhausted the
annoying little problems of being mammals.

-K. Mullis



Re: [R] rgl/compiz problem

2008-08-13 Thread Ben Bolker
Barry Rowlingson b.rowlingson at lancaster.ac.uk writes:

 
 I have just encountered the problem with rgl where plot3d figures
 don't interact with the mouse. My plots zoom in and out with the mouse
 wheel but the mouse buttons do nothing. I can't rotate the plot.
 
 This has been mentioned and discussed here and in other lists before,
 and the solution is to turn off Ubuntu's fancy graphics.  Back in
 March, Ben Bolker said:
 
 
 unfortunately rgl and compiz/etc. both try to use
 the same OpenGL interface, so you can't use both at
 the same time.
 
 
 This has echoes of when TCP/IP was in its infancy back in the days of
 DOS, and only one program could access the network interface at a time
 (until TCP/IP software got its act together). Is OpenGL really in the
 same position now? Or is Compiz being greedy in some sense? Surely
 two OpenGL applications can run at the same time? Or is it because rgl
 is running 'within' another OpenGL window already, so there's some
 nesting problem going on?
 
  Google Earth works fine, and I think that uses OpenGL. Anyone had any
 ideas since March?
 
 I'm on Ubuntu 8.04 and R 2.7.1
 
 Barry

  Unfortunately, an apparently knowledgeable compiz person
said:

This is a limitation of DRI. DRI2 should fix this, and should hopefully be in
most drivers by Xorg 7.5 (maybe 7.6). NVIDIA has their own implementation, that's
why it works on it.

http://forum.compiz-fusion.org/showthread.php?t=8462

  And poking around,

http://www.phoronix.com/scan.php?page=news_item&px=NjYzNw

sometime in 2009 is the closest I could get to finding
an expected date when this would be available ...

  Ben Bolker



Re: [R] mob(party) formula question

2008-08-13 Thread Birgitle

Many thanks for your answer and the code that you offered me.

I get this error message after calling mob (see the example I gave).
I guess it has something to do with the missing values?

The iris example also works fine for me.

Sorry that I do not know enough statistics to really understand the
following:


Achim Zeileis wrote:
 
 
 .
 For the variables for which a linear specification makes sense (at least
 in each component) then you should include them for modeling. And those
 variables for which it is not clear a priori what a useful parametric
 specification would be should be used as partitioning variables. 
 ...
 
 

What do you mean by linear specification? I would be very happy if you
could explain.

Thanks again

B.



Achim Zeileis wrote:
 
 On Wed, 13 Aug 2008, Birgitle wrote:
 
 I am trying to use mob() with my data.frame ('data.frame': 288 obs. of 81
 variables; factors, numerics and ordered factors).
 My response is a binary variable, so for modelling I should use a
 logistic regression (family = binomial).

 I read in the MOB vignette that I could use a formula like this if I
 would like to have only partitioning variables apart from the response:

 Test.mob <- mob(Resp~1|Var1+Var2+, data=dataframe, model=glinearModel,
 family=binomial())
 
 This works for me. Considering an example that is easily reproducible: 
 classifying just two (out of three) species in the iris data.
 
 iris2 <- iris[-(1:50),]
 iris2$Species <- factor(iris2$Species)
 mb <- mob(Species ~ 1 | Petal.Length + Petal.Width + Sepal.Length +
 Sepal.Width, data = iris2, model = glinearModel, family = binomial())
 
 
 
 


-
The art of living is more like wrestling than dancing.
(Marcus Aurelius)


[R] Tiny help for tiny function

2008-08-13 Thread Birgitle

I have just started to write tiny functions, so I apologise in advance
if I am asking a stupid question.

I wrote a tiny function that returns, from the original matrix, a matrix
showing only the values smaller than -0.8 or bigger than 0.8.

y <- c(0.1,0.2,0.3,-0.8,-0.4,0.9) 
x <- c(0.5,0.3,0.9,-0.9,-0.7,0.3)

XY <- rbind(x,y)

extract.values <- function (x)
{
if(x>=0.8|x<=-0.8)x
else("low corr.")

}

This works:

Test <- sapply(XY,extract.values,simplify=FALSE)

But now I am trying to handle NA values in the matrix.
I tried this:

extract.values <- function (x)
{
if(x>=0.8|x<=-0.8|x==NA)x
else("low corr.")

}

This does not work:

x <- c(0.5,0.3,0.9,-0.9,-0.7,0.3)
y <- c(0.1,0.2,NA,-0.8,-0.4,0.9)

XY <- rbind(x,y)

Testi <- sapply(XY,extract.values,simplify=FALSE)

Error in if (x >= 0.8 | x <= -0.8 | x == NA) x else ("low corr.") : 
  missing value where TRUE/FALSE needed

How can I do this correctly?

Thanks for help

B.

-
The art of living is more like wrestling than dancing.
(Marcus Aurelius)


[R] The standard deviation of measurement 1 with respect to measurement 2

2008-08-13 Thread Firas Swidan
Hi,

I have two (different types of) measurements, say X and Y, resulting from
the same set of experiments. So X and Y are paired: (x_1, y_1), (x_2, y_2),
...

I am trying to calculate the standard deviation of Y with respect to X. In
other words, in terms of the scatter plot of X and Y, I would like to divide
it into bins along the X-axis and for each bin calculate the standard
deviation along the Y results in that bin. (Though I am not totally sure,
this seems to remind me of the conditional expectation of Y given X - maybe
it is called the conditional deviation?)

Is there a built-in procedure in R for calculating the above? Otherwise,
what would be the easiest way to achieve it? (factors maybe?)
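
The binning described above can be sketched with cut() and tapply() from base R. The data here are simulated and the bin edges (width 2 on the X-axis) are an arbitrary assumption; real data would need breaks chosen to suit the range and density of X.

```r
set.seed(42)

# Simulated paired measurements; the spread of y grows with x
x <- runif(200, min = 0, max = 10)
y <- rnorm(200, mean = x, sd = 0.2 + 0.1 * x)

# Turn x into a factor of bins along the X-axis
bins <- cut(x, breaks = seq(0, 10, by = 2), include.lowest = TRUE)

# Standard deviation of y within each bin of x
sd_by_bin <- tapply(y, bins, sd)

# Number of observations per bin, useful for judging reliability
n_by_bin <- table(bins)
```

This is an estimate of the conditional standard deviation of Y given X, evaluated bin by bin.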

Thankful for the help,
Firas.



Re: [R] Tiny help for tiny function

2008-08-13 Thread Henrique Dallazuanna
You can do this:

ifelse(XY >= 0.8 | XY <= -0.8 | is.na(XY), XY, "low corr")
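
A short sketch of why the x == NA test fails and how is.na() avoids the problem. The function name extract_values is illustrative only; the data are those from the question.

```r
x <- c(0.5, 0.3, 0.9, -0.9, -0.7, 0.3)
y <- c(0.1, 0.2, NA, -0.8, -0.4, 0.9)
XY <- rbind(x, y)

# Any comparison with NA yields NA, never TRUE or FALSE,
# which is what if() complains about
na_compare <- NA == NA

# Element-wise version: is.na() is tested first, and || short-circuits,
# so the comparisons are never evaluated for a missing value
extract_values <- function(v) {
  if (is.na(v) || v >= 0.8 || v <= -0.8) v else "low corr."
}
res_list <- sapply(XY, extract_values, simplify = FALSE)

# Vectorised version: works on the whole matrix and keeps its dimensions
res_mat <- ifelse(is.na(XY) | XY >= 0.8 | XY <= -0.8, XY, "low corr.")
```

Note that the ifelse() result is a character matrix, because numbers and the "low corr." label are mixed in one object.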



-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O



Re: [R] Tiny help for tiny function

2008-08-13 Thread Birgitle

Many thanks.

Much easier than my solution

B.




-
The art of living is more like wrestling than dancing.
(Marcus Aurelius)


Re: [R] mob(party) formula question

2008-08-13 Thread Achim Zeileis

On Wed, 13 Aug 2008, Birgitle wrote:


Many thanks for your answer and the code that you offered me.

I get this error message after calling mob (see the example I gave).
I guess it has something to do with the missing values?


Yes, you have to handle NAs in advance if you want to fit that model. 
We'll try to fix that in future versions.
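
One way to handle the NAs in advance is to keep only the rows that are complete in the variables actually used in the formula. This is a sketch on a made-up data frame: the column names Resp, Var1 and Var2 follow the question, while Other and all values are invented for illustration.

```r
# Toy data frame with missing values in several columns
df <- data.frame(Resp  = factor(c("a", "b", "a", "b", "a", "b")),
                 Var1  = c(1.2, NA, 3.1, 0.4, 2.2, 1.9),
                 Var2  = c(5, 3, NA, 4, 2, 6),
                 Other = c(NA, 1, 2, 3, 4, 5))

# Only the variables that appear in the model formula matter
vars <- c("Resp", "Var1", "Var2")

# Drop rows with an NA in any of those variables; NAs in 'Other' are ignored
df_complete <- df[complete.cases(df[, vars]), ]
```

The reduced data frame can then be passed as the data argument of the mob() call. Whether dropping those rows is acceptable depends on why the values are missing.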



The iris example also works fine for me.

Sorry that I do not know enough statistics to really understand the
following:


Achim Zeileis wrote:



.
For the variables for which a linear specification makes sense (at least
in each component) then you should include them for modeling. And those
variables for which it is not clear a priori what a useful parametric
specification would be should be used as partitioning variables.
...




What do you mean by linear specification? I would be very happy if you
could explain.


Well, in each node you fit a logistic regression model. This is a 
(generalized) linear model, hence the variables included have a linear 
influence (on the link scale) within each node. The partitioning variables 
on the other hand capture step-shaped influences (if they are selected by 
the algorithm). See the references on ?mob for further details.
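
The distinction can be illustrated with simulated data in base R. This is only a sketch: the split point 0.5 is assumed to be known here, whereas mob() would search for it, and all names and coefficients are invented.

```r
set.seed(123)
n <- 1000

x <- rnorm(n)   # regressor: linear influence on the logit scale
z <- runif(n)   # partitioning variable: step-shaped influence at 0.5

# Intercept and slope of the logistic model jump when z crosses 0.5
eta  <- ifelse(z <= 0.5, -1 + 0.5 * x, 1 + 2 * x)
resp <- rbinom(n, size = 1, prob = plogis(eta))
d <- data.frame(resp, x, z)

# One logistic regression per "node" of the assumed partition
fit_left  <- glm(resp ~ x, family = binomial(), data = d[d$z <= 0.5, ])
fit_right <- glm(resp ~ x, family = binomial(), data = d[d$z >  0.5, ])

# Node-wise coefficients, one row per node
coefs <- rbind(left = coef(fit_left), right = coef(fit_right))
```

Within each node the effect of x is linear on the link scale, while z only decides which set of coefficients applies.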


Best,
Z



Re: [R] summary.manova rank deficiency error + data

2008-08-13 Thread Pedro Mardones
Thanks for the reply. The SAS output is attached, but it seems to me that it
doesn't correspond to the within-row contrasts you suggested. By
the way, yes, the data are highly correlated; in fact, each row
corresponds to the first part of a signal vector. Thanks anyway.
PM


                              The GLM Procedure
                      Multivariate Analysis of Variance
                            E = Error SSCP Matrix

                y1            y2            y3            y4            y5
 y1   0.0353518799   0.035256904  0.0351327804  0.0349749601  0.0347868018
 y2    0.035256904  0.0351627227  0.0350395053  0.0348827098  0.0346956744
 y3   0.0351327804  0.0350395053  0.0349173343  0.0347617352  0.0345760232
 y4   0.0349749601  0.0348827098  0.0347617352  0.0346075203  0.0344233531
 y5   0.0347868018  0.0346956744  0.0345760232  0.0344233531  0.0342409225

   Partial Correlation Coefficients from the Error SSCP Matrix / Prob > |r|

 DF = 28          y1          y2          y3          y4          y5
 y1             1.00        0.92        0.67        0.21    0.999852
                          <.0001      <.0001      <.0001      <.0001
 y2             0.92        1.00        0.91        0.63        0.11
              <.0001                  <.0001      <.0001      <.0001
 y3             0.67        0.91        1.00        0.90        0.58
              <.0001      <.0001                  <.0001      <.0001
 y4             0.21        0.63        0.90        1.00        0.89
              <.0001      <.0001      <.0001                  <.0001
 y5         0.999852        0.11        0.58        0.89        1.00
              <.0001      <.0001      <.0001      <.0001

The SAS System 10:33 Wednesday, August 13, 2008   8
                              The GLM Procedure
                      Multivariate Analysis of Variance
                      H = Type III SSCP Matrix for group

                y1            y2            y3            y4            y5
 y1   0.0023822408   0.002365848  0.0023471328  0.0023261249  0.0023030993
 y2    0.002365848  0.0023495679  0.0023309816  0.0023101183  0.0022872511
 y3   0.0023471328  0.0023309816  0.0023125426  0.0022918453  0.0022691608
 y4   0.0023261249  0.0023101183  0.0022918453  0.0022713359  0.0022488593
 y5   0.0023030993  0.0022872511  0.0022691608  0.0022488593  0.0022266141

          Characteristic Roots and Vectors of: E Inverse * H, where
                      H = Type III SSCP Matrix for group
                            E = Error SSCP Matrix

 Characteristic                    Characteristic Vector  V'EV=1
           Root   Percent          y1          y2          y3          y4          y5
     0.41840103     71.72   -7542.628   17131.814    5347.394  -31627.317   16700.100
     0.16496011     28.28   -4180.854   -4413.446   32096.035  -35545.204   12040.697
         0.0001      0.00  -41004.875  107291.004  -95905.664   32641.189   -3028.470
             0.      0.00    -416.226    -111.206     410.721     295.193    -171.953
             0.      0.00  -14678.651    5787.997   54718.250  -69055.249   23218.580

 MANOVA Test Criteria and F Approximations for the Hypothesis of No Overall group Effect
                      H = Type III SSCP Matrix for group
                            E = Error SSCP Matrix

                              S=2    M=1    N=11

 Statistic                      Value   F Value   Num DF   Den DF   Pr > F
 Wilks' Lambda             0.60518744      1.37       10       48   0.2227
 Pillai's Trace            0.43658228      1.40       10       50   0.2095
 Hotelling-Lawley Trace    0.58336114      1.37       10   33.362   0.2385
 Roy's Greatest Root       0.41840103      2.09        5       25   0.1000
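
For comparison, the rank check that summary.manova applies to the error SSCP matrix can be reproduced in R. The sketch below uses simulated responses, not the posted data: the group sizes match the posted factor, but y5 is deliberately constructed as an exact linear combination of y1 and y2 so that the residuals are rank deficient. The tolerance 1e-7 is the documented default of the tol argument of summary.manova.

```r
set.seed(1)

# Three groups with the same sizes as in the posted data (3, 18, 10)
g <- factor(rep(1:3, times = c(3, 18, 10)))
n <- length(g)

# Four independent responses plus one exact linear combination
y1 <- rnorm(n); y2 <- rnorm(n); y3 <- rnorm(n); y4 <- rnorm(n)
Y <- cbind(y1, y2, y3, y4, y5 = y1 + y2)

fit <- manova(Y ~ g)

# Error SSCP matrix and its numerical rank at the default tolerance
E <- crossprod(residuals(fit))
E_rank <- qr(E, tol = 1e-7)$rank

# summary() stops when the residual rank is below the number of responses
msg <- tryCatch({ summary(fit); "no error" },
                error = function(e) conditionMessage(e))
```

With nearly (rather than exactly) collinear responses, whether the check passes depends on the tolerance, which may be one reason two programs disagree on the same data.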










On Wed, Aug 13, 2008 at 4:34 AM, Peter Dalgaard
[EMAIL PROTECTED] wrote:
 Pedro Mardones wrote:

 Dear R-users;

 Previously I posted a question about the problem of rank deficiency in
 summary.manova. As somebody suggested, I'm attaching a small part of
 the data set.

 #***

 test <-

 structure(.Data = list(structure(.Data = c(rep(1,3),rep(2,18),rep(3,10)),
 levels = c("1", "2", "3"),
 class = "factor")


 ,c(0.181829,0.090159,0.115824,0.112804,0.134650,0.249136,0.163144,0.122012,0.157554,0.126283,

 0.105344,0.125125,0.126232,0.084317,0.092836,0.108546,0.159165,0.121620,0.142326,0.122770,

 0.117480,0.153762,0.156551,0.185058,0.161651,0.182331,0.139531,0.188101,0.103196,0.116877,0.113733)


 ,c(0.181445,0.090254,0.115840,0.112863,0.134610,0.249003,0.163116,0.122135,0.157206,0.126129,

 

Re: [R] mob(party) formula question

2008-08-13 Thread Birgitle

Thanks again.
Unfortunately I always have this missing-values problem.
But the missing values also carry meaning, and it's impossible to code them
differently or impute.

Also thanks for the explanation. Now I understand.

B.


Achim Zeileis wrote:
 
 On Wed, 13 Aug 2008, Birgitle wrote:
 
 Many thanks for your answer and the code that you offered me.

 I get this error message after calling mob (look at my given example).
 I guess it has something to do with the missings?
 
 Yes, you have to handle NAs in advance if you want to fit that model. 
 We'll try to fix that in future versions.
 
 The iris example works also fine for me.

 Sorry that I am not enough into statistics to really understand the
 following:


 Achim Zeileis wrote:


 .
 For the variables for which a linear specification makes sense (at least
 in each component) then you should include them for modeling. And those
 variables for which it is not clear a priori what a useful parametric
 specification would be should be used as partitioning variables.
 ...



 What do you mean with linear specification? I would be very happy if
 you
 could explain.
 
 Well, in each node you fit a logistic regression model. This is a 
 (generalized) linear model, hence the variables included have a linear 
 influence (on the link scale) within each node. The partitioning variables 
 on the other hand capture step-shaped influences (if they are selected by 
 the algorithm). See the references on ?mob for further details.
 
 Best,
 Z
 
 
 


-
The art of living is more like wrestling than dancing.
(Marcus Aurelius)


Re: [R] mob(party) formula question

2008-08-13 Thread Achim Zeileis

On Wed, 13 Aug 2008, Birgitle wrote:


Thanks again.
Unfortunately I always have this missing-values problem.
But the missing values also carry meaning, and it's impossible to code them
differently or impute.


That's ok. Just to clarify: NAs are not allowed in the response or the 
modeling variables. In principle, it would be possible to have NAs in the 
partitioning variables and try to handle it with surrogate splits. 
Surrogates are not yet implemented in mob(), but we are 
working on infrastructure for this.


So the only work-around easily available at the moment is to call 
na.omit() (on the relevant variables only).
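
A minimal sketch of that work-around in base R. The data frame `dat` and its columns (`y` as response, `x` as modeling variable, `z` as partitioning variable, `note` as an unused column) are hypothetical names chosen for illustration; they do not come from the thread. The point is that rows are dropped only when a variable that enters the model is missing.

```r
# Hypothetical data: y = response, x = modeling variable,
# z = partitioning variable, note = a column not used in the model.
dat <- data.frame(
  y    = c(1, 0, NA, 1, 0, 1),
  x    = c(2.1, NA, 3.3, 1.8, 2.5, 2.9),
  z    = c("a", "b", "a", "b", "a", "b"),
  note = c(NA, "ok", "ok", NA, "ok", "ok")
)

# Variables that actually enter the model.
relevant <- c("y", "x", "z")

# Variant 1: na.omit() on the relevant columns only.
dat_model <- na.omit(dat[, relevant])

# Variant 2: keep all columns, but drop rows with an NA in a relevant one.
dat_full <- dat[complete.cases(dat[, relevant]), ]

nrow(dat_model)  # rows 2 and 3 are dropped; NAs in 'note' do not matter
```

Calling na.omit() on the whole data frame instead would also discard rows 1 and 4, which are complete as far as the model is concerned.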


Best,
Z




Re: [R] Comination of two barcharts and one xyplot

2008-08-13 Thread ravi
Hi Rhelpers,
Thanks a lot, Stephen, for showing me the way to get a data frame into a 
pasteable format with the dput command. 
My code is given below with the new correction. This should work, as Stephen 
says, right off the bat :-)
## df1 is the first data frame
df1 <- structure(list(Year = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 8L, 
7L), .Label = c("2003", "2005", "2007", "2009", "2011", "2013", 
"2015K", "2015M"), class = "factor"), KI = c(15.53, 15.64, 16.18, 
17.09, 22.39, 33.83, 44.91, 52.22), G48 = c(0.3, 0.29, 0.49, 
0.67, 0.93, 1.29, 1.83, 2.14), AvCell = c(0.24, 0.33, 0.59, 0.91, 
1.24, 1.87, 2.71, 3.15), HB = c(37.45, 34.64, 30.32, 29.47, 38.03, 
58.37, 75.54, 87.71), Htens = c(0.76, 1.12, 1.63, 2.27, 3.11, 
4.43, 6.28, 7.34), Impact = c(1.16, 1.78, 4.23, 6.76, 9.17, 14.06, 
20.57, 23.88), Struct = c(3.02, 4.2, 6.67, 9.68, 13.18, 19.41, 
27.51, 31.98), Tens = c(34.05, 32.88, 30.06, 29.25, 37.84, 57.6, 
74.5, 86.57), Year.ord = structure(1:8, .Label = c("2003", "2005", 
"2007", "2009", "2011", "2013", "2015M", "2015K"), class = c("ordered", 
"factor"))), .Names = c("Year", "KI", "G48", "AvCell", "HB", 
"Htens", "Impact", "Struct", "Tens", "Year.ord"), row.names = c(NA, 
-8L), class = "data.frame")
## L1 is the second data frame
L1 <- structure(list(Year = c(2009L, 2011L, 2013L), KIL = c(20, 24, 
30), G48L = c(1, 1, 1), AvCellL = c(1, 1.5, 2), HBL = c(30, 35, 
40), HtensL = c(2, 3, 4), ImpactL = c(10, 12, 14), StructL = c(10, 
13, 16), TensL = c(35, 38, 45)), .Names = c("Year", "KIL", "G48L", 
"AvCellL", "HBL", "HtensL", "ImpactL", "StructL", "TensL"), class = 
"data.frame", row.names = c(NA, 
-3L))
## Use the reshape package to melt the data frame
library(reshape)
df1m <- melt(df1, id = c("Year", "Year.ord"))
## Use the lattice package to plot the barchart
library(lattice)
attach(df1m)
barchart(value ~ Year.ord | variable, scales = list(y = "free", x = list(rot = 90)),
         xlab = "Year", ylab = "No. of Tests *1000", col = "blue")
This plot works just fine. But I want to go beyond this. What I want, in each 
panel of the lattice barchart, is to plot histograms of the relevant variable 
(KI, G48 etc) in one colour for the years 2003 to 2007, and in another colour 
for the other years. On top of this, I want to have a line plot in each panel 
with the limits for different years given in the second data frame L1 (as bold 
lines). 
I would like to have information on the following points :
1. How can I get a combination of these plots in every panel (two histograms 
and one line plot)? Is it possible?
2. Is it easier to do this with ggplot?
3. I would like to know how I can present the legend also.
Will appreciate any help that I can get.
Thanking You,
Ravi


- Original Message 
From: stephen sefick [EMAIL PROTECTED]
To: ravi [EMAIL PROTECTED]
Cc: r-help@r-project.org
Sent: Wednesday, 13 August, 2008 3:14:54 PM
Subject: Re: [R] Comination of two barcharts and one xyplot

not reproducible

On Wed, Aug 13, 2008 at 9:07 AM, ravi [EMAIL PROTECTED] wrote:
 Hi Rhelpers,
 I would like to have some help with a plot which is beyond my capabilities. 
 This plot that I am seeking involves an overlay of two different barcharts 
 and one xyplot.
 The code that I have used is the following :
 #save(df1,file="M:\\KBR\\df1.RData")
 load(file="M:\\KBR\\df1.RData")
 # df1$Year.ord created to obtain the right order i.e. 2015M < 2015K
 Year.ord <- ordered(Year,levels=c('2003','2005','2007','2009','2011','2013','2015M','2015K'))
 # Use reshape package to melt the data frame
 library(reshape)
 df1m <- melt(df1,id=c("Year","Year.ord"))
 library(lattice)
 attach(df1m)
 barchart(value~Year.ord|variable,scales=list(y="free",x=list(rot=90)),xlab="Year",
  ylab="No. of Tests *1000",col="blue")
 This plot works just fine. But I want to go beyond this. My first data frame 
 (df1) is :
 Year,KI,G48,AvCell,HB,Htens,Impact,Struct,Tens,Year.ord
 1,2003,15.53,0.3,0.24,37.45,0.76,1.16,3.02,34.05,2003
 2,2005,15.64,0.29,0.33,34.64,1.12,1.78,4.2,32.88,2005
 3,2007,16.18,0.49,0.59,30.32,1.63,4.23,6.67,30.06,2007
 4,2009,17.09,0.67,0.91,29.47,2.27,6.76,9.68,29.25,2009
 5,2011,22.39,0.93,1.24,38.03,3.11,9.17,13.18,37.84,2011
 6,2013,33.83,1.29,1.87,58.37,4.43,14.06,19.41,57.6,2013
 7,2015M,44.91,1.83,2.71,75.54,6.28,20.57,27.51,74.5,2015M
 8,2015K,52.22,2.14,3.15,87.71,7.34,23.88,31.98,86.57,2015K
 My second data frame is (L1) is :
 Year,KIL,G48L,AvCellL,HBL,HtensL,ImpactL,StructL,TensL
 1,2009,20,1,1,30,2,10,10,35
 2,2011,24,1,1.5,35,3,12,13,38
 3,2013,30,1,2,40,4,14,16,45
 What I want, in each panel of the lattice barchart, is to plot histograms of 
 the relevant variable (KI, G48 etc) in one colour for the years 2003 to 2007, 
 and in another colour for the other years. On top of this, I want to have a 
 line plot in each panel with the limits for different years given in the 
 second data frame L1 (as bold lines).
 I would like to have information on the following points :
 1. How can I get a combination of these plots in every panel (two histograms 
 and one line plot)? Is it possible?
 2. Is it easier to do this with ggplot?
 3. I would like to know how I can present the legend 

Re: [R] ignoring zeros or converting to NA

2008-08-13 Thread Roland Rau

Hi,

since many suggestions are following the form of
x[x==0] (or similar)
I would like to ask if this is really recommended?
What I have learned (the hard way) is that one should not test for 
equality of floating point numbers (which is the default for R's numeric 
values, right?) since the binary representation of these (decimal) 
floating point numbers is not necessarily exact (with the classic 
example of decimal 0.1).
Is it okay in this case for the value zero where all binary elements are 
zero? Or does R somehow recognize that it is an integer?


Just some questions out of curiosity.

Thank you,
Roland


rcoder wrote:

Hi everyone,

I have a matrix that has a combination of zeros and NAs. When I perform
certain calculations on the matrix, the zeros generate Inf values. Is
there a way to either convert the zeros in the matrix to NAs, or only
perform the calculations if not zero (i.e. like using something similar to
an !all(is.na() construct)?

Thanks,

rcoder
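
As an illustration of the two options rcoder asks about, here is a small sketch on a made-up matrix `m` (the data and the reciprocal as the "certain calculation" are assumptions for the example, not taken from the thread).

```r
# Toy matrix containing both zeros and NAs.
m <- matrix(c(4, 0, NA, 2, 0, 5), nrow = 2)

# Option 1: turn zeros into NA before computing.
# which() drops the NA comparisons, so existing NAs are left alone.
m_na <- m
m_na[which(m_na == 0)] <- NA
inv1 <- 1 / m_na            # NA where m was 0 or NA, never Inf

# Option 2: keep m as it is and compute only where the entry is
# non-missing and non-zero.
ok <- !is.na(m) & m != 0
inv2 <- ifelse(ok, 1 / m, NA)
```

Both variants give the same result here; the first changes the data once, the second leaves the zeros in place for other uses.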




Re: [R] which(df$name=="A") takes ~1 second! (df is very large), but can it be speeded up?

2008-08-13 Thread Emmanuel Levy
Dear Peter and Henrik,

Thanks for your replies - this helps speed up a bit, but I thought
there would be something much faster.

What I mean is that I thought that a particular value of a level
could be accessed instantly, similarly to a hash key.

Since I've got about 6000 levels in that data frame, it means that
making a list L of the form
L[[1]] = values of name 1
L[[2]] = values of name 2
L[[3]] = values of name 3
...
would take ~1hour.

Best,

Emmanuel




2008/8/12 Henrik Bengtsson [EMAIL PROTECTED]:
 To simplify:

 n <- 2.7e6;
 x <- factor(c(rep("A", n/2), rep("B", n/2)));

 # Identify 'A':s
 t1 <- system.time(res <- which(x == "A"));

 # To compare a factor to a string, the factor is in practice
 # coerced to a character vector.
 t2 <- system.time(res <- which(as.character(x) == "A"));

 # Interestingly enough, this seems to be faster (repeated many times)
 # Don't know why.
 print(t2/t1);
     user   system  elapsed
 0.632653 1.600000 0.754717

 # Avoid coercing the factor, but instead coerce the level compared to
 t3 <- system.time(res <- which(x == match("A", levels(x))));

 # ...but gives no speed up
 print(t3/t1);
     user   system  elapsed
 1.041667 1.000000 1.018182

 # But coercing the factor to integers does
 t4 <- system.time(res <- which(as.integer(x) == match("A", levels(x))));
 print(t4/t1);
      user    system   elapsed
 0.4166667 0.0000000 0.3636364

 So, the latter seems to be the fastest way to identify those elements.
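
The idea behind that last variant can be sketched on a tiny factor. The vector `x` below is made up for illustration; the timings above are Henrik's and are not reproduced here.

```r
# A factor is stored as integer codes plus a character vector of levels.
x <- factor(c("A", "B", "A", "C", "B", "A"))

levels(x)       # the distinct names
as.integer(x)   # the per-element integer codes

# Look up the code for level "A" once, then compare integers only.
code_A  <- match("A", levels(x))
idx_int <- which(as.integer(x) == code_A)

# The plain string comparison selects the same elements.
idx_chr <- which(x == "A")
```

The integer route does one string lookup against the levels and then a pure integer comparison over the data, which is presumably where the speedup in the timings comes from.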

 My $.02

 /Henrik


 On Tue, Aug 12, 2008 at 7:31 PM, Peter Cowan [EMAIL PROTECTED] wrote:
 Emmanuel,

 On Tue, Aug 12, 2008 at 4:35 PM, Emmanuel Levy [EMAIL PROTECTED] wrote:
 Dear All,

 I have a large data frame (2700000 lines and 14 columns), and I would like to
 extract the information in a particular way illustrated below:


 Given a data frame df:

 > col1=sample(c(0,1),10, rep=T)
 > names = factor(c(rep("A",5),rep("B",5)))
 > df = data.frame(names,col1)
 > df
    names col1
 1      A    1
 2      A    0
 3      A    1
 4      A    0
 5      A    1
 6      B    0
 7      B    0
 8      B    1
 9      B    0
 10     B    0

 I would like to transform it into the form:

 > index = c("A","B")
 > col1[[1]]=df$col1[which(df$name=="A")]
 > col1[[2]]=df$col1[which(df$name=="B")]

 I'm not sure I fully understand your problem; your example would not run for
 me.

 You could get a small speedup by omitting which(), since you can also subset
 by a logical vector.

 > n <- 2700000
 > foo <- data.frame(
 +   one = sample(c(0,1), n, rep = T),
 +   two = factor(c(rep("A", n/2 ),rep("B", n/2 )))
 +   )
 > system.time(out <- which(foo$two=="A"))
    user  system elapsed
   0.566   0.146   0.761
 > system.time(out <- foo$two=="A")
    user  system elapsed
   0.429   0.075   0.588

 You might also find use for unstack(), though I didn't see a speedup.
 > system.time(out <- unstack(foo))
    user  system elapsed
   1.068   0.697   2.004

 HTH

 Peter

 My problem is that the command:  *** which(df$name=="A") ***
 takes about 1 second because df is so big.

 I was thinking that a level could maybe be accessed instantly but I am not
 sure about how to do it.

 I would be very grateful for any advice that would allow me to speed this 
 up.

 Best wishes,

 Emmanuel



[R] bmp header

2008-08-13 Thread rostam shahname
Hi R users,
I have a xml file. A value of one of the nodes of the xml file is a bmp
image(RAW format) encoded in base64. I would like to read this image by R. I
think I should do the following steps:

1. Decoding it from base64 to binary.
2. Removing the header of the image file
3. building the matrix

So I wonder if you know how to do the following using R functions:
1. decode from base64 to binary. base64decode does not decode to binary. The
binary file is an openable bmp file.
2. Remove the header of bmp image and produce a matrix which has the color
values.

My main goal is producing the matrix which has the color values. If the
aforementioned steps don't look plausible, what steps would you suggest?

Right now I produce the matrix using the following steps, but I wonder if I
can avoid using ImageMagick and Python.

1. Decoding from base64 to binary using a Python function. After decoding I
have an openable image file.
2. Converting bmp format to pnm using the ImageMagick program
3. Reading pnm format using the pixmap library in R. The function read.pnm
produces a pixmap object
4. Producing matrices using the pixmap object
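
For the header-stripping step, a rough base-R sketch follows. It assumes the base64 text has already been decoded into a raw vector by whatever decoder is available (that step is not shown), and that the image is an uncompressed 24-bit BMP with the common 40-byte info header and bottom-up row order. The toy 2 x 2 image, the helper names `le` and `rd`, and the variable names are all made up for illustration.

```r
# Helper used only to build the toy file: integer -> little-endian raw bytes.
le <- function(x, size) writeBin(as.integer(x), raw(), size = size, endian = "little")

# A hand-made 2 x 2, 24-bit BMP: 54 header bytes + 2 padded rows of 8 bytes.
bmp <- c(
  charToRaw("BM"), le(70, 4), le(0, 4), le(54, 4),     # file header (14 bytes)
  le(40, 4), le(2, 4), le(2, 4), le(1, 2), le(24, 2),  # size, width, height, planes, bpp
  le(0, 4), le(16, 4), le(2835, 4), le(2835, 4), le(0, 4), le(0, 4),
  as.raw(c(255, 0, 0,  255, 255, 255,  0, 0,           # bottom row: blue, white, padding
           0, 0, 255,  0, 255, 0,      0, 0))          # top row: red, green, padding
)

# Read one little-endian integer starting at byte 'from' (1-based).
rd <- function(bytes, from, size) {
  readBin(bytes[from:(from + size - 1)], what = "integer", n = 1L,
          size = size, endian = "little")
}

offset <- rd(bmp, 11, 4)   # where the pixel data starts
width  <- rd(bmp, 19, 4)
height <- rd(bmp, 23, 4)
bpp    <- rd(bmp, 29, 2)   # bits per pixel
stopifnot(bpp == 24, height > 0)   # the sketch handles only this case

# Each row is padded to a multiple of 4 bytes.
row_bytes <- ceiling(width * bpp / 32) * 4
px   <- bmp[(offset + 1):(offset + row_bytes * height)]
rows <- matrix(as.integer(px), nrow = height, byrow = TRUE)

# Rows are stored bottom-up; flip them and drop the padding columns.
rows <- rows[height:1, seq_len(3 * width), drop = FALSE]

# Pixels are stored in blue, green, red order.
blue  <- rows[, seq(1, 3 * width, by = 3), drop = FALSE]
green <- rows[, seq(2, 3 * width, by = 3), drop = FALSE]
red   <- rows[, seq(3, 3 * width, by = 3), drop = FALSE]
```

Other BMP variants (palettes, compression, top-down rows, other bit depths) would need extra handling that this sketch does not attempt.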

Thanks for your help,
Rostam



Re: [R] rgl/compiz problem

2008-08-13 Thread Barry Rowlingson
2008/8/13 Ben Bolker [EMAIL PROTECTED]:
 Barry Rowlingson b.rowlingson at lancaster.ac.uk writes:


 I have just encountered the problem with rgl where plot3d figures
 don't interact with the mouse. My plots zoom in and out with the mouse
 wheel but the mouse buttons do nothing. I can't rotate the plot.


 I just showed this problem to a colleague who also uses fancy wobbly
windows on his Ubuntu box, and he had the same problem. But then with
some random frustrated clicking his scatterplot moved! It rotated
slightly! What? How did he do that? He didn't know!

 So he tried mouse buttons in combination. Holding B1 and then B3 and
moving the mouse resulted in a zoom operation. Holding first B3 and
then B1 resulted in rotation functionality. B3 and then B2 resulted in
the field-of-view change operation. These three operations were what
should have happened with B3, B1 and B2 presses on their own.
Seemingly the mouse presses didn't get through to rgl unless another
mouse button was held down. It's quite general. Hold Bx and then hold
By and you get the functionality of By.

 Back in my office, the same things worked for me too. So as long as I
do that, everything is fine for me. Unfortunately my colleague has the
problem of not having any window decorations and having the rgl window
go invisible when moved... Oh well, he can't win them all!

 Barry



Re: [R] which(df$name=="A") takes ~1 second! (df is very large), but can it be speeded up?

2008-08-13 Thread Erik Iverson
I still don't understand what you are doing.  Can you make a small 
example that shows what you have and what you want?


Is ?split what you are after?
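
A small sketch of what split() would give on the toy data from the original post. The column names `names` and `col1` follow that post; the values are fixed here rather than sampled so the result is predictable.

```r
# Toy data in the shape of the original example.
df <- data.frame(
  names = factor(c("A", "A", "B", "A", "B", "C")),
  col1  = c(1, 0, 0, 1, 1, 0)
)

# One pass over the data: a named list with one vector per level.
L <- split(df$col1, df$names)

# The names of the list double as the index.
index <- names(L)

L[["B"]]                 # direct lookup by name
L[[match("B", index)]]   # or by position via the index
```

This replaces one which() call per level with a single call over the whole column.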



Re: [R] ignoring zeros or converting to NA

2008-08-13 Thread Thomas Lumley


Integers (up to a fairly high limit) are represented exactly, as are fractions 
whose denominator is a power of two (again up to a fairly high limit), so x==0 
is fine in that sense.

If x is computed by floating point operations you do have to worry whether 
these are exact, eg, with
  x <- seq(-1,1,length=7)
it is not clear that the fourth element will be exactly zero.
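
A few lines that illustrate the distinction. The helper `near_zero` and its tolerance are assumptions made for this example, not something from the thread.

```r
# Small integers and fractions with a power-of-two denominator are exact.
x <- c(3, 0, 0.5, 0)
x == 0                      # safe for stored values like these
0.5 + 0.25 == 0.75          # TRUE

# Decimal fractions such as 0.1 are not exact, so computed results can miss.
0.1 + 0.2 == 0.3            # FALSE

# For computed values, compare with a tolerance instead.
isTRUE(all.equal(0.1 + 0.2, 0.3))

# Hypothetical helper: "zero up to rounding error".
near_zero <- function(v, tol = sqrt(.Machine$double.eps)) abs(v) < tol
near_zero(0.1 + 0.2 - 0.3)
```

So x == 0 is a reasonable test for zeros that were read in or assigned, and a tolerance is the safer choice for zeros that come out of arithmetic.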

-thomas





Thomas Lumley   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]   University of Washington, Seattle



Re: [R] rgl/compiz problem

2008-08-13 Thread Ben Bolker

Barry Rowlingson wrote:

2008/8/13 Ben Bolker [EMAIL PROTECTED]:

Barry Rowlingson b.rowlingson at lancaster.ac.uk writes:

 So he tried mouse buttons in combination. Holding B1 and then B3 and
moving the mouse resulted in a zoom operation. Holding first B3 and
then B1 resulted in rotation functionality. B3 and then B2 resulted in
the field-of-view change operation. These three operations were what
should have happened with B3, B1 and B2 presses on their own.
Seemingly the mouse presses didn't get through to rgl unless another
mouse button was held down. It's quite general. Hold Bx and then hold
By and you get the functionality of By.

 Back in my office, the same things worked for me too. So as long as I
do that, everything is fine for me. Unfortunately my colleague has the
problem of not having any window decorations and having the rgl window
go invisible when moved... Oh well, he can't win them all!

 Barry


   Interesting.  I can confirm that this works for me too, although
since I'm emulating B3 with a two-button mouse that obviously doesn't
work ... also, the window behavior is extremely erratic (no decorations,
window doesn't always come to front when it should, doesn't disappear
immediately when closed, etc.)
   Since my machine tends to lock up on suspend when the fancy graphics
are on, I'll probably continue without them for the time being ...
(I suppose I could write a script to toggle the graphics mode on
suspend/wake, but ugh)

  Ben



Re: [R] which(df$name=="A") takes ~1 second! (df is very large), but can it be speeded up?

2008-08-13 Thread Emmanuel Levy
Sorry for being unclear, I thought the example above was clear enough.

I have a data frame of the form:

  name   info
1  YAL001C 1
2  YAL001C 1
3  YAL001C 1
4  YAL001C 1
5  YAL001C 0
6  YAL001C 1
7  YAL001C 1
8  YAL001C 1
9  YAL001C 1
10 YAL001C 1
...
...
~2700000 lines, and ~6000 different names.

which corresponds to yeast proteins + some info.
So there are about 6000 names like YAL001C

I would like to transform this data frame into the following form:

1/ a list, where each protein corresponds to an index, and the info is
the vector
 L[[1]]
[1] 1 1 1 1 0 1 1 1 1 1 0 0 0 0 0 0 0 1 1 1 
 L[[2]]
[1] 0 0 0 0 0 0 1 0 1 1 0 0 0 0 0 0 0 0 0 0 
etc.

2/ an index, which gives me the position of each protein in the list:
 index
[1] YAL001C YAL002W YAL003W YAL005C YAL007C ...

I hope this will be clearer!

I'll have a look right now at the split and hash.mat functions.

Thanks for your help,

Emmanuel







[R] Flag variable

2008-08-13 Thread Michael Pearmain
Hi All,

I have 4000 cases which have string variables in them. I want to do some
fuzzy matching and create a new variable of the same length containing 0s
and 1s.
If I use the code
test <- agrep("web Klick", ETC$Exposure.Type, max = 2, ignore.case = TRUE)
it works, but I get
> length(test)
[1] 3127

This returns the indices of the cases that do match. Can someone tell me how to
turn this into a 0/1 variable on the dataset (ETC) that I have?
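
One possible way to do this, sketched on a made-up vector `exposure` that stands in for ETC$Exposure.Type: since agrep() returns the positions of the matches, a 0/1 variable of full length can be built by starting from zeros and setting those positions to 1.

```r
# Hypothetical stand-in for ETC$Exposure.Type.
exposure <- c("Web Click", "banner", "web klick", "email")

# agrep() returns the positions of the approximate matches.
hits <- agrep("web Klick", exposure, max.distance = 2, ignore.case = TRUE)

# Start from all zeros and set the matched positions to 1.
flag <- integer(length(exposure))
flag[hits] <- 1L

# Equivalent one-liner.
flag2 <- as.integer(seq_along(exposure) %in% hits)
```

On the real data the result could then be stored as a new column, for example ETC$flag <- flag (the column name is only a suggestion).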



Re: [R] which(df$name=="A") takes ~1 second! (df is very large), but can it be speeded up?

2008-08-13 Thread Emmanuel Levy
Wow great! Split was exactly what was needed. It takes about 1 second
for the whole operation :D

Thanks again - I can't believe I never used this function in the past.

All the best,

Emmanuel


2008/8/13 Erik Iverson [EMAIL PROTECTED]:
 I still don't understand what you are doing.  Can you make a small example
 that shows what you have and what you want?

 Is ?split what you are after?

 Emmanuel Levy wrote:

 Dear Peter and Henrik,

 Thanks for your replies - this helps speed up a bit, but I thought
 there would be something much faster.

 What I mean is that I thought that a particular value of a level
 could be accessed instantly, similarly to a hash key.

 Since I've got about 6000 levels in that data frame, it means that
 making a list L of the form
 L[[1]] = values of name 1
 L[[2]] = values of name 2
 L[[3]] = values of name 3
 ...
 would take ~1hour.

 Best,

 Emmanuel




 2008/8/12 Henrik Bengtsson [EMAIL PROTECTED]:

 To simplify:

 n <- 2.7e6;
 x <- factor(c(rep("A", n/2), rep("B", n/2)));

 # Identify 'A':s
 t1 <- system.time(res <- which(x == "A"));

 # To compare a factor to a string, the factor is in practice
 # coerced to a character vector.
 t2 <- system.time(res <- which(as.character(x) == "A"));

 # Interestingly enough, this seems to be faster (repeated many times)
 # Don't know why.
 print(t2/t1);
     user   system  elapsed
 0.632653 1.60 0.754717

 # Avoid coercing the factor, but instead coerce the level compared to
 t3 <- system.time(res <- which(x == match("A", levels(x))));

 # ...but gives no speed up
 print(t3/t1);
     user   system  elapsed
 1.041667 1.00 1.018182

 # But coercing the factor to integers does
 t4 <- system.time(res <- which(as.integer(x) == match("A", levels(x))));
 print(t4/t1);
    user system elapsed
 0.417 0.000 0.3636364

 So, the latter seems to be the fastest way to identify those elements.
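
 A small sketch of the integer-code comparison described above, on a toy factor (the data here are made up for illustration). The level is looked up once with match(), and the comparison is then done on the factor's integer codes:

```r
x <- factor(c("A", "B", "A", "C"))

# Position of level "A" among the levels of x
code <- match("A", levels(x))

# Compare the underlying integer codes instead of the labels
idx <- which(as.integer(x) == code)

# Same positions as the straightforward comparison
same <- identical(idx, which(x == "A"))
```

 Whether this is actually faster will depend on the R version and the data, so it is worth timing on the real data frame.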

 My $.02

 /Henrik


 On Tue, Aug 12, 2008 at 7:31 PM, Peter Cowan [EMAIL PROTECTED] wrote:

 Emmanuel,

 On Tue, Aug 12, 2008 at 4:35 PM, Emmanuel Levy [EMAIL PROTECTED]
 wrote:

 Dear All,

 I have a large data frame (2700000 lines and 14 columns), and I would
 like to
 extract the information in a particular way illustrated below:


 Given a data frame df:

 col1=sample(c(0,1),10, rep=T)
 names = factor(c(rep("A",5),rep("B",5)))
 df = data.frame(names,col1)
 df

    names col1
 1      A    1
 2      A    0
 3      A    1
 4      A    0
 5      A    1
 6      B    0
 7      B    0
 8      B    1
 9      B    0
 10     B    0

 I would like to transform it into the form:

 index = c("A","B")
 col1[[1]]=df$col1[which(df$name=="A")]
 col1[[2]]=df$col1[which(df$name=="B")]

 I'm not sure I fully understand your problem; your example would not run
 for me.

 You could get a small speedup by omitting which(), since you can subset
 by a logical vector directly.

 n <- 2700000
 foo <- data.frame(

 +   one = sample(c(0,1), n, rep = T),
 +   two = factor(c(rep("A", n/2 ),rep("B", n/2 )))
 +   )

 system.time(out <- which(foo$two=="A"))

  user  system elapsed
  0.566   0.146   0.761

 system.time(out <- foo$two=="A")

  user  system elapsed
  0.429   0.075   0.588

 You might also find use for unstack(), though I didn't see a speedup.

 system.time(out <- unstack(foo))

  user  system elapsed
  1.068   0.697   2.004

 HTH

 Peter

 My problem is that the command:  *** which(df$name=="A") ***
 takes about 1 second because df is so big.

 I was thinking that a level could maybe be accessed instantly but I
 am not
 sure about how to do it.

 I would be very grateful for any advice that would allow me to speed
 this up.

 Best wishes,

 Emmanuel



Re: [R] summary.manova rank deficiency error + data

2008-08-13 Thread Peter Dalgaard

Pedro Mardones wrote:

Thanks for the reply. The SAS output is attached, but it seems to me that
it doesn't correspond to the within-row contrasts as you suggested. By
the way, yes, the data are highly correlated; in fact each row
corresponds to the first part of a signal vector. Thanks anyway
PM
  
Agreed. I tried disabling the check that causes R to protest, and then 
it gives similar DF but not quite the same statistics, quite possibly 
due to numerical instabilities in one or both systems. (You can easily 
try it yourself: just do anova.mlm <- stats:::anova.mlm and edit the qr() 
call inside.)
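
To illustrate the kind of check involved, here is a rough sketch on simulated data (not the poster's data). It builds five almost identical response columns, forms the residual SSCP matrix, and looks at the rank that qr() reports with its default tolerance. The variable names and the noise level are assumptions made purely for illustration.

```r
set.seed(1)

# Five responses that are almost perfectly correlated (hypothetical data)
base  <- rnorm(30)
Y     <- sapply(1:5, function(j) base + 1e-10 * rnorm(30))
group <- gl(3, 10)

fit <- lm(Y ~ group)

# Residual (error) SSCP matrix, the analogue of E in the SAS output
E <- crossprod(resid(fit))

# Rank as seen by qr() at its default tolerance
r <- qr(E)$rank
rank_deficient <- r < ncol(E)
```

With columns this close to collinear, qr() reports fewer than five independent columns, which is the situation in which R refuses to proceed while another system may carry on with a numerically unstable inverse.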


 anova(lm(cbind(Y1,Y2,Y3,Y4,Y5)~GROUP, test), test = "Wilks")
Analysis of Variance Table

            Df    Wilks approx F num Df den Df Pr(>F)
(Intercept)  1 0.002537  1887.24      5     24 <2e-16 ***
GROUP        2 0.62         1.29     10     48 0.2616
Residuals   28
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1




                          The GLM Procedure
                  Multivariate Analysis of Variance
                        E = Error SSCP Matrix
               y1            y2            y3            y4            y5
 y1  0.0353518799   0.035256904  0.0351327804  0.0349749601  0.0347868018
 y2   0.035256904  0.0351627227  0.0350395053  0.0348827098  0.0346956744
 y3  0.0351327804  0.0350395053  0.0349173343  0.0347617352  0.0345760232
 y4  0.0349749601  0.0348827098  0.0347617352  0.0346075203  0.0344233531
 y5  0.0347868018  0.0346956744  0.0345760232  0.0344233531  0.0342409225

Partial Correlation Coefficients from the Error SSCP Matrix / Prob > |r|
  DF = 28        y1         y2         y3         y4         y5
  y1           1.00       0.92       0.67       0.21   0.999852
                        <.0001     <.0001     <.0001     <.0001
  y2           0.92       1.00       0.91       0.63       0.11
             <.0001                <.0001     <.0001     <.0001
  y3           0.67       0.91       1.00       0.90       0.58
             <.0001     <.0001                <.0001     <.0001
  y4           0.21       0.63       0.90       1.00       0.89
             <.0001     <.0001     <.0001                <.0001
  y5       0.999852       0.11       0.58       0.89       1.00
             <.0001     <.0001     <.0001     <.0001

The SAS System 10:33 Wednesday, August 13, 2008   8
                          The GLM Procedure
                  Multivariate Analysis of Variance
                  H = Type III SSCP Matrix for group
               y1            y2            y3            y4            y5
 y1  0.0023822408   0.002365848  0.0023471328  0.0023261249  0.0023030993
 y2   0.002365848  0.0023495679  0.0023309816  0.0023101183  0.0022872511
 y3  0.0023471328  0.0023309816  0.0023125426  0.0022918453  0.0022691608
 y4  0.0023261249  0.0023101183  0.0022918453  0.0022713359  0.0022488593
 y5  0.0023030993  0.0022872511  0.0022691608  0.0022488593  0.0022266141

 Characteristic Roots and Vectors of: E Inverse * H, where
              H = Type III SSCP Matrix for group
                    E = Error SSCP Matrix
Characteristic                     Characteristic Vector  V'EV=1
          Root  Percent          y1          y2          y3          y4          y5
    0.41840103    71.72   -7542.628   17131.814    5347.394  -31627.317   16700.100
    0.16496011    28.28   -4180.854   -4413.446   32096.035  -35545.204   12040.697
    0.0001         0.00  -41004.875  107291.004  -95905.664   32641.189   -3028.470
    0.             0.00    -416.226    -111.206     410.721     295.193    -171.953
    0.             0.00  -14678.651    5787.997   54718.250  -69055.249   23218.580

   MANOVA Test Criteria and F Approximations for the Hypothesis of No Overall group Effect
                    H = Type III SSCP Matrix for group
                          E = Error SSCP Matrix
                           S=2    M=1    N=11
   Statistic                     Value  F Value  Num DF  Den DF  Pr > F
   Wilks' Lambda            0.60518744     1.37      10      48  0.2227
   Pillai's Trace           0.43658228     1.40      10      50  0.2095
   Hotelling-Lawley Trace   0.58336114     1.37      10  33.362  0.2385
   Roy's Greatest Root      0.41840103     2.09       5      25  0.1000










On Wed, Aug 13, 2008 at 4:34 AM, Peter Dalgaard
[EMAIL PROTECTED] wrote:
  

Pedro Mardones wrote:


Dear R-users;


Re: [R] which(df$name==A) takes ~1 second! (df is very large), but can it be speeded up?

2008-08-13 Thread jim holtman
split is probably what you are after.  Here is an example:

 n <- 2700000
 x <- data.frame(name=sample(1:6000,n,TRUE), value=runif(n))
 # split it into 6000 lists
 system.time(y <- split(x$value, x$name))
   user  system elapsed
   0.80    0.20    1.07
 str(y[1:10])
List of 10
 $ 1 : num [1:454] 0.270 0.380 0.238 0.048 0.715 ...
 $ 2 : num [1:440] 0.769 0.822 0.832 0.527 0.808 ...
 $ 3 : num [1:444] 0.626 0.324 0.918 0.916 0.743 ...
 $ 4 : num [1:455] 0.341 0.482 0.134 0.237 0.324 ...
 $ 5 : num [1:430] 0.610 0.217 0.245 0.716 0.600 ...
 $ 6 : num [1:443] 0.460 0.335 0.503 0.798 0.181 ...
 $ 7 : num [1:424] 0.4417 0.4759 0.7436 0.0863 0.1770 ...
 $ 8 : num [1:480] 0.0712 0.6774 0.2995 0.8378 0.1902 ...
 $ 9 : num [1:431] 0.892 0.836 0.397 0.612 0.395 ...
 $ 10: num [1:448] 0.984 0.601 0.793 0.363 0.898 ...

 Takes less than 1 second to split into 6000 lists.
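
 Once the data are split, each group can be pulled out by name, which is roughly the hash-like lookup asked about earlier in the thread. A smaller sketch along the same lines (the sizes, the seed and the name `by_name` are arbitrary choices for illustration):

```r
set.seed(42)
n <- 1e5
x <- data.frame(name = sample(1:600, n, TRUE), value = runif(n))

# One pass over the data; the result is a list named by the levels of 'name'
by_name <- split(x$value, x$name)

# Look up one group by its name (list names are character strings)
grp17 <- by_name[["17"]]

# Same values, in the same order, as the slow per-level subsetting
same <- identical(grp17, x$value[x$name == 17])
```

 The point of the design is that the expensive scan over the 'name' column happens once, rather than once per level.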


Re: [R] which(df$name==A) takes ~1 second! (df is very large), but can it be speeded up?

2008-08-13 Thread jim holtman
If you want the index, then use:

 system.time(y <- split(seq(nrow(x)), x$name))
   user  system elapsed
   0.81    0.06    0.88
 str(y[1:10])
List of 10
 $ 1 : int [1:454] 6924 17503 26880 39197 42881 50835 57896 62624 65767 75359 ...
 $ 2 : int [1:440] 9954 25619 25761 33776 56651 60372 61042 63134 64414 64491 ...
 $ 3 : int [1:444] 5413 6831 15780 21652 29423 37000 38661 60977 72267 74839 ...
 $ 4 : int [1:455] 23859 24748 27221 34886 40538 41326 45065 79769 81783 83951 ...
 $ 5 : int [1:430] 2572 3514 9934 24969 33844 35409 38122 38161 40113 45593 ...
 $ 6 : int [1:443] 7145 25184 26348 31182 39965 44191 49114 52791 69855 74272 ...
 $ 7 : int [1:424] 4596 11762 24949 30324 57906 59043 64833 70769 88878 90594 ...
 $ 8 : int [1:480] 14809 17604 18958 28436 31449 45339 51829 57725 65243 73260 ...
 $ 9 : int [1:431] 10748 14579 27153 27685 31930 32593 34605 35680 35828 50490 ...
 $ 10: int [1:448] 5292 13049 21132 22673 22983 28324 40099 43709 55505 70957 ...





Re: [R] ignoring zeros or converting to NA

2008-08-13 Thread S Ellison
The help page on binary operators (see ?"==") confirms that == makes no
allowance for the inexact binary representation of fractional values, and
points to all.equal as a more suitable test in those cases.

Steve E

 Thomas Lumley [EMAIL PROTECTED] 13/08/2008 16:47 

Integers (up to a fairly high limit) are represented exactly, as are
fractions whose denominator is a power of two (again up to a fairly high
limit), so x==0 is fine in that sense.

If x is computed by floating point operations you do have to worry
whether these are exact, eg, with
   x <- seq(-1, 1, length=7)
it is not clear that the fourth element will be exactly zero.
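
A short sketch of the distinction, using made-up values. It contrasts exact comparison with all.equal() and then converts exact zeros in a small matrix to NA; the matrix `m` is hypothetical and merely stands in for the one in the original question.

```r
# A computed value that should be zero, but may carry rounding error
x <- seq(-1, 1, length.out = 7)
near_zero <- isTRUE(all.equal(x[4], 0))   # tolerant comparison

# The classic decimal-fraction case
exact_sum <- (0.1 + 0.2 == 0.3)
close_sum <- isTRUE(all.equal(0.1 + 0.2, 0.3))

# Values stored as exact zeros can safely be tested with ==
m <- matrix(c(0, 2, NA, 4), nrow = 2)
m[which(m == 0)] <- NA    # which() drops the NA comparisons
inv <- 1 / m              # no Inf: the zero is now NA
```

Using which() keeps the NA entries of the comparison out of the subscript, so existing NAs are left untouched.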

 -thomas


On Wed, 13 Aug 2008, Roland Rau wrote:

 Hi,

 since many suggestions are following the form of
 x[x==0] (or similar)
 I would like to ask if this is really recommended?
 What I have learned (the hard way) is that one should not test for equality
 of floating point numbers (which is the default for R's numeric values,
 right?) since the binary representation of these (decimal) floating point
 numbers is not necessarily exact (with the classic example of decimal 0.1).
 Is it okay in this case for the value zero where all binary elements are
 zero? Or does R somehow recognize that it is an integer?

 Just some questions out of curiosity.

 Thank you,
 Roland


 rcoder wrote:
 Hi everyone,
 
 I have a matrix that has a combination of zeros and NAs. When I perform
 certain calculations on the matrix, the zeros generate Inf values. Is
 there a way to either convert the zeros in the matrix to NAs, or only
 perform the calculations if not zero (i.e. like using something similar to
 an !all(is.na()) construct)?
 
 Thanks,
 
 rcoder



Thomas Lumley   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]   University of Washington, Seattle



[R] tcl/tk example in batch

2008-08-13 Thread David Katz

The example for learning tcl/tk under R at
http://bioinf.wehi.edu.au/~wettenhall/RTclTkExamples/OKtoplevel.html
suggests running it from batch - but when I do, the window flashes by and
the example ends. I'm under XP pro. Is there a workaround? Should I create a
modal window instead so it persists? Thanks.


[R] Arguments to lm() within a function - object not found

2008-08-13 Thread Pete Berlin
Hi all,

I'm having some difficulty passing arguments into lm() from within a
function, and I was hoping someone wiser in the ways of R could tell me
what I'm doing wrong. I have the following:

lmwrap <- function(...) {

  wts <- somefunction()
  print(wts) # This works, wts has the values I expect
  fit <- lm(weights=wts,...)

  return(fit)
}

If I call my function lmwrap, I get the following error:

 lmwrap(a~b)
Error in eval(expr, envir, enclos) : object "wts" not found

A traceback gives me the following:

8: eval(expr, envir, enclos)
7: eval(extras, data, env)
6: model.frame.default(formula = ..1, weights = wts, drop.unused.levels =
TRUE)
5: model.frame(formula = ..1, weights = wts, drop.unused.levels = TRUE)
4: eval(expr, envir, enclos)
3: eval(mf, parent.frame())
2: lm(weights = wts, ...)
1: lmwrap(a ~ b)

It seems like whatever environment lm is trying to eval wts in doesn't
have it defined.

Could anyone tell me what I'm doing wrong?

As a sidenote, I do have a workaround, but this strikes me as really the
wrong thing to do. I replace the call to lm with:
eval(substitute(lm(weights = dummy,...),list(dummy=wts)))
which works.
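
For what it is worth, lm() looks up `weights` in `data` and then in the environment of the formula, not in the function that happens to call lm(), which would explain the error. One possible alternative to the substitute() workaround is to carry the weights inside the data frame. The sketch below assumes the wrapper receives the formula and data explicitly; the column name `.wts` and the weights themselves are made up for illustration.

```r
lmwrap <- function(formula, data, ...) {
  # Hypothetical weights; stands in for somefunction()
  wts <- rep(c(1, 2), length.out = nrow(data))

  # Store the weights where lm() will look for them: in the data
  data$.wts <- wts
  lm(formula, data = data, weights = .wts, ...)
}

set.seed(1)
d <- data.frame(b = 1:10)
d$a <- 2 * d$b + rnorm(10)

fit <- lmwrap(a ~ b, d)
used_weights <- as.numeric(weights(fit))
```

One caveat: a formula that uses `.` on the right-hand side would pick up the extra column as a predictor, so this sketch only suits formulas that name their terms.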

Thanks
Pete



[R] reverse orientation of text in plot margins

2008-08-13 Thread Karel Van den Meersche

Dear R users,

I am trying to reverse the orientation of axis labels and title in the right 
margin of a plot, so that they read from top to bottom. I know that this can be 
done using text() as follows: 

par(mar=c(5,4,4,4)+.1)
plot(1:4,las=0)
par(new=T)
y <- rnorm(4)
plot(y,axes=FALSE,ann=FALSE,pch=17)
axis(4,labels=FALSE)
par(xpd=TRUE)
text(x=par("usr")[2]+.25,y=axTicks(4),labels=axTicks(4),srt=-90)
text(x=par("usr")[2]+.5,y=sum(par("usr")[3:4])/2,labels="titel",srt=-90)
par(xpd=FALSE)

The problem is that I have to manually reset the x and y coordinates of the
text whenever the plot is resized. This is problematic if I want to automate
the production of a number of plots (or produce different output formats), or
make sure that the labels and title on the right axis are at the same
distance from the plot as the labels and title on the left axis. For now I can
only judge it by eye.

mtext() allows me to set the distance, but not to reverse the orientation of
the text.

I could use text() to produce the left axis as well, so that the labels on
both sides are at exactly the same distance from the plot, but then I would
need to determine the plot margins relative to the plot dimensions.

Does anyone see a solution to my problem that doesn't involve trial and error 
for the x coordinate?

thanks
Karel





[R] conditional IF with AND

2008-08-13 Thread rcoder

Hi everyone,

I'm trying to create an if conditional statement with two conditions,
whereby the statement is true when condition 1 AND condition 2 are met:

code structure:
if ?AND? (a[x,y] condition1, a[x,y] condition2)

I've trawled through the help files, but I cannot find an example of the
syntax for incorporating an AND in a conditional IF statement.

Thanks,

rcoder


Re: [R] dixon test

2008-08-13 Thread giov

Thank you so much. I don't have much experience with outliers =), and I
thought there were nonparametric, distribution-free outlier tests =(. What is
the most general distribution I can use? I made histograms of my data set;
sometimes the data look normally distributed, sometimes uniformly
distributed. So I cannot tell which distribution I can use for my whole
data set.



S Ellison wrote:
 
 
 
 giov [EMAIL PROTECTED] 13/08/2008 10:59:32 
 
 just a question...I don't know what is the distribution of my data
 (normal, T, etc...). So, how can I set the type parameter?
 
 You must assume an underlying distribution or you can't do an outlier
 test.
 
 Outliers are just unusually extreme data points. They can only be
 considered 'unusual' if there is some basis - a distribution assumption
 - for deciding what is 'usual'.  The assumed underlying distribution
 describes what is expected to be 'usual'. 
 
 With no distribution assumption, there is no basis for considering any
 data point unusual, so the idea of an outlier really has no meaning. 
 
 Steve E
 
 
 
 
 
 



Re: [R] subsetting matrix according to columns with character index

2008-08-13 Thread Henrique Dallazuanna
Try this:

x
  V1 V2 V3
1 a1 c1  1
2 a1 c1  2
3 a2 c1  1
4 a1 c2  1
5 a1 c2  2


lis <- split(x, list(x$V1, x$V2), drop = TRUE)
do.call(rbind, unname(lis[sapply(lis, function(x)all(1:2 %in% x[,3]))]))
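
A self-contained variant of the same idea, for anyone who wants to run it directly. Instead of splitting and re-binding the data frame it builds a logical row filter with tapply(); the names `key`, `ok` and `keep` are arbitrary, and the data frame simply copies the five rows shown above.

```r
x <- data.frame(V1 = c("a1", "a1", "a2", "a1", "a1"),
                V2 = c("c1", "c1", "c1", "c2", "c2"),
                V3 = c(1, 2, 1, 1, 2))

# One key per (V1, V2) combination
key <- paste(x$V1, x$V2)

# For each combination: does V3 contain both 1 and 2?
ok <- tapply(x$V3, key, function(z) all(1:2 %in% z))

# Map the per-group answer back onto the rows and filter
keep <- as.vector(ok[key])
res  <- x[keep, ]
```

This keeps the original row order and row names, which the split()/rbind() route does not.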

On Wed, Aug 13, 2008 at 3:00 PM, Ralph S. [EMAIL PROTECTED] wrote:

  Hi,

 I have a long matrix of the following form which I would like to subset 
 according to the third column:

 [x y z]:

 a1 c1 1
 a1 c1 2
 a2 c1 1
 a1 c2 1
 a1 c2 2
 . . .


 The first two columns are character values ai and cj.

 I would like to keep all the rows where there are two entries for z, 1 and 2.

 That is, I want:
 a1 c1 1
 a1 c1 2
 a1 c2 1
 a1 c2 2
 . . .

 I try to use something like df[by(df,c(df$x,df$y),sum(z)==3),] but that only 
 gives me one line of data per x y combination.

 Is there an easy way of coding to keep all rows for a and c combinations 
 where z has entries both 1 and 2?

 Many thanks,

 Ralph





-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O



Re: [R] conditional IF with AND

2008-08-13 Thread Erik Iverson

if (cond1 && cond2) {
  ...
}
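
A slightly fuller sketch with a made-up matrix `a` and indices `x`, `y`, showing `&&` for a single test inside if() and the element-wise `&` for whole vectors or matrices:

```r
a <- matrix(c(5, -2, 8, 12), nrow = 2)
x <- 1
y <- 2

# && compares single values and stops early if the first test fails
if (a[x, y] > 0 && a[x, y] < 10) {
  msg <- "both conditions hold"
} else {
  msg <- "at least one condition fails"
}

# & works element by element and returns a logical matrix
inside <- a > 0 & a < 10
```

Use `&&` in if() conditions and `&` when building a logical index such as a[inside].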


rcoder wrote:

Hi everyone,

I'm trying to create an if conditional statement with two conditions,
whereby the statement is true when condition 1 AND condition 2 are met:

code structure:
if ?AND? (a[x,y] condition1, a[x,y] condition2)

I've trawled through the help files, but I cannot find an example of the
syntax for incorporating an AND in a conditional IF statement.

Thanks,

rcoder





Re: [R] conditional IF with AND

2008-08-13 Thread Henrique Dallazuanna
See:

?`&&`


On Wed, Aug 13, 2008 at 1:45 PM, rcoder [EMAIL PROTECTED] wrote:

 Hi everyone,

 I'm trying to create an if conditional statement with two conditions,
 whereby the statement is true when condition 1 AND condition 2 are met:

 code structure:
 if ?AND? (a[x,y] condition1, a[x,y] condition2)

 I've trawled through the help files, but I cannot find an example of the
 syntax for incorporating an AND in a conditional IF statement.

 Thanks,

 rcoder




-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O



[R] How to test for a random effect in a repeated measures analysis using anova.mlm ?

2008-08-13 Thread marie-lou.lefrancois
Dear “R” masters,

I am trying to conduct an ANOVA with repeated measures using the command
anova.mlm for data structured according to a Randomized Block Design.
I would like to account for a random effect but cannot find a way to
incorporate it in the analysis.

NB. I tried using the argument “M” to define the outer projection
(block), I get the message that length differs.

“response” is the dependent variable (3 years of heights measurements,
merged with cbind).

“estab” is a factor with 3 levels (whether trees were planted, seeded
or naturally established).

I would like to include “block” as a random effect. I would like to
keep the structure of the response variable (so I don’t get an output
with a test for each year: this is what happens when I use “lme” or
“aov”).

This is the code I am using:
First, fit linear models:
estabfit <- lm(response ~ estab)

timefit <- lm(response ~ 1)

Then test the effect of the factor “estab”:
anova.mlm(timefit, estabfit, M = ~1)

How do I integrate “block”?

I was inspired by: http://tolstoy.newcastle.edu.au/R/help/05/11/15744.html

Thank you so much for your help!
Cheers
Marie-lou Lefrancois



Re: [R] replacing default labels in lattice

2008-08-13 Thread Henrique Dallazuanna
You can see the source code of demo script:

file.show(system.file("demo/labels.R", package = "lattice"))
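
A possible way to get superscripted tick labels without calling panel.axis inside the panel function is the 'scales' argument of xyplot, which accepts a vector of expressions as labels. The following is only a minimal sketch: the data frame d, its columns and the three factor levels are made up for illustration and are not the poster's meanAG data.

```r
library(lattice)

# Hypothetical stand-in data: residuals for three distance measures.
set.seed(1)
d <- data.frame(
  distance = factor(rep(c("bc1", "bc2", "canberra"), each = 10)),
  res      = rnorm(30)
)

# One label per factor level, in level order; '^' gives a superscript.
labs <- expression(Bray-Curtis^1, Bray-Curtis^2, Canberra)

p <- xyplot(res ~ distance, data = d,
            xlab = "Distance", ylab = "Residuals",
            scales = list(x = list(at = 1:3, labels = labs)))
print(p)  # draws the plot on the current device
```

Because the labels are supplied through 'scales', lattice draws them outside the plotting region in the usual axis position, and all of them are used rather than only the first.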

On Wed, Aug 13, 2008 at 11:20 AM, Andrewjohnclose [EMAIL PROTECTED] wrote:

 Dear all,

 I am having a little trouble deciphering how to change the default x-axis
 labels in a lattice xyplot (or any type of lattice plot for that matter). I
 have tried using the demo(labels) function but the code is truncated at
 precisely the wrong moment!

 All I am trying to do is to add superscript to two of the labels for which i
 tried using the expression function. It partly works, but it prints only the
 first replacement label inside the plotting region and forgets the
 rest...what am I missing?

 Thank you

 xyplot(resid(mod4)~factor(distance),aspect=1.0,cex=1.0,xlab="Distance",ylab="Residuals",data=meanAG,
 span=1,
 panel=function(x,y,span){
 panel.grid(h=0, v=-1)
 panel.xyplot(x,y,cex=1.0,points=jitter)
 panel.loess(x,y, span)
 panel.axis(side="bottom",at=TRUE,
 labels=c(expression(Bray-Curtis^{1}),expression(Bray-Curtis^{2}),expression(Canberra),expression(Gower),expression(Hellinger),expression(Kulczynski)))
 })
 http://www.nabble.com/file/p18964008/meanAG.csv meanAG.csv




-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O



Re: [R] ignoring zeros or converting to NA

2008-08-13 Thread Henrik Bengtsson
FYI,

there is an isZero() in the R.utils package that allows you to specify
the precision.  It looks like this:

isZero <- function (x, neps=1, eps=.Machine$double.eps, ...) {
  (abs(x) < neps*eps);
}
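
To illustrate the point about exact comparison, here is a small sketch using a local copy of the function above (so R.utils need not be installed). The matrix m and the choice neps = 10 are made up for the example. An exactly stored zero compares equal to 0, but a value that is zero only "on paper", such as 0.1 + 0.2 - 0.3, generally does not.

```r
# Local copy of the tolerance test quoted above.
isZero <- function(x, neps = 1, eps = .Machine$double.eps, ...) {
  abs(x) < neps * eps
}

# Hypothetical matrix with exact zeros, an NA, and a computed near-zero.
m <- matrix(c(0, 0.5, NA, 0.1 + 0.2 - 0.3, 2, 0), nrow = 2)

exact  <- m == 0                # misses the computed near-zero entry
approx <- isZero(m, neps = 10)  # catches it

# Convert (near-)zeros to NA; which() skips the entries that are already NA.
m[which(approx)] <- NA
1 / m                           # no Inf values any more
```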

/Henrik

On Wed, Aug 13, 2008 at 8:23 AM, Roland Rau [EMAIL PROTECTED] wrote:
 Hi,

 since many suggestions are following the form of
 x[x==0] (or similar)
 I would like to ask if this is really recommended?
 What I have learned (the hard way) is that one should not test for equality
 of floating point numbers (which is the default for R's numeric values,
 right?) since the binary representation of these (decimal) floating point
 numbers is not necessarily exact (with the classic example of decimal 0.1).
 Is it okay in this case for the value zero where all binary elements are
 zero? Or does R somehow recognize that it is an integer?

 Just some questions out of curiosity.

 Thank you,
 Roland


 rcoder wrote:

 Hi everyone,

 I have a matrix that has a combination of zeros and NAs. When I perform
 certain calculations on the matrix, the zeros generate Inf values. Is
 there a way to either convert the zeros in the matrix to NAs, or only
 perform the calculations if not zero (i.e. like using something similar to
 an !all(is.na() construct)?

 Thanks,

 rcoder





Re: [R] conditional IF with AND

2008-08-13 Thread Ted Harding
On 13-Aug-08 16:45:27, rcoder wrote:
 Hi everyone,
 I'm trying to create an if conditional statement with two conditions,
 whereby the statement is true when condition 1 AND condition 2 are met:
 
 code structure:
 if ?AND? (a[x,y] condition1, a[x,y] condition2)
 
 I've trawled through the help files, but I cannot find an example of
 the syntax for incorporating an AND in a conditional IF statement.
 
 Thanks,
 rcoder

The basic structure of an 'if' statement (from ?"if" -- don't
forget the quotes ".." for certain keywords such as "if") is:

  if(cond) expr

What is not explained in the ?"if" help is that 'cond' may
be any expression that evaluates to a logical TRUE or FALSE.

Hence you can build 'cond' to suit your purpose. Therefore:

  if( (condition 1 on a[x,y]) && (condition 2 on a[x,y]) ) {
    whatever you want to do if (cond1 AND cond2 ) is TRUE
  }

Example:

  if( (a[x,y] > 1.0) && (a[x,y] < 2.0) ){
    print("Between 1 and 2")
  }
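
A self-contained version of the example, as a sketch with a made-up matrix a and indices x and y. It also shows the difference between '&&' (one TRUE/FALSE, what 'if' needs) and '&' (element-wise, useful for testing a whole matrix at once).

```r
# Hypothetical data for illustration.
a <- matrix(c(0.5, 1.5, 2.5, 1.2), nrow = 2)
x <- 2
y <- 1

# '&&' combines two single logical values, as required by if().
if ((a[x, y] > 1.0) && (a[x, y] < 2.0)) {
  msg <- "Between 1 and 2"
} else {
  msg <- "Outside"
}
print(msg)

# '&' works element by element and returns a logical matrix.
in_range <- (a > 1.0) & (a < 2.0)
in_range
```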

Hoping this helps,
Ted.


E-Mail: (Ted Harding) [EMAIL PROTECTED]
Fax-to-email: +44 (0)870 094 0861
Date: 13-Aug-08   Time: 19:33:53
-- XFMail --



Re: [R] which alternative tests instead of AIC/BIC for choosing models

2008-08-13 Thread tolga . i . uzuner
Many thanks John, appreciate the advice,
Tolga




John C Frain [EMAIL PROTECTED] 
13/08/2008 18:51

To
[EMAIL PROTECTED]
cc
r-help@r-project.org
Subject
Re: [R] which alternative tests instead of AIC/BIC for choosing models






My initial idea would be to forget about AIC and BIC, ask the question
what would one expect to get in the regression and then regress y on
x1 and x2 and use a simple t-test to determine what should be
included.  Remember that omitted variables will bias your coefficients
but if you include redundant variables your results will remain
consistent.  I presume that you do not have any problems with
non-stationary variables.
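
The suggestion above can be sketched as follows. The data are simulated purely for illustration (the sample size and coefficients are assumptions, not the poster's data). The t-test on x1 in the full model and the F-test comparing the nested models reg2 and reg3 are the same test when a single term is added.

```r
# Simulated data: x2 explains most of y, x1 only a little.
set.seed(42)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 0.1 * x1 + 3 * x2 + rnorm(n)

reg2 <- lm(y ~ x2)
reg3 <- lm(y ~ x1 + x2)

coef(summary(reg3))        # t-statistic and p-value for x1
cmp <- anova(reg2, reg3)   # F-test: does adding x1 improve on reg2?
cmp

# R^2 of both fits side by side.
c(reg2 = summary(reg2)$r.squared, reg3 = summary(reg3)$r.squared)
```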

Best Regards

John

2008/8/13  [EMAIL PROTECTED]:
 Dear R Users,

 I am looking for an alternative to AIC or BIC to choose model parameters.
 This is somewhat of a general statistics question, but I ask it in this
 forum as I am looking for a R solution.

 Suppose I have one dependent variable, y, and two independent variables,
 x1 and x2.

 I can perform three regressions:
 reg1: y~x1
 reg2: y~x2
 reg3: y~x1+x2

 The AIC of reg1 is 2000, reg2 is 1000 and reg3 is 950. One would,
 presumably, conclude that one should use both x1 and x2.  However, the
 R^2's are quite different: R^2 of reg1 is 0.5%, reg2 is 95% and reg3 is
 95.25%. Knowing that, I would actually conclude that x1 adds little and
 should probably not be used.

 There is the overall question of what potentially explains this outcome,
 i.e. the reduction in AIC in going from reg2 to reg3 even though R^2 does
 not materially improve with the addition of x1 to reg2 (to get to reg3).
 But that is more of a generic statistics issue and not my question here.

 The question I do have is: is there a package in R which implements a test
 and provides some diagnostic information I can use to rule out the use of
 x1 in a systematic way, as its addition to the equation adds little in
 terms of explaining the variability of y?

 Thanks in advance,
 Tolga

 Generally, this communication is for informational purposes only
 and it is not intended as an offer or solicitation for the purchase
 or sale of any financial instrument or as an official confirmation
 of any transaction. In the event you are receiving the offering
 materials attached below related to your interest in hedge funds or
 private equity, this communication may be intended as an offer or
 solicitation for the purchase or sale of such fund(s).  All market
 prices, data and other information are not warranted as to
 completeness or accuracy and are subject to change without notice.
 Any comments or statements made herein do not necessarily reflect
 those of JPMorgan Chase & Co., its subsidiaries and affiliates.

 This transmission may contain information that is privileged,
 confidential, legally privileged, and/or exempt from disclosure
 under applicable law. If you are not the intended recipient, you
 are hereby notified that any disclosure, copying, distribution, or
 use of the information contained herein (including any reliance
 thereon) is STRICTLY PROHIBITED. Although this transmission and any
 attachments are believed to be free of any virus or other defect
 that might affect any computer system into which it is received and
 opened, it is the responsibility of the recipient to ensure that it
 is virus free and no responsibility is accepted by JPMorgan Chase &
 Co., its subsidiaries and affiliates, as applicable, for any loss
 or damage arising in any way from its use. If you received this
 transmission in error, please immediately contact the sender and
 destroy the material in its entirety, whether in electronic or hard
 copy format. Thank you.
 Please refer to http://www.jpmorgan.com/pages/disclosures for
 disclosures relating to UK legal entities.




-- 
John C Frain
Trinity College Dublin
Dublin 2
Ireland
www.tcd.ie/Economics/staff/frainj/home.html
mailto:[EMAIL PROTECTED]
mailto:[EMAIL PROTECTED]




Re: [R] Arguments to lm() within a function - object not found

2008-08-13 Thread Prof Brian Ripley

On Wed, 13 Aug 2008, Pete Berlin wrote:


Hi all,

I'm having some difficulty passing arguments into lm() from within a
function, and I was hoping someone wiser in the ways of R could tell me
what I'm doing wrong. I have the following:

lmwrap <- function(...) {

 wts <- somefunction()
 print(wts) # This works, wts has the values I expect
 fit <- lm(weights=wts,...)

 return(fit)
}

If I call my function lmwrap, I get the following error:


lmwrap(a~b)

Error in eval(expr, envir, enclos) : object "wts" not found


Correct.  The help (?lm) says

 All of 'weights', 'subset' and 'offset' are evaluated in the same
 way as variables in 'formula', that is first in 'data' and then in
 the environment of 'formula'.




A traceback gives me the following:

8: eval(expr, envir, enclos)
7: eval(extras, data, env)
6: model.frame.default(formula = ..1, weights = wts, drop.unused.levels =
TRUE)
5: model.frame(formula = ..1, weights = wts, drop.unused.levels = TRUE)
4: eval(expr, envir, enclos)
3: eval(mf, parent.frame())
2: lm(weights = wts, ...)
1: wraplm(a ~ b)

It seems like whatever environment lm is trying to eval wts in doesn't
have it defined.

Could anyone tell me what I'm doing wrong?

As a sidenote, I do have a workaround, but this strikes me as really the
wrong thing to do. I replace the call to lm with:
eval(substitute(lm(weights = dummy,...),list(dummy=wts)))
which works.


It's one workaround, but working with the scoping rules is better.  Hint: 
use the 'data' argument to lm.
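
One way to follow the hint, as a sketch only: put the weights into the data frame that is passed as 'data', so that lm() finds them where it looks first. The function make_weights() is a hypothetical stand-in for the poster's somefunction(), and the column name .wts and the data frame d are likewise made up for the example.

```r
# Hypothetical stand-in for somefunction(): one weight per row.
make_weights <- function(n) seq_len(n)

lmwrap <- function(formula, data, ...) {
  # Store the weights as a column of 'data'; lm() evaluates 'weights'
  # in 'data' first, so it no longer depends on the formula's environment.
  data$.wts <- make_weights(nrow(data))
  lm(formula, data = data, weights = .wts, ...)
}

d   <- data.frame(a = c(1, 2, 3, 4, 5), b = c(2, 1, 4, 3, 6))
fit <- lmwrap(a ~ b, data = d)
weights(fit)  # the weights that were actually used
```

The wrapper now takes 'formula' and 'data' explicitly instead of passing everything through '...', which is a change from the original lmwrap but keeps the evaluation rules simple.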



--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



[R] need help with stat functions(like adaboost, random forests and glm)

2008-08-13 Thread Paul Fisch
Ok, so basically I have a dataframe named data_frame

data_frame contains:
startdate
startprice
endpricethreshold1
endpricethreshold2
endpricethreshold3



all of these endpricethresholds are true/false binary vectors.  They are
true or false depending on whether the endprice was above or below whatever
the endpricethreshold is.

Now I want to use, let's say, a generalized linear model to try
and predict which endprice thresholds will be true or false depending on
startdate and startprice.  So I have a call like:

glm(endpricethreshold1 ~ ., data=data_frame[,c(1,2,3)],
family=binomial(logit));

but for the response endpricethreshold1 (since I really have tons of
endpricethresholds and would like to make this a loop) I don't want to refer
to it by its name but instead by its column index, like this:

glm(data_frame[[3]] ~ ., data=data_frame[,c(1,2,3)],
family=binomial(logit));

However, when I do this I am getting completely different results and I have
no idea why.
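
A likely explanation, offered without having seen the data: with data_frame[[3]] on the left-hand side, the '.' on the right expands to every column of 'data', and because the response no longer matches a column name, endpricethreshold1 itself ends up among the predictors. A sketch of one way around this is to build each formula from the column name. The simulated data_frame below, and the names predictors and response_cols, are made up for illustration.

```r
# Simulated stand-in for the poster's data.
set.seed(1)
data_frame <- data.frame(startdate  = 1:40,
                         startprice = rnorm(40, mean = 100, sd = 10))
data_frame$endpricethreshold1 <-
  data_frame$startprice + rnorm(40, sd = 10) > 100
data_frame$endpricethreshold2 <-
  data_frame$startprice + rnorm(40, sd = 10) > 105

predictors    <- c("startdate", "startprice")
response_cols <- 3:4   # column indices of the thresholds to model

fits <- lapply(response_cols, function(j) {
  # e.g. endpricethreshold1 ~ startdate + startprice
  f <- reformulate(predictors, response = names(data_frame)[j])
  glm(f, data = data_frame, family = binomial(logit))
})
names(fits) <- names(data_frame)[response_cols]

lapply(fits, coef)
```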

If anyone could help it would be greatly appreciated.



Thanks,
Paul Fisch



Re: [R] subsetting matrix according to columns with character index

2008-08-13 Thread Ralph S.

I tried this - I get an empty set:

0 rows (or 0-length row.names)

I guess this happens because the z variable takes only one value per row??

What works is:
DFsub <- DF[DF$z == 1 | DF$z == 2,]

but then, I do not eliminate the entries where there is only one entry for z 
given an a and c combination.

Any idea what to do?

-Ralph

 Date: Wed, 13 Aug 2008 13:05:25 -0500
 From: [EMAIL PROTECTED]
 Subject: RE: [R] subsetting matrix according to columns with character index
 To: [EMAIL PROTECTED]
 
   it must be a dataframe so, if it was DF, then, assuming i understand 
 what you want then either of the following should work:
 
 DFsub <- DF[DF$z == 1 & DF$z == 2,]
 
 or
 
 DFsub <- subset(DF, z == 1 & z == 2 )
 
 
 On Wed, Aug 13, 2008 at  2:00 PM, Ralph S. wrote:
 
  Hi,
 
  I have a long matrix of the following form which I would like to 
  subset according to the third column:
 
  [x y z]:
 
  a1 c1 1
  a1 c1 2
  a2 c1 1
  a1 c2 1
  a1 c2 2
  . . .
 
 
  The first two columns are characters ai and cj.
 
  I would like to keep all the rows where there are two entries for z, 1 
  and 2.
 
  That is, I want:
  a1 c1 1
  a1 c1 2
  a1 c2 1
  a1 c2 2
  . . .
 
  I try to use something like df[by(df,c(df$x,df$y),sum(z)==3),] but 
  that only gives me one line of data per x y combination.
 
  Is there an easy way of coding to keep all rows for a and c 
  combinations where z has entries both 1 and 2?
  Many thanks,
 
  Ralph
 


Re: [R] subsetting matrix according to columns with character index

2008-08-13 Thread markleeds

  i don't think i understood what you were trying to do, at least based 
on Henrique's solution, which I haven't cut and pasted yet in order
to understand it. Did Henrique's solution do what you wanted?



Re: [R] which alternative tests instead of AIC/BIC for choosing models

2008-08-13 Thread tolga . i . uzuner
By way of partial follow-up to my own question, and on the odd chance 
anyone else wonders about this issue, some alternatives to this appear to 
be in the leaps package, which implements the leaps routine (Mallows Cp) 
and regsubsets. In my case Mallows' Cp does not work either (see below), 
so I have implemented the following.

# regr holds a zoo object with the 1st column being the dependent variable

r2test <- (result$lm.r2 > Rsqr) &
  (all(unlist(lapply(2:(dim(regr)[2]), function(i)
    summary(lm(regr[,1] ~ regr[,i]))$adj.r.squared)) > 0.1)) &
  which.min(leaps(as.matrix(regr[,-1]), regr[,1])$Cp) == dim(regr)[2]

leaps on the same problem below
===

> leaps(as.matrix(regr3[,-1]),regr3[,1],method=c("adjr2"))
$which
  1 2
1 FALSE  TRUE
1  TRUE FALSE
2  TRUE  TRUE

$label
[1] "(Intercept)" "1"           "2"

$size
[1] 2 2 3

$adjr2
[1] 0.950757134 0.001681389 0.954859493

> leaps(as.matrix(regr3[,-1]),regr3[,1],method=c("Cp"))
$which
  1 2
1 FALSE  TRUE
1  TRUE FALSE
2  TRUE  TRUE

$label
[1] "(Intercept)" "1"           "2"

$size
[1] 2 2 3

$Cp
[1]   38.53367 8490.55327    3.0

 





Re: [R] subsetting matrix according to columns with character index

2008-08-13 Thread markleeds

sorry ralph. i meant the OR instead of the AND so that was my bad 
mistake. the subset  function should also work with the OR.

i think i understand better what you want now also.  the approach below 
for doing what you want  assumes that , if there are 2 rows associated 
with the
values in the first 2 columns , then they will be 1 and 2. If they are 
1,1 or 2,2, then it won't work. So, henrique's solution could be better 
and more general.

Assume your dataframe is called DF.

# split the data frame itself (not just DF$x) so each piece has rows
tempres <- split(DF, DF$y)

onlytwo <- lapply(tempres, function(.df)
if (nrow(.df) == 2) {
   return(.df) } else {
   return(NULL) }
)

onlytwo <- onlytwo[!sapply(onlytwo, is.null)]

result <- do.call(rbind, onlytwo)
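
For comparison, here is a sketch that groups on both of the first two columns and checks explicitly that z contains both 1 and 2. This is not Henrique's code (which is not shown in this message), only one base-R way to express the same idea; the small DF below reproduces the sample rows from the original question.

```r
# Sample rows from the question.
DF <- data.frame(x = c("a1", "a1", "a2", "a1", "a1"),
                 y = c("c1", "c1", "c1", "c2", "c2"),
                 z = c(1, 2, 1, 1, 2))

# For every (x, y) group, flag all of its rows with 1 if the group's
# z values include both 1 and 2, and with 0 otherwise.
flag <- ave(DF$z, DF$x, DF$y,
            FUN = function(z) all(c(1, 2) %in% z))

DFsub <- DF[flag == 1, ]
DFsub
```

Unlike the row-count check above, this does not assume that a group with two rows must hold the values 1 and 2.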




Re: [R] subsetting matrix according to columns with character index

2008-08-13 Thread markleeds

Ralph: I looked at Henrique's solution and he does 2 things which make 
it better than mine.

1) He splits based off the first two columns where I just split based on 
the second. So, my split assumes that the same rows are next to each 
other
which is an unnecessary assumption.

2) He actually checks to make sure that 1 and 2 are actually in the 
third column of  the resulting dataframes that split returns.  I assumed 
that , if a
dataframe was of length 2, then the latter would be true automatically.

So, even though mine worked for what you needed, in the spirit of 
generality and minimal assumptions, it is better to use Henrique's 
solution. Also, make sure you understand it because you can learn a lot 
from it (this is also true of his solutions in general).


On Wed, Aug 13, 2008 at  3:37 PM, Ralph S. wrote:

yes this works, very elegant, thank you. I didn't get Henrique's message in 
my mailbox immediately for some reason -

-Ralph



Re: [R] merging data sets to match data to date

2008-08-13 Thread rcoder

Dear Henrique,

This is exactly what I need. Thank you very much for your help!

rcoder



Henrique Dallazuanna wrote:
 
 Try this:
 
 x <- data.frame(Dates = seq(as.Date('2008-01-01'),
                             as.Date('2008-01-31'), by = 'days'),
                 Values = sample(31))
 
 subset(x, Dates %in% as.Date(c('2008-01-05', '2008-01-20')))
 
 On 8/13/08, rcoder [EMAIL PROTECTED] wrote:

 Hi everyone,

 I want to extract data from a data set according to dates specified in a
 vector. I have created a blank matrix with row names (dates) that I want
 to
 extract from the full data set. I have then performed a merge to try to
 o/p
 rows corresponding to common dates to a results matrix, but the operation
 did not fill the results matrix. Could anyone offer any advice to assist
 with this operation?

 Thanks,

 rcoder

 
 
 -- 
 Henrique Dallazuanna
 Curitiba-Paraná-Brasil
 25° 25' 40 S 49° 16' 22 O
 
 
 

-- 
View this message in context: 
http://www.nabble.com/merging-data-sets-to-match-data-to-date-tp18962197p18969953.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Arguments to lm() within a function - object not found

2008-08-13 Thread Pete Berlin
Thanks very much for the quick reply. I had looked at the help for lm,
but I clearly skimmed over the critical part explaining where weights is
evaluated.

Thanks,
Pete



On 13/8/2008, Prof Brian Ripley wrote:

On Wed, 13 Aug 2008, Pete Berlin wrote:

 Hi all,

 I'm having some difficulty passing arguments into lm() from within a
 function, and I was hoping someone wiser in the ways of R could tell me
 what I'm doing wrong. I have the following:

 lmwrap <- function(...) {

  wts <- somefunction()
  print(wts) # This works, wts has the values I expect
  fit <- lm(weights=wts,...)

  return(fit)
 }

 If I call my function lmwrap, I get the following error:

 lmwrap(a~b)
 Error in eval(expr, envir, enclos) : object "wts" not found

Correct.  The help (?lm) says

  All of 'weights', 'subset' and 'offset' are evaluated in the same
  way as variables in 'formula', that is first in 'data' and then in
  the environment of 'formula'.



 A traceback gives me the following:

 8: eval(expr, envir, enclos)
 7: eval(extras, data, env)
 6: model.frame.default(formula = ..1, weights = wts, drop.unused.levels =
 TRUE)
 5: model.frame(formula = ..1, weights = wts, drop.unused.levels = TRUE)
 4: eval(expr, envir, enclos)
 3: eval(mf, parent.frame())
 2: lm(weights = wts, ...)
 1: wraplm(a ~ b)

 It seems like whatever environment lm is trying to eval wts in doesn't
 have it defined.

 Could anyone tell me what I'm doing wrong?

 As a sidenote, I do have a workaround, but this strikes me as really the
 wrong thing to do. I replace the call to lm with:
 eval(substitute(lm(weights = dummy,...),list(dummy=wts)))
 which works.

It's one workaround, but working with the scoping rules is better.  Hint:
use the 'data' argument to lm.
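
One way to follow the hint about 'data' is sketched below. This is an assumption about what was meant, not code from the thread: the wrapper signature, the argument name wts and the column name .wts are all made up for illustration. The idea is to store the weights as a column of the data frame, so that lm() finds them in 'data' before it looks in the environment of the formula.

```r
# 'wts' and the column name '.wts' are illustrative, not from the original post
lmwrap <- function(formula, data, wts, ...) {
  data$.wts <- wts                               # keep the weights inside the data frame
  lm(formula, data = data, weights = .wts, ...)  # 'weights' is looked up in 'data' first
}

set.seed(1)
d <- data.frame(a = rnorm(20), b = rnorm(20))
w <- runif(20)

fit <- lmwrap(a ~ b, d, w)
print(coef(fit))
```

The fit should agree with a direct call to lm(a ~ b, data = d, weights = w) made at top level, where w is visible from the formula's environment.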




Re: [R] conditional IF with AND

2008-08-13 Thread rcoder

Thank you all for your replies. This is all very useful information for me!

Ted, thank you very much for the extra explanation and example.

Many thanks,

rcoder



Ted.Harding-2 wrote:
 
 On 13-Aug-08 16:45:27, rcoder wrote:
 Hi everyone,
 I'm trying to create an if conditional statement with two conditions,
 whereby the statement is true when condition 1 AND condition 2 are met:
 
 code structure:
 if ?AND? (a[x,y] condition1, a[x,y] condition2)
 
 I've trawled through the help files, but I cannot find an example of
 the syntax for incorporating an AND in a conditional IF statement.
 
 Thanks,
 rcoder
 
 The basic structure of an 'if' statement (from ?"if" -- don't
 forget the "..." for certain keywords such as if) is:
 
   if(cond) expr
 
 What is not explained in the ?"if" help is that 'cond' may
 be any expression that evaluates to a logical TRUE or FALSE.
 
 Hence you can build 'cond' to suit your purpose. Therefore:
 
   if( (condition 1 on a[x,y]) && (condition 2 on a[x,y]) ) {
     whatever you want to do if (cond1 AND cond2) is TRUE
   }
 
 Example:
 
   if( (a[x,y]>1.0) && (a[x,y]<2.0) ){
     print("Between 1 and 2")
   }
 
 Hoping this helps,
 Ted.
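
As a small runnable illustration of the above, the sketch below uses a made-up 2x2 matrix a and made-up indices x and y. It also contrasts the scalar && (suited to if) with the element-wise &, which is an addition not covered in the reply itself.

```r
a <- matrix(c(0.5, 1.5, 2.5, 1.2), nrow = 2)   # made-up example matrix
x <- 2
y <- 1

# Scalar test inside if(): && expects a single TRUE/FALSE on each side
if (a[x, y] > 1.0 && a[x, y] < 2.0) {
  msg <- "Between 1 and 2"
} else {
  msg <- "Outside"
}
print(msg)

# Element-wise test over the whole matrix: & returns a logical matrix
inside <- a > 1.0 & a < 2.0
print(inside)
```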
 
 
 E-Mail: (Ted Harding) [EMAIL PROTECTED]
 Fax-to-email: +44 (0)870 094 0861
 Date: 13-Aug-08   Time: 19:33:53
 -- XFMail --
 
 
 

-- 
View this message in context: 
http://www.nabble.com/conditional-IF-with-AND-tp18966890p18970101.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] which alternative tests instead of AIC/BIC for choosingmodels

2008-08-13 Thread Daniel Malter
your model 3 is the unrestricted model and your models 1 and 2 are
restricted models. you can test models 1 and 2 against model 3 using the
anova function, e.g. anova(model2,model3), which, for the case of OLS
estimation, compares the two fits with an F-test. if the test is
insignificant, the simpler model should be preferred (and, of course, the
larger model if the test is significant). but if the variable is
theoretically important (e.g. a theoretically important control), then it
should be included regardless of its significance in the estimation for your
specific data.

best,
Daniel
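
A minimal sketch of the comparison described above, on simulated data. The variable names follow the thread, but the data-generating process is invented for illustration; here x1 has no true effect, so this is not a reproduction of the original poster's data.

```r
set.seed(123)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 2 * x2 + rnorm(n)    # simulated response: x1 plays no role

model2 <- lm(y ~ x2)           # restricted model
model3 <- lm(y ~ x1 + x2)      # unrestricted model

# F-test of the restricted model against the unrestricted one
cmp <- anova(model2, model3)
print(cmp)

p_value <- cmp[["Pr(>F)"]][2]  # p-value for adding x1
```

With a single added term, the F statistic equals the square of the t statistic for x1 in summary(model3), so both routes give the same p-value.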

-
cuncta stricte discussurus
-

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On
Behalf Of [EMAIL PROTECTED]
Sent: Wednesday, August 13, 2008 3:19 PM
To: [EMAIL PROTECTED]; r-help@r-project.org
Subject: Re: [R] which alternative tests instead of AIC/BIC for
choosingmodels

By way of partial follow-up to my own question, and on the odd chance anyone
else wonders about this issue, some alternatives to this appear to be in the
leaps package, which implements the leaps routine (Mallows Cp) and
regsubsets. In my case Mallows' Cp does not work either (see below), so I
have implemented the following.

regr # - holds a zoo object with the 1st column being the dependent
variable

r2test- (result$lm.r2Rsqr)  
(all(unlist(lapply(2:(dim(regr)[2]),function(i)
summary(lm(regr[,1]~regr[,i]))$adj.r.squared ))0.1)) 
which.min(leaps(as.matrix(regr[,-1]),regr[,1])$Cp)==dim(regr)[2]

leaps on the same problem below
===

 leaps(as.matrix(regr3[,-1]),regr3[,1],method=c("adjr2"))
$which
      1     2
1 FALSE  TRUE
1  TRUE FALSE
2  TRUE  TRUE

$label
[1] "(Intercept)" "1"           "2"

$size
[1] 2 2 3

$adjr2
[1] 0.950757134 0.001681389 0.954859493

 leaps(as.matrix(regr3[,-1]),regr3[,1],method=c("Cp"))
$which
      1     2
1 FALSE  TRUE
1  TRUE FALSE
2  TRUE  TRUE

$label
[1] "(Intercept)" "1"           "2"

$size
[1] 2 2 3

$Cp
[1]   38.53367 8490.55327    3.0

 



Tolga I Uzuner/JPMCHASE
13/08/2008 17:33

To
r-help@r-project.org
cc

Subject
which alternative tests instead of AIC/BIC for choosing models





Dear R Users,

I am looking for an alternative to AIC or BIC to choose model parameters. 
This is somewhat of a general statistics question, but I ask it in this
forum as I am looking for a R solution.

Suppose I have one dependent variable, y, and two independent variables,
x1 and x2.

I can perform three regressions: 
reg1: y~x1
reg2: y~x2
reg3: y~x1+x2 

The AIC of reg1 is 2000, reg2 is 1000 and reg3 is 950. One would,
presumably, conclude that one should use both x1 and x2.  However, the R^2's
are quite different: R^2 of reg1 is 0.5%, reg2 is 95% and reg3 is 95.25%.
Knowing that, I would actually conclude that x1 adds little and should
probably not be used.

There is the overall question of what potentially explains this outcome,
i.e. the reduction in AIC in going from reg2 to reg3 even though R^2 does
not materially improve with the addition of x1 to reg 2 (to get to reg3).
But that is more of a generic statistics issue and not my question here.

The question I do have is, is there a package in R which implements a test
and provides some diagnostic information I can use to rule out the use of
x1 in a systematic way as its addition to the equation adds little in terms
of explaining the variability of y.

Thanks in advance,
Tolga



Re: [R] reverse orientation of text in plot margins

2008-08-13 Thread Patrick Connolly
On Wed, 13-Aug-2008 at 06:32PM +0200, Karel Van den Meersche wrote:

| 
| Dear R users,
| 

| I am trying to reverse the orientation of axis labels and title in
| the right margin of a plot, so that they read from top to bottom. I
| know that this can be done using text() as follows:

| par(mar=c(5,4,4,4)+.1)
| plot(1:4,las=0)
| par(new=T)
| y <- rnorm(4)
| plot(y,axes=FALSE,ann=FALSE,pch=17)
| axis(4,labels=FALSE)

I think it would be easiest to work out values for at and labels in
this statement.  ?axis.

HTH

-- 
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.   
   ___Patrick Connolly   
 {~._.~} Great minds discuss ideas
 _( Y )_Middle minds discuss events 
(:_~*~_:)Small minds discuss people  
 (_)-(_)   . Anon
  
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.



Re: [R] which alternative tests instead of AIC/BIC for choosing models

2008-08-13 Thread Prof Brian Ripley
Cp is either the same thing as AIC, or an approximation to it.  So it is 
not an 'alternative'.


See e.g. the discussion in MASS or ?add1.
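
To make the relationship concrete, here is a short sketch for a Gaussian linear model with $p$ coefficients, residual sum of squares $\mathrm{RSS}_p$ and $n$ observations. It assumes the error variance is treated as known and set to the estimate $\hat\sigma^2$ from the full model, which is the usual setting for Mallows' Cp; the notation is introduced here and does not come from the thread.

```latex
\begin{align*}
C_p &= \frac{\mathrm{RSS}_p}{\hat\sigma^2} - n + 2p, \\
\mathrm{AIC}_p &= n\log(2\pi\hat\sigma^2) + \frac{\mathrm{RSS}_p}{\hat\sigma^2} + 2p
               = C_p + n + n\log(2\pi\hat\sigma^2).
\end{align*}
```

Under that assumption the two criteria differ only by terms that do not depend on the model, so they rank models identically. When the variance is instead estimated separately for each model, the agreement is only approximate.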

On Wed, 13 Aug 2008, [EMAIL PROTECTED] wrote:


By way of partial follow-up to my own question, and on the odd chance
anyone else wonders about this issue, some alternatives to this appear to
be in the leaps package, which implements the leaps routine (Mallows Cp)
and regsubsets. In my case Mallows' Cp does not work either (see below),
so I have implemented the following.

regr # - holds a zoo object with the 1st column being the dependent
variable

r2test- (result$lm.r2Rsqr) 
   (all(unlist(lapply(2:(dim(regr)[2]),function(i)
summary(lm(regr[,1]~regr[,i]))$adj.r.squared ))0.1)) 
   which.min(leaps(as.matrix(regr[,-1]),regr[,1])$Cp)==dim(regr)[2]

leaps on the same problem below
===


leaps(as.matrix(regr3[,-1]),regr3[,1],method=c("adjr2"))

$which
      1     2
1 FALSE  TRUE
1  TRUE FALSE
2  TRUE  TRUE

$label
[1] "(Intercept)" "1"           "2"

$size
[1] 2 2 3

$adjr2
[1] 0.950757134 0.001681389 0.954859493


leaps(as.matrix(regr3[,-1]),regr3[,1],method=c("Cp"))

$which
      1     2
1 FALSE  TRUE
1  TRUE FALSE
2  TRUE  TRUE

$label
[1] "(Intercept)" "1"           "2"

$size
[1] 2 2 3

$Cp
[1]   38.53367 8490.55327    3.0







Tolga I Uzuner/JPMCHASE
13/08/2008 17:33

To
r-help@r-project.org
cc

Subject
which alternative tests instead of AIC/BIC for choosing models





Dear R Users,

I am looking for an alternative to AIC or BIC to choose model parameters.
This is somewhat of a general statistics question, but I ask it in this
forum as I am looking for a R solution.

Suppose I have one dependent variable, y, and two independent variables,
x1 and x2.

I can perform three regressions:
reg1: y~x1
reg2: y~x2
reg3: y~x1+x2

The AIC of reg1 is 2000, reg2 is 1000 and reg3 is 950. One would,
presumably, conclude that one should use both x1 and x2.  However, the
R^2's are quite different: R^2 of reg1 is 0.5%, reg2 is 95% and reg3 is
95.25%. Knowing that, I would actually conclude that x1 adds little and
should probably not be used.

There is the overall question of what potentially explains this outcome,
i.e. the reduction in AIC in going from reg2 to reg3 even though R^2 does
not materially improve
with the addition of x1 to reg 2 (to get to reg3). But that is more of a
generic statistics issue and not my question here.

The question I do have is, is there a package in R which implements a test
and provides some diagnostic information I can use to rule out the use of
x1 in a systematic way as its addition to the equation adds little in
terms of explaining the variability of y.

Thanks in advance,
Tolga





--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South 

Re: [R] The standard deviation of measurement 1 with respec t to measurement 2

2008-08-13 Thread Mark Lyman
Firas Swidan <frsswdn at gmail.com> writes:

 
 Hi,
 
 I have two (different types of) measurements, say X and Y, resulting from
 the same set of experiments. So X and Y are paired: (x_1, y_1), (x_2, y_2),
 ...
 
 I am trying to calculate the standard deviation of Y with respect to X. In
 other words, in terms of the scatter plot of X and Y, I would like to divide
 it into bins along the X-axis and for each bin calculate the standard
 deviation along the Y results in that bin. (Though I am not totally sure,
 this seems to remind me of the conditional expectation of Y given X - maybe
 it is called the conditional deviation?)
 
 Is there a built-in procedure in R for calculating the above? Otherwise,
 what would be the easiest way to achieve it? (factors maybe?)
 
 Thankful for the help,
 Firas.
 

Something like the following should give you what you want:

 x <- rnorm(50)
 y <- rnorm(50)
 tapply(y, cut(x, 10, include.lowest=TRUE), sd)
  [-2.19,-1.75]   (-1.75,-1.3]   (-1.3,-0.86] (-0.86,-0.415] (-0.415,0.029] 
      0.7569111      0.1671267      0.5620591      1.1280510      0.7772356 
  (0.029,0.473]  (0.473,0.918]   (0.918,1.36]    (1.36,1.81]    (1.81,2.25] 
      0.5600363      0.7681090      0.9754286      0.3184307      0.2410181
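
Equal-width bins can leave very few points in the outer bins, which makes the standard deviations there unstable. A possible variation, offered as a sketch rather than as part of the reply above, is to bin on quantiles of x so that each bin holds about the same number of points, and to report the counts alongside the standard deviations. The simulated data and the object names breaks, bins and res are made up for illustration.

```r
set.seed(42)
x <- rnorm(200)
y <- 2 * x + rnorm(200, sd = 1 + abs(x))   # spread of y grows with |x|

# Quintile break points give five bins with equal counts
breaks <- quantile(x, probs = seq(0, 1, by = 0.2))
bins   <- cut(x, breaks = breaks, include.lowest = TRUE)

# Per-bin count and standard deviation of y
res <- data.frame(bin = levels(bins),
                  n   = as.vector(table(bins)),
                  sd  = as.vector(tapply(y, bins, sd)))
print(res)
```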



Re: [R] which alternative tests instead of AIC/BIC for choosing models

2008-08-13 Thread Ben Bolker

  Dear R Users,
 
  I am looking for an alternative to AIC or BIC to choose model parameters.
  This is somewhat of a general statistics question, but I ask it in this
  forum as I am looking for a R solution.
 
  Suppose I have one dependent variable, y, and two independent variables,
  x1 and x2.
 
  I can perform three regressions:
  reg1: y~x1
  reg2: y~x2
  reg3: y~x1+x2
 
  The AIC of reg1 is 2000, reg2 is 1000 and reg3 is 950. One would,
  presumably, conclude that one should use both x1 and x2.  However, the
  R^2's are quite different: R^2 of reg1 is 0.5%, reg2 is 95% and reg3 is
  95.25%. Knowing that, I would actually conclude that x1 adds little and
  should probably not be used.
 
  There is the overall question of what potentially explains this outcome,
  i.e. the reduction in AIC in going from reg2 to reg3 even though R^2 does
  not materially improve
  with the addition of x1 to reg 2 (to get to reg3). But that is more of a
  generic statistics issue and not my question here.
 

  I know you didn't ask the generic statistics question, but
I think it's fairly important.  I suspect the reason that
you're getting (what you consider to be) a spurious result
that includes x1, or equivalently that your delta-AICs are
so big, is that you have a huge data set.  Lindsey (p. 15)
talks a bit about calibration that changes with the size of 
the data set.

  Model 3 will very probably give you better predictive power
than model 2.  If you want to select on the basis of improvement
in R^2, why not just do that?
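
The effect of sample size can be seen in a small simulation. This is a sketch with invented numbers, not the original poster's data: x1 is given a small but real effect and the sample is deliberately large, so the AIC difference comes out large while R^2 barely moves.

```r
set.seed(1)
n  <- 20000                       # deliberately large sample
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 0.05 * x1 + x2 + rnorm(n, sd = sqrt(0.05))   # x1 has a small but real effect

reg2 <- lm(y ~ x2)
reg3 <- lm(y ~ x1 + x2)

# Improvement from adding x1, measured two ways
delta_aic <- AIC(reg2) - AIC(reg3)
delta_r2  <- summary(reg3)$r.squared - summary(reg2)$r.squared
print(c(delta_aic = delta_aic, delta_r2 = delta_r2))
```

Roughly, the AIC difference scales with n times the log of the ratio of residual variances, so a fixed small gain in fit produces an ever larger delta-AIC as n grows.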

  Ben Bolker

Lindsey, J. K. 1999. Some Statistical Heresies. The Statistician 48, no. 1: 
1-40.



[R] change 3x3 cell size in 1x1 cell size

2008-08-13 Thread Alessandro
Hi All,

 

I wish to change the pixel size of my grid from 3x3 to 1x1. I have this
function:

 

dem.area <-
([EMAIL PROTECTED],[EMAIL PROTECTED],1])*([EMAIL PROTECTED],[EMAIL 
PROTECTED],1])

dem.pixelsize <- round(5*sqrt(dem.area/length(ground$Z)),0)

dem.pixelsize

 

where is the input to change?

 

Thanks

Ale




[R] Conditional statement used in sapply()

2008-08-13 Thread Altaweel, Mark R.
Hi,

I have data stored in a list that I would like to aggregate and perform some 
basic stats. However, I would like to apply conditional statements so that not 
all the data are used.  Basically, I want to get a specific variable, do some 
basic functions (such as a mean), but only get the data in each element's data 
that match the condition. The code I used is below:

> result <- sapply(res, function(.df) {   # res is the list containing file data
+ if(.df$Volume>0)mean(.df$Volume)  # only have the mean function calculate on values greater than 0
+ })


I did get a numeric output; however, when I checked the output value the 
conditional was ignored (i.e. it did not do anything to the calculation)

I also obtained these warning statements:

Warning messages:
1: In if (.df$Volume > 0) mean(.df$Volume) :
  the condition has length > 1 and only the first element will be used
2: In if (.df$Volume > 0) mean(.df$Volume) :
  the condition has length > 1 and only the first element will be used

Please let me know what am I doing wrong and how can I apply a conditional 
statement to the sapply function.

Thanks

Mark



Re: [R] Conditional statement used in sapply()

2008-08-13 Thread Ling, Gary (Electronic Trading)
Hi Mark, 

How about this?

result <- sapply(split(res, res$Volume>0)$`TRUE`, mean)

There is one thing I'm not sure: is res$Volume a vector or single
numeric?

-gary

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On Behalf Of Altaweel, Mark R.
Sent: Wednesday, August 13, 2008 6:03 PM
To: r-help@r-project.org
Subject: [R] Conditional statement used in sapply()


Hi,

I have data stored in a list that I would like to aggregate and perform
some basic stats. However, I would like to apply conditional statements
so that not all the data are used.  Basically, I want to get a specific
variable, do some basic functions (such as a mean), but only get the
data in each element's data that match the condition. The code I used is
below:

> result <- sapply(res, function(.df) {   # res is the list containing file data
+ if(.df$Volume>0)mean(.df$Volume)  # only have the mean function calculate on values greater than 0
+ })


I did get a numeric output; however, when I checked the output value the
conditional was ignored (i.e. it did not do anything to the calculation)

I also obtained these warning statements:

Warning messages:
1: In if (.df$Volume > 0) mean(.df$Volume) :
  the condition has length > 1 and only the first element will be used
2: In if (.df$Volume > 0) mean(.df$Volume) :
  the condition has length > 1 and only the first element will be used

Please let me know what am I doing wrong and how can I apply a
conditional statement to the sapply function.

Thanks

Mark



Re: [R] Conditional statement used in sapply()

2008-08-13 Thread Steven McKinney


 -Original Message-
 From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED]
 On Behalf Of Altaweel, Mark R.
 Sent: Wednesday, August 13, 2008 3:03 PM
 To: r-help@r-project.org
 Subject: [R] Conditional statement used in sapply()
 
 Hi,
 
 I have data stored in a list that I would like to aggregate and perform
 some basic stats. However, I would like to apply conditional statements so
 that not all the data are used.  Basically, I want to get a specific
 variable, do some basic functions (such as a mean), but only get the data
 in each element's data that match the condition. The code I used is below:
 
 > result <- sapply(res, function(.df) {   # res is the list containing file data
 + if(.df$Volume>0)mean(.df$Volume)  # only have the mean function calculate on values greater than 0
 + })
 

You probably want something such as
result <- sapply(res, function(.df) { mean(.df$Volume[.df$Volume > 0]) })

HTH
Steve McKinney
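
A small self-contained check of this approach, using a made-up list res of three data frames (the Volume values are invented). The original if() failed because if() looks at a single TRUE/FALSE, whereas logical indexing filters element by element before the mean is taken.

```r
# Made-up stand-in for the list of file data
res <- list(a = data.frame(Volume = c(-1, 2, 4)),
            b = data.frame(Volume = c(0, 0, 3)),
            c = data.frame(Volume = c(-5, 0)))

# Keep only the positive volumes within each element, then average them
result <- sapply(res, function(.df) mean(.df$Volume[.df$Volume > 0]))
print(result)
```

An element with no positive values yields NaN, since the mean of an empty numeric vector is NaN; that case may need separate handling.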

 
 I did get a numeric output; however, when I checked the output value the
 conditional was ignored (i.e. it did not do anything to the calculation)
 
 I also obtained these warning statements:
 
 Warning messages:
 1: In if (.df$Volume > 0) mean(.df$Volume) :
   the condition has length > 1 and only the first element will be used
 2: In if (.df$Volume > 0) mean(.df$Volume) :
   the condition has length > 1 and only the first element will be used
 
 Please let me know what am I doing wrong and how can I apply a conditional
 statement to the sapply function.
 
 Thanks
 
 Mark
 


Re: [R] Conditional statement used in sapply()

2008-08-13 Thread Altaweel, Mark R.
Hi,

Yes, that's it. I got the correct results.

Thanks everyone for their help once again. This is a great help board.

Mark


-Original Message-
From: Steven McKinney [mailto:[EMAIL PROTECTED]
Sent: Wed 8/13/2008 5:29 PM
To: Altaweel, Mark R.; r-help@r-project.org
Subject: RE: [R] Conditional statement used in sapply()
 


You probably want something such as
result <- sapply(res, function(.df) { mean(.df$Volume[.df$Volume > 0]) })

HTH
Steve McKinney

 


Re: [R] rgl/compiz problem

2008-08-13 Thread Duncan Murdoch

Barry Rowlingson wrote:

I have just encountered the problem with rgl where plot3d figures
don't interact with the mouse. My plots zoom in and out with the mouse
wheel but the mouse buttons do nothing. I can't rotate the plot.

This has been mentioned and discussed here and in other lists before,
and the solution is to turn off Ubuntu's fancy graphics.  Back in
March, Ben Bolker said:


unfortunately rgl and compiz/etc. both try to use
the same OpenGL interface, so you can't use both at
the same time.


This has echoes of when TCP/IP was in its infancy back in the days of
DOS, and only one program could access the network interface at a time
(until TCP/IP software got its act together). Is OpenGL really in the
same position now? Or is Compiz being greedy in some sense? Surely
two OpenGL applications can run at the same time? Or is it because rgl
is running 'within' another OpenGL window already, so there's some
nesting problem going on?
  
I think it's an Ubuntu bug, because nothing like it occurs anywhere 
else.  So I'd suggest you turn off compiz or switch to a reliable OS 
like Windows ;-).


Duncan Murdoch

 Google Earth works fine, and I think that uses OpenGL. Anyone had any
ideas since March?

I'm on Ubuntu 8.04 and R 2.7.1

Barry






Re: [R] rgl/compiz problem

2008-08-13 Thread Marc Schwartz

on 08/13/2008 06:03 PM Duncan Murdoch wrote:

Barry Rowlingson wrote:

I have just encountered the problem with rgl where plot3d figures
don't interact with the mouse. My plots zoom in and out with the mouse
wheel but the mouse buttons do nothing. I can't rotate the plot.

This has been mentioned and discussed here and in other lists before,
and the solution is to turn off Ubuntu's fancy graphics.  Back in
March, Ben Bolker said:


unfortunately rgl and compiz/etc. both try to use
the same OpenGL interface, so you can't use both at
the same time.


This has echoes of when TCP/IP was in its infancy back in the days of
DOS, and only one program could access the network interface at a time
(until TCP/IP software got its act together). Is OpenGL really in the
same position now? Or is Compiz being greedy in some sense? Surely
two OpenGL applications can run at the same time? Or is it because rgl
is running 'within' another OpenGL window already, so there's some
nesting problem going on?
  
I think it's an Ubuntu bug, because nothing like it occurs anywhere 
else.  So I'd suggest you turn off compiz or switch to a reliable OS 
like Windows ;-).


Gack...  ;-)


 Google Earth works fine, and I think that uses OpenGL. Anyone had any
ideas since March?

I'm on Ubuntu 8.04 and R 2.7.1


Baz, what kind of graphics chipset do you have?  ATI, nVidia or Intel?

nVidia is terrible right now and they are being deservedly flamed left 
and right on the nVidia Linux fora. Their Linux support has deteriorated 
notably over the past year or so, and the decline is more pronounced with 
the new version of Xorg. Even the 2D support under Linux is worse than 
what I have seen on co-workers' Linux systems with Intel chipsets that 
use shared system memory.


I agree with Duncan in that you should disable any of the 
compiz/compiz-fusion features, which add significant overhead and put a 
strain on the graphics drivers. Worse if it is nVidia in their current 
state.


Regards,

Marc Schwartz



Re: [R] Conditional statement used in sapply()

2008-08-13 Thread Erik Iverson

Hello -

Altaweel, Mark R. wrote:

Hi,

I have data stored in a list that I would like to aggregate and
perform some basic stats. However, I would like to apply conditional
statements so that not all the data are used.  Basically, I want to
get a specific variable, do some basic functions (such as a mean),
but only get the data in each element's data that match the
condition. The code I used is below:


> result<-sapply(res, function(.df) {   #res is the list containing file data
+ if(.df$Volume>0)mean(.df$Volume)  #only have the mean function calculate on values greater than 0
+ })


I did get a numeric output; however, when I checked the output value
the conditional was ignored (i.e. it did not do anything to the
calculation)

I also obtained these warning statements:

Warning messages:
1: In if (.df$Volume > 0) mean(.df$Volume) :
  the condition has length > 1 and only the first element will be used
2: In if (.df$Volume > 0) mean(.df$Volume) :
  the condition has length > 1 and only the first element will be used

Please let me know what I am doing wrong and how I can apply a
conditional statement within the sapply function.



Before you think about sapply, what would you do if you had just one 
element of this list?  Write a function to do that.


You wouldn't do:

if(x$Volume > 0)
  mean(x$Volume)

because x$Volume > 0 will create a logical vector of length greater than 1 
(assuming x$Volume has length greater than 1), and then if() will use only 
the first element and issue the warning.


You might do,

mean(x$Volume[x$Volume > 0])

and turn it into a function.

Then use sapply.
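
Purely as an illustration, here is a minimal sketch of those steps. It assumes each list element is a data frame with a numeric Volume column; the sample list `res` and the helper name `mean_positive` are made up for the example and are not from the original post.

```r
# Hypothetical stand-in for the poster's list of data frames
res <- list(
  a = data.frame(Volume = c(-1, 0, 2, 4)),
  b = data.frame(Volume = c(3, -5, 9))
)

# Mean of the strictly positive Volume values in one list element.
# v > 0 is a logical vector; using it as an index keeps only the
# matching values, so no if() on a vector is needed.
mean_positive <- function(df) {
  v <- df$Volume
  mean(v[v > 0])
}

# Apply the one-element function to every element of the list
result <- sapply(res, mean_positive)
print(result)
```

If an element has no positive values, the subset is empty and mean() returns NaN rather than raising an error, so that case may need separate handling.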

Hopefully that gets you started!

Erik



Re: [R] Comination of two barcharts and one xyplot

2008-08-13 Thread Duncan Mackay

At 01:17 14/08/2008, you wrote:

Hi Rhelpers,
Thanks a lot, Stephen, for showing me the way to get a data frame into a 
pasteable format with the dput command.
My code is given below with the new correction. This should work, as 
Stephen says, right off the bat :-)

## df1 is the first data frame
df1 <- structure(list(Year = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 8L,
7L), .Label = c("2003", "2005", "2007", "2009", "2011", "2013",
"2015K", "2015M"), class = "factor"), KI = c(15.53, 15.64, 16.18,
17.09, 22.39, 33.83, 44.91, 52.22), G48 = c(0.3, 0.29, 0.49,
0.67, 0.93, 1.29, 1.83, 2.14), AvCell = c(0.24, 0.33, 0.59, 0.91,
1.24, 1.87, 2.71, 3.15), HB = c(37.45, 34.64, 30.32, 29.47, 38.03,
58.37, 75.54, 87.71), Htens = c(0.76, 1.12, 1.63, 2.27, 3.11,
4.43, 6.28, 7.34), Impact = c(1.16, 1.78, 4.23, 6.76, 9.17, 14.06,
20.57, 23.88), Struct = c(3.02, 4.2, 6.67, 9.68, 13.18, 19.41,
27.51, 31.98), Tens = c(34.05, 32.88, 30.06, 29.25, 37.84, 57.6,
74.5, 86.57), Year.ord = structure(1:8, .Label = c("2003", "2005",
"2007", "2009", "2011", "2013", "2015M", "2015K"), class = c("ordered",
"factor"))), .Names = c("Year", "KI", "G48", "AvCell", "HB",
"Htens", "Impact", "Struct", "Tens", "Year.ord"), row.names = c(NA,
-8L), class = "data.frame")
## L1 is the second data frame
L1 <- structure(list(Year = c(2009L, 2011L, 2013L), KIL = c(20, 24,
30), G48L = c(1, 1, 1), AvCellL = c(1, 1.5, 2), HBL = c(30, 35,
40), HtensL = c(2, 3, 4), ImpactL = c(10, 12, 14), StructL = c(10,
13, 16), TensL = c(35, 38, 45)), .Names = c("Year", "KIL", "G48L",
"AvCellL", "HBL", "HtensL", "ImpactL", "StructL", "TensL"), class = 
"data.frame", row.names = c(NA, -3L))
## Use the reshape package to melt the data frame
library(reshape)
df1m <- melt(df1, id = c("Year", "Year.ord"))
## Use the lattice package to plot the barchart
library(lattice)
attach(df1m)
barchart(value~Year.ord|variable,scales=list(y="free",x=list(rot=90)),xlab="Year",
ylab="No. of Tests *1000",col="blue")
This plot works just fine. But I want to go beyond this. What I want, in 
each panel of the lattice barchart, is to plot histograms of the relevant 
variable (KI, G48 etc) in one colour for the years 2003 to 2007, and in 
another colour for the other years. On top of this, I want to have a line 
plot in each panel with the limits for different years given in the second 
data frame L1 (as bold lines).

I would like to have information on the following points :
1. How can I get a combination of these plots in every panel (two 
histograms and one line plot)? Is it possible?

2. Is it easier to do this with ggplot?
3. I would like to know how I can present the legend also.
Will appreciate any help that I can get.
Thanking You,
Ravi


- Original Message 
From: stephen sefick [EMAIL PROTECTED]
To: ravi [EMAIL PROTECTED]
Cc: r-help@r-project.org
Sent: Wednesday, 13 August, 2008 3:14:54 PM
Subject: Re: [R] Comination of two barcharts and one xyplot

not reproducible

On Wed, Aug 13, 2008 at 9:07 AM, ravi [EMAIL PROTECTED] wrote:
 Hi Rhelpers,
 I would like to have some help with a plot which is beyond my 
capabilities. This plot that I am seeking involves an overlay of two 
different barcharts and one xyplot.

 The code that I have used is the following :
 #save(df1,file="M:\\KBR\\df1.RData")
 load(file="M:\\KBR\\df1.RData")
 # df1$Year.ord created to obtain the right order i.e. 2015M < 2015K
 Year.ord<-ordered(Year,levels=c('2003','2005','2007','2009','2011','2013','2015M','2015K'))

 # Use reshape package to melt the data frame
 library(reshape)
 df1m<-melt(df1,id=c("Year","Year.ord"))
 library(lattice)
 attach(df1m)
 barchart(value~Year.ord|variable,scales=list(y="free",x=list(rot=90)),xlab="Year",
 ylab="No. of Tests *1000",col="blue")
 This plot works just fine. But I want to go beyond this. My first data 
frame (df1) is :

 Year,KI,G48,AvCell,HB,Htens,Impact,Struct,Tens,Year.ord
 1,2003,15.53,0.3,0.24,37.45,0.76,1.16,3.02,34.05,2003
 2,2005,15.64,0.29,0.33,34.64,1.12,1.78,4.2,32.88,2005
 3,2007,16.18,0.49,0.59,30.32,1.63,4.23,6.67,30.06,2007
 4,2009,17.09,0.67,0.91,29.47,2.27,6.76,9.68,29.25,2009
 5,2011,22.39,0.93,1.24,38.03,3.11,9.17,13.18,37.84,2011
 6,2013,33.83,1.29,1.87,58.37,4.43,14.06,19.41,57.6,2013
 7,2015M,44.91,1.83,2.71,75.54,6.28,20.57,27.51,74.5,2015M
 8,2015K,52.22,2.14,3.15,87.71,7.34,23.88,31.98,86.57,2015K
 My second data frame is (L1) is :
 Year,KIL,G48L,AvCellL,HBL,HtensL,ImpactL,StructL,TensL
 1,2009,20,1,1,30,2,10,10,35
 2,2011,24,1,1.5,35,3,12,13,38
 3,2013,30,1,2,40,4,14,16,45
 What I want, in each panel of the lattice barchart, is to plot 
histograms of the relevant variable (KI, G48 etc) in one colour for the 
years 2003 to 2007, and in another colour for the other years. On top of 
this, I want to have a line plot in each panel with the limits for 
different years given in the second data frame L1 (as bold lines).

 I would like to have information on the following points :
 1. How can I get a combination of these plots in every panel (two 
histograms and one line plot)? Is it possible?

 2. Is it easier to do this with ggplot?
 3. I would like to know how I can 

Re: [R] rgl/compiz problem

2008-08-13 Thread Simon Blomberg
My laptop has an nVidia card. Maybe that's why it works?

Simon.

On Wed, 2008-08-13 at 13:17 +, Ben Bolker wrote:
 Barry Rowlingson b.rowlingson at lancaster.ac.uk writes:
 
  
  I have just encountered the problem with rgl where plot3d figures
  don't interact with the mouse. My plots zoom in and out with the mouse
  wheel but the mouse buttons do nothing. I can't rotate the plot.
  
  This has been mentioned and discussed here and in other lists before,
  and the solution is to turn off Ubuntu's fancy graphics.  Back in
  March, Ben Bolker said:
  
  
  unfortunately rgl and compiz/etc. both try to use
  the same OpenGL interface, so you can't use both at
  the same time.
  
  
  This has echoes of when TCP/IP was in its infancy back in the days of
  DOS, and only one program could access the network interface at a time
  (until TCP/IP software got its act together). Is OpenGL really in the
  same position now? Or is Compiz being greedy in some sense? Surely
  two OpenGL applications can run at the same time? Or is it because rgl
  is running 'within' another OpenGL window already, so there's some
  nesting problem going on?
  
   Google Earth works fine, and I think that uses OpenGL. Anyone had any
  ideas since March?
  
  I'm on Ubuntu 8.04 and R 2.7.1
  
  Barry
 
   Unfortunately, an apparently knowledgeable compiz person
 said:
 
 This is a limitation of DRI. DRI2 should fix this, and should hopefully be in
 most drivers by Xorg 7.5 (maybe 7.6). nvidia has their own implementation,
 that's why it works on it
 
 http://forum.compiz-fusion.org/showthread.php?t=8462
 
   And poking around,
 
 http://www.phoronix.com/scan.php?page=news_item&px=NjYzNw
 
 sometime in 2009 is the closest I could get to finding
 an expected date when this would be available ...
 
   Ben Bolker
 
-- 
Simon Blomberg, BSc (Hons), PhD, MAppStat. 
Lecturer and Consultant Statistician 
Faculty of Biological and Chemical Sciences 
The University of Queensland 
St. Lucia Queensland 4072 
Australia
Room 320 Goddard Building (8)
T: +61 7 3365 2506
http://www.uq.edu.au/~uqsblomb
email: S.Blomberg1_at_uq.edu.au

Policies:
1.  I will NOT analyse your data for you.
2.  Your deadline is your problem.

The combination of some data and an aching desire for 
an answer does not ensure that a reasonable answer can 
be extracted from a given body of data. - John Tukey.


