Re: [R] truncating values into separate categories

2009-08-01 Thread PDXRugger

I must apologize, as I wasn't clear about what I wanted to occur.  I don't want
to count the occurrences but rather recode them.  I am trying to replace all of
the values with the new coded values in Person_CAT.  So with

NP <- c(1,  1,  2,  1, 1,  2,  2,  1,  4,  1,  0,  5,
3,  3,  1,  5,  3, 5, 1, 6, 1, 2, 2, 2,
4, 4, 1, 2, 1, 3, 3, 1,  2,  2,  1,  2, 1, 2,
2, 3, 1, 1, 4, 4, 1, 1, 1, 2, 2, 2)

Person_CAT would be: 1, 1, 2, 1, 1, 2, 2, 1, 4, 1, NA, 4, and so on.  This
task would easily be done in SPSS but I am trying to automate it using R.  I
hope this is more clear.
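
A minimal sketch of the recoding described above, assuming the rule is that
values of 4 or more collapse to 4 and a value of 0 becomes NA. The use of
pmin() is an assumption made for illustration; it is not something proposed
in the thread.

```r
NP <- c(1, 1, 2, 1, 1, 2, 2, 1, 4, 1, 0, 5,
        3, 3, 1, 5, 3, 5, 1, 6, 1, 2, 2, 2,
        4, 4, 1, 2, 1, 3, 3, 1, 2, 2, 1, 2, 1, 2,
        2, 3, 1, 1, 4, 4, 1, 1, 1, 2, 2, 2)

# Cap every value at 4, so 5 and 6 are recoded to 4
Person_CAT <- pmin(NP, 4)

# Treat a count of 0 as missing
Person_CAT[NP == 0] <- NA

head(Person_CAT, 12)
# 1 1 2 1 1 2 2 1 4 1 NA 4
```

The result is a plain numeric vector of the same length as NP, so it can be
stored back into a data frame column directly.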





Bill.Venables wrote:
 
 Here is a suggestion:
 
 Per <- c(NA, 1, 2, 3, 4)
 NP <- c(1,  1,  2,  1, 1,  2,  2,  1,  4,  1,  0,  5,
 + 3,  3,  1,  5,  3, 5, 1, 6, 1, 2, 2, 2,
 + 4, 4, 1, 2, 1, 3, 3, 1,  2,  2,  1,  2, 1, 2,
 + 2, 3, 1, 1, 4, 4, 1, 1, 1, 2, 2, 2)
 Person_CAT <- cut(NP, breaks = c(0:4, Inf) - 0.5, labels = Per)
 table(Person_CAT)
 Person_CAT
 NA  1  2  3  4 
  1 19 15  6  9 
  
 
 You should be aware, though, that items corresponding to the level NA
 will NOT be treated as missing.
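
The caveat above can be sketched as follows. This is only an illustration
under assumptions: the labels "0" to "4" are chosen here so that every level
has an ordinary name, and the zero category is then set to missing
explicitly. The breaks are the same as in the suggestion.

```r
NP <- c(1, 1, 2, 1, 1, 2, 2, 1, 4, 1, 0, 5,
        3, 3, 1, 5, 3, 5, 1, 6, 1, 2, 2, 2,
        4, 4, 1, 2, 1, 3, 3, 1, 2, 2, 1, 2, 1, 2,
        2, 3, 1, 1, 4, 4, 1, 1, 1, 2, 2, 2)

# Intervals are (-0.5, 0.5], (0.5, 1.5], ..., (3.5, Inf]
f <- cut(NP, breaks = c(0:4, Inf) - 0.5,
         labels = c("0", "1", "2", "3", "4"))

is.na(f[NP == 0])     # FALSE: "0" is an ordinary level at this point

f[f == "0"] <- NA     # mark those items as genuinely missing
f <- droplevels(f)    # drop the now-empty "0" level

table(f, useNA = "ifany")
```

After this step is.na(f) is TRUE for the item that was 0, so functions that
skip missing values will skip it.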
 
 
 Bill Venables
 http://www.cmis.csiro.au/bill.venables/ 
 
 
 -----Original Message-----
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On Behalf Of PDXRugger
 Sent: Friday, 31 July 2009 9:54 AM
 To: r-help@r-project.org
 Subject: [R] truncating values into separate categories
 
 
 Hi all,
   Simple question which I thought I had the answer to, but it isn't so simple
 for some reason.  I am sure someone can easily help.  I would like to
 categorize the values in NP into one of the five values in Per, with the last
 category (4) representing values >= 4 (hence 4:max(NP)).  The problem is that
 R is reading max(NP) as multiple values instead of a range, so the lengths of
 the labels and the breaks are not matching.  Suggestions?
 
 Per <- c(NA, 1, 2, 3, 4)
 
 NP = c(1, 1, 2, 1, 1, 2, 2, 1, 4, 1, 0, 5, 3, 3, 1, 5, 3, 5, 1, 6, 1, 2, 2,
 2, 4, 4, 1, 2, 1, 3, 3, 1, 2, 2, 1, 2, 1, 2,
 2, 3, 1, 1, 4, 4, 1, 1, 1, 2, 2, 2)
 
 Person_CAT <- cut(NP, breaks=c(0,1,2,3,4:max(NP)), labels=Per)
 
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 


[R] superpose 2 time series with different time intervals

2009-08-01 Thread Gary Lewis
I  could use some advice.

I've got 2 time series. Both cover approximately the same period of
time (ie, 1940 to 2009). But one series has annual data and the other
has monthly data. One refers to university enrollment; the other to
unemployment rates. Both are currently in the same data frame.

I'd like to use the monthly time series as a light grayscale
background for a plot of the annual time series, showing both series
as type "l" (line). Naturally with all the NA's in the annual series,
that plot disappears because points are not connected across missing
values.

I suppose I could make both series annual, but a lot of interesting
detail would get lost this way. Or I guess I could interpolate values
in the annual series with monthly approximations, but this means 11
out of every 12 values is an approximation. Or I suppose I could plot
each series separately and then print them with position information,
which I'm reluctant to do because panel.superpose so nicely handles
the alignment of the 2 panels.

What I'd really like to do is plot each independently but still
superposed. Effectively this seems to mean monthly data intervals but
line connections across the NA's in the series with annual intervals.

Any suggestions would be appreciated. Thanks.

Gary Lewis



[R] how to calculate time interval between dates

2009-08-01 Thread liujb

Dear R users:

I have a vector of dates as follows:
t <- c("2007-01-05", "2007-05-14", "2007-12-28", "2008-01-09", "2008-04-24",
"2009-02-14")

I'd like to calculate number of days between those dates (time interval).
How to do it?

Thank you,
Julia



Re: [R] another automation question

2009-08-01 Thread Mehdi Khan
your name is annoying.

On Fri, Jul 31, 2009 at 2:01 PM, RR! cwal...@usgs.gov wrote:


 This code works:

 x <- letters[1:6]
 ycols <- 23:28
 xcols <- rep(c(3,4,5,8), each=length(ycols))

 somertime <- function(i,j) somers2(Pred_pres_a_indpdt[,i,,], population[,j])
 results <- mapply(somertime, xcols, ycols)



 How can I make variable h work?

 x <- letters[1:6]
 ycols <- 23:28
 xcols <- rep(c(3,4,5,8), each=length(ycols))

 somertime <- function(h,i,j) somers2(Pred_pres_h_indpdt[,i,,], population[,j])
 results <- mapply(somertime, x, xcols, ycols)
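
One possible approach, sketched under assumptions: if the objects really are
named Pred_pres_a_indpdt, Pred_pres_b_indpdt and so on, the name can be built
as a string with paste0() and the object fetched with get(). The two small
arrays and the sum() summary below are hypothetical stand-ins, since
somers2() and the real data are not available here.

```r
# Hypothetical stand-ins for the real 4-dimensional arrays
Pred_pres_a_indpdt <- array(1:24,    dim = c(2, 3, 2, 2))
Pred_pres_b_indpdt <- array(101:124, dim = c(2, 3, 2, 2))

# Build the object name from the letter h, then look the object up by name
pick <- function(h, i) {
  arr <- get(paste0("Pred_pres_", h, "_indpdt"))
  arr[, i, , ]
}

# sum() stands in for the real statistic; mapply() walks h and i in parallel
res <- mapply(function(h, i) sum(pick(h, i)), c("a", "b"), c(1, 2))
res
```

Keeping the arrays in a named list (for example preds[["a"]]) would avoid
get() altogether, which is often easier to debug.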

 -R




[R] re moving intial numerals

2009-08-01 Thread PDXRugger


I would like to recode the data so that only the last 5 digits of the data
below are included, so 200502019 would become 02019.  Any ideas?

data=c(200500735,
200502019,
200504131,
200504217,
200504629,
200504822,
200510115,
200511605,
200514477,
200515314,
200515438,
200519040,
200519603,
200522735,
200522853,
200523415,
200524227,
200524423)



Re: [R] Compare lm() to glm(family=poisson)

2009-08-01 Thread Ken Knoblauch
Mark Na mtb954 at gmail.com writes:
 
 Dear R-helpers,
 I would like to compare the fit of two models, one of which I fit using lm()
 and the other using glm(family=poisson). The latter doesn't provide
 r-squared, so I wonder how to go about comparing these
 models (they have the same formula).
 
 Thanks very much,
 
 Mark Na
 
I'm not sure what you are trying to do but it might be
informative to compare the diagnostic plots from the
fits.  Remember that Poisson distributed data is
heteroscedastic, mean = variance, which isn't the
default hypothesis when fitting with lm.  Also, the
default link function with the poisson family is log.
So, these are things to take into account in any potential 
comparison.  
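
A rough sketch of the comparison suggested above, on simulated data. The
data, coefficients and object names are all made up for illustration; the
point is only to show the default log link and to put the residual plots of
the two fits side by side.

```r
set.seed(42)

# Simulated counts whose mean grows with x on the log scale
x <- runif(200, 0, 2)
y <- rpois(200, lambda = exp(0.5 + 0.8 * x))

fit.lm  <- lm(y ~ x)                     # constant variance, identity link
fit.glm <- glm(y ~ x, family = poisson)  # variance = mean, log link

family(fit.glm)$link                     # "log"

# Residuals vs fitted for both models; a funnel shape in the lm panel
# points to the mean-variance relationship described above
op <- par(mfrow = c(1, 2))
plot(fit.lm,  which = 1)
plot(fit.glm, which = 1)
par(op)
```

Note that the coefficients are not directly comparable either, because the
glm coefficients are on the log scale.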

Ken

-- 
Ken Knoblauch
Inserm U846
Stem-cell and Brain Research Institute
Department of Integrative Neurosciences
18 avenue du Doyen Lépine
69500 Bron
France
tel: +33 (0)4 72 91 34 77
fax: +33 (0)4 72 91 34 61
portable: +33 (0)6 84 10 64 10
http://www.sbri.fr/members/kenneth-knoblauch.html



Re: [R] superpose 2 time series with different time intervals

2009-08-01 Thread Gabor Grothendieck
Try this:

# two simulated series
set.seed(123)
ts.sim <- arima.sim(list(order = c(1,1,0), ar = 0.7), n = 70)
ts.sim <- ts(c(ts.sim), start = 1940)
ts.sim2 <- arima.sim(list(order = c(1,1,0), ar = 0.7), n = 12*70)
ts.sim2 <- ts(c(ts.sim2), start = 1940, freq = 12)

# plot
plot(ts.sim2, type = "l", col = grey(0.5))
lines(ts.sim)
axis(1, time(ts.sim), lab = FALSE)
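
If both series stay in one data frame, as in the original question, a sketch
along the following lines may also work: draw the monthly series first, then
draw the annual series using only its non-missing rows, so the line is
connected across the NA gaps. The data frame and its column names (time,
unemp, enroll) are hypothetical.

```r
set.seed(1)

# Hypothetical data frame: one row per month, annual value only in January
df <- data.frame(
  time  = seq(1940, by = 1/12, length.out = 12 * 70),
  unemp = 5 + cumsum(rnorm(12 * 70, sd = 0.1))
)
df$enroll <- NA_real_
jan <- seq(1, nrow(df), by = 12)
df$enroll[jan] <- 5 + cumsum(rnorm(70, sd = 0.3))

# Monthly series as a light grey background
plot(df$time, df$unemp, type = "l", col = grey(0.7),
     xlab = "Year", ylab = "Value")

# Annual series: keep only rows that have a value, then connect them
ok <- !is.na(df$enroll)
lines(df$time[ok], df$enroll[ok])
```

If the two series are on very different scales, the y-axis range would need
to be set by hand (ylim) or one series rescaled first.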




Re: [R] re moving intial numerals

2009-08-01 Thread baptiste auguie
Try this,

formatC(data %% 1e5, width = 5, flag = "0", mode = "integer")

 [1] "00735" "02019" "04131" "04217" "04629" "04822" "10115" "11605" "14477"
[10] "15314" "15438" "19040" "19603" "22735" "22853" "23415" "24227" "24423"
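
Two alternative sketches that should give the same strings, shown on the
first three values only. Both are offered as illustrations rather than as
the method from the reply above: one uses sprintf() on the remainder, the
other treats the numbers as text and keeps the last five characters.

```r
data <- c(200500735, 200502019, 200504131)

# Arithmetic route: remainder after dividing by 100000, zero-padded to width 5
a <- sprintf("%05d", as.integer(data %% 1e5))

# Text route: convert without scientific notation, keep the last 5 characters
s <- sprintf("%.0f", data)
b <- substring(s, nchar(s) - 4, nchar(s))

a
# "00735" "02019" "04131"
identical(a, b)
# TRUE
```

Either way the result is a character vector; converting it back to numeric
would drop the leading zeros again.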


HTH,

baptiste






-- 
_

Baptiste Auguié

School of Physics
University of Exeter
Stocker Road,
Exeter, Devon,
EX4 4QL, UK

http://newton.ex.ac.uk/research/emag



Re: [R] How to stop an R script when running JGR on a Linux/SuSE system

2009-08-01 Thread Bernd Kreuss
mau...@alice.it wrote:

 I wonder whether there is a more gentle way to stop an R script running on 
 top of JGR other than ... unplugging the power cord.

There must be a bug in JGR on Linux. Clicking the stop button should
stop the script, but clicking it here on my Linux machine immediately
crashes R together with JGR. Unfortunately JGR still seems to
be the best (and only) shell/editor combination available.
When running R in an ordinary terminal window, execution can be
terminated with CTRL-C.

I hope they will soon fix this (and a small handful of other
bugs/flaws, mostly missing/wrong keyboard shortcuts); then JGR would
make a huge step from very good to excellent.

Bernd



Re: [R] How to stop an R script when running JGR on a Linux/SuSE system

2009-08-01 Thread Bernd Kreuss
Sorry for the possible double posting, but I got a strange error from a
versatel(???) server about not enough quota when replying to the message.



Re: [R] R User Group listings

2009-08-01 Thread Friedrich Leisch
 On Fri, 31 Jul 2009 06:45:38 -0400,
 Prof John C Nash (PJCN) wrote:

   Further to my posting about R UG mailing lists etc., and David Smith's 
   post about the list he is maintaining (I was aware of his blog, but not 
   that he was updating -- good show), I'm in communication with him to try 
   to ensure we get appropriate information out to useRs.

   Already there has been a posting asking if there is any group in 
   Germany, and asking is the first step to getting a group going. I 
   suspect we need to expand from just a listing to also include 
   Desperately seeking R users... entries. Will see what we can do.

I briefly talked with Jip Porzak about it at useR (because he
previously approached me with the question about having such a list on
the official R web pages). My personal opinion is that such a
listing really belongs in the R Wiki, so that user groups can add
themselves. Of course it would be great if somebody could act as an
editor and keep an eye on the page, and we could have a prominent link
to it from the R homepage.

Just my 2c,
Fritz



Re: [R] Problem with RGtk2 Rattle

2009-08-01 Thread Wayne Murray

HI

Thanks for all the advice; unfortunately I am unable to install the
suggested fix. The error message is as follows:

Error in gzfile(file, "r") : cannot open the connection
In addition: Warning message:
In gzfile(file, "r") :
  cannot open compressed file 'glade-3.4.3-win32-1/DESCRIPTION', probable
reason 'No such file or directory'

Sorry, but nothing seems to work.

Regards

Wayne



Felix Andrews wrote:
 
 This error comes from using an old version of the GTK+ libraries.
 
 Download the latest version for Windows from
 http://gladewin32.sourceforge.net/
 
 -Felix
 
 2009/7/31 Graham Williams graham.willi...@togaware.com:
 Hi Wayne - but what version of the other tools have you installed?

 Regards,
 Graham


 2009/7/30 Wayne Murray wayne.mur...@medicareaustralia.gov.au


 HI Graham

 Thanks for responding so promptly - unfortunately downloading and running
 this new version of Rattle did not alter the outcome - I am however running
 on Windows XP

 Regards

 Wayne



 Wayne Murray wrote:
 
  HI
 
  Apologies for previously trying to post this question onto the Dev forum.
 
  I have recently updated my versions of R and related packages. When I try
  to use rattle the following message appears:
 
  Error in .RGtkCall("R_setGObjectProps", obj, value, PACKAGE = "RGtk2") :
    Invalid property tooltip-text!
 
  I have downloaded and installed the latest available version of RGtk2, so
  I am at a loss to explain this error, or more importantly what I need to
  do to overcome it.
 
  Thanks for any suggestions
 
  Regards
 
  Wayne
 


 -
 Dr D. W. Murray
 Canberra, Australia

 
 
 
 -- 
 Felix Andrews / 安福立
 Postdoctoral Fellow
 Integrated Catchment Assessment and Management (iCAM) Centre
 Fenner School of Environment and Society [Bldg 48a]
 The Australian National University
 Canberra ACT 0200 Australia
 M: +61 410 400 963
 T: + 61 2 6125 1670
 E: felix.andr...@anu.edu.au
 CRICOS Provider No. 00120C
 -- 
 http://www.neurofractal.org/felix/
 
 
 


-
Dr D. W. Murray
Canberra, Australia


Re: [R] Using R with Hadoop/Hive for Big Data

2009-08-01 Thread Ajay ohri
Hi,

The document helps a lot, thanks. I need to know how to work with Hadoop and
R in a parallel cluster environment.

HIVE is a new system on top of Hadoop that uses a SQL derivative to query
it. http://hadoop.apache.org/hive/



Regards,

Ajay


On Fri, Jul 31, 2009 at 7:23 PM, Avram Aelony aav...@mac.com wrote:



 I am not sure if I understood your question, but you may want to look at
 http://cran.r-project.org/web/packages/HadoopStreaming/HadoopStreaming.pdf
 Regards,

 Avram



 On Friday, July 31, 2009, at 02:39PM, Ajay ohri ohri2...@gmail.com
 wrote:
 Hive http://hadoop.apache.org/hive/ is a data warehouse infrastructure
 built on top of Hadoop that provides tools to enable easy data
 summarization, ad hoc querying and analysis of large datasets stored in
 Hadoop files. It provides a mechanism to put structure on this data and it
 also provides a simple query language called QL which is based on SQL and
 which enables users familiar with SQL to query this data. At the same time,
 this language also allows traditional map/reduce programmers to plug in
 their custom mappers and reducers to do more sophisticated analysis
 which may not be supported by the built-in capabilities of the language.
 
 Is there any package currently out or in development that is looking into
 using R-like matrix capabilities with HIVE-like big data abilities on a
 remote/parallel HPC?
 
 Regards,
 
 Ajay
 


Re: [R] how to calculate time interval between dates

2009-08-01 Thread David Winsemius


On Jul 31, 2009, at 4:46 PM, liujb wrote:



Dear R users:

I have a vector of dates as follows:
t <- c("2007-01-05", "2007-05-14", "2007-12-28", "2008-01-09", "2008-04-24",
"2009-02-14")

I'd like to calculate number of days between those dates (time interval).
How to do it?


That is not a vector of dates, but rather a character vector.
Observe:

 t <- c("2007-01-05", "2007-05-14", "2007-12-28", "2008-01-09", "2008-04-24",
        "2009-02-14")
 class(t)
# [1] "character"
 diff(t)
#Error in r[i1] - r[-length(r):-(length(r) - lag + 1)] :
#  non-numeric argument to binary operator

Try:

 t2 <- as.Date(t)
 class(t2)
#[1] "Date"
 t2
# [1] "2007-01-05" "2007-05-14" "2007-12-28" "2008-01-09" "2008-04-24"
# [6] "2009-02-14"

 ?diff
 diff(t2)
#Time differences in days
#[1] 129 228  12 106 296
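
A small extension of the same idea, offered as a sketch: once the strings are
Date objects, the gaps can be turned into plain numbers, and the days elapsed
since the first date follow from simple subtraction. The variable names
below are made up for illustration.

```r
dates_chr <- c("2007-01-05", "2007-05-14", "2007-12-28",
               "2008-01-09", "2008-04-24", "2009-02-14")
d <- as.Date(dates_chr)

# Gaps between consecutive dates, as plain numbers of days
gaps <- as.numeric(diff(d))
gaps
# 129 228 12 106 296

# Days elapsed since the first date
since_first <- as.numeric(d - d[1])
since_first
# 0 129 357 369 475 771

# The whole span in other units, via difftime()
difftime(d[length(d)], d[1], units = "weeks")
```

as.numeric() drops the "difftime" class, which is convenient when the gaps
are to be used in further arithmetic.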






David Winsemius, MD
Heritage Laboratories
West Hartford, CT



Re: [R] R User Group listings

2009-08-01 Thread Prof. John C Nash
Why not! Looks like there were several conversations going on 
independently at UseR about this.


I'll put up a page and then ask Martin to adjust the link.

JN




Re: [R] How to stop an R script when running JGR on a Linux/SuSE system

2009-08-01 Thread Peter Dalgaard

Bernd Kreuss wrote:

sorry for the eventual double posting, but i got a strange error from a
versatel(???) server about not enough quota when replying to the message


Yes, I had one of those too (see below), but notice that the error 
occurs after the mail has left the mailing list server at ETHZ; i.e., it 
involves one recipient rather than all.


AFAICS, there's a misconfigured mailer en route to one of our 
subscribers (disobeys Errors-To: and mails sender instead). The logical 
consequence would seem to be to unsubscribe mailingli...@versanet.de.


-p

--

Hi. This is the qmail-send program at maildo.versatel.de.
I'm afraid I wasn't able to deliver your message to the following addresses.
This is a permanent error; I've given up. Sorry it didn't work out.

mailingli...@versanet.de:
maildrop: Filtering through xfilter /usr/local/bin/reformail -a 
X-VT-Original-To: mailingli...@versanet.de

maildrop: Filtering through xfilter /usr/local/bin/spamc -d 89.245.129.196
maildrop: Filtering through `$MAILFILTER -u $LOGNAME`
maildrop: maildir over quota.

--- Below this line is a copy of the message.

Return-Path: p.dalga...@biostat.ku.dk
Received: (qmail 29582 invoked from network); 30 Jul 2009 09:14:20 -
Received: from avir03do.versatel-west.de ([89.245.129.71])
  (envelope-sender p.dalga...@biostat.ku.dk)
  by mail02do.versatel.de (qmail-ldap-1.03) with SMTP
  for mailingli...@versanet.de; 30 Jul 2009 09:14:20 -
Received: from avir03do.versatel-west.de (localhost.localdomain [127.0.0.1])
by avir03do.versatel-west.de (Postfix) with ESMTP id 99D7773CBAC
for mailingli...@versanet.de; Thu, 30 Jul 2009 11:14:18 +0200 (CEST)
Received: from mail01do.versatel.de (mail01do.versatel.de [89.245.129.21])
by avir03do.versatel-west.de (Postfix) with SMTP id 5D79C73CBA1
for mailingli...@versanet.de; Thu, 30 Jul 2009 11:14:18 +0200 (CEST)
Received: by mail01do.versatel.de (sSMTP sendmail emulation); Thu, 30 
Jul 2009 11:14:19 +0200

Received: (qmail 22716 invoked from network); 30 Jul 2009 09:14:18 -
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on
spamkill08do.versatel-west.de
X-Spam-Status: No, score=0.0 required=7.7 tests=none
Received: from hypatia.math.ethz.ch ([129.132.145.15])
  (envelope-sender r-devel-boun...@r-project.org)
  by mail01do.versatel.de (qmail-ldap-1.03) with SMTP
  for mailingli...@versanet.de; 30 Jul 2009 09:14:16 -
Received: from hypatia.math.ethz.ch (hypatia [129.132.145.15])
by hypatia.math.ethz.ch (8.14.1/8.14.1) with ESMTP id n6U9D4HU025616;
Thu, 30 Jul 2009 11:13:23 +0200
Received: from phil2.ethz.ch (phil2.ethz.ch [129.132.202.240])
by hypatia.math.ethz.ch (8.14.1/8.14.1) with ESMTP id n6U9CHbG024699
for r-de...@stat.math.ethz.ch; Thu, 30 Jul 2009 11:12:57 +0200
Received: from mail.kubism.ku.dk ([192.38.18.21] helo=mail.pubhealth.ku.dk)
by phil2.ethz.ch with esmtp (Exim 4.66)
(envelope-from p.dalga...@biostat.ku.dk) id 1MWRgL-00057m-7h
for r-de...@stat.math.ethz.ch; Thu, 30 Jul 2009 11:12:17 +0200
Received: from titmouse2.kubism.ku.dk 
(0x50c633f5.boanxx12.dynamic.dsl.tele.dk

[80.198.51.245])
(using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
(No client certificate requested)
by mail.pubhealth.ku.dk (Postfix) with ESMTP id 0DDE8282BE88;
Thu, 30 Jul 2009 11:11:52 +0200 (CEST)
Message-ID: 4a71641a.8080...@biostat.ku.dk
Date: Thu, 30 Jul 2009 11:12:58 +0200
From: Peter Dalgaard p.dalga...@biostat.ku.dk
User-Agent: Thunderbird 2.0.0.21 (X11/20090320)
MIME-Version: 1.0
To: kb...@andrew.cmu.edu
References: 20090729175016.731f8282c...@mail.pubhealth.ku.dk
In-Reply-To: 20090729175016.731f8282c...@mail.pubhealth.ku.dk
X-Tag-Only: YES
X-Filter-Node: phil2.ethz.ch
X-USF-Spam-Level: --
X-USF-Spam-Status: hits=-2.5 tests=BAYES_00,FORGED_RCVD_HELO,SPF_PASS
X-USF-Spam-Flag: NO
X-Virus-Scanned: by amavisd-new at stat.math.ethz.ch
Cc: r-b...@r-project.org, r-de...@stat.math.ethz.ch
Subject: Re: [Rd] Strange Interaction Between Promises and Closures
(PR#13861)
X-BeenThere: r-de...@r-project.org
X-Mailman-Version: 2.1.11
Precedence: list
List-Id: R development and technical/programmer topics 
r-devel.r-project.org

List-Unsubscribe: https://stat.ethz.ch/mailman/options/r-devel,
mailto:r-devel-requ...@r-project.org?subject=unsubscribe
List-Archive: https://stat.ethz.ch/pipermail/r-devel
List-Post: mailto:r-de...@r-project.org
List-Help: mailto:r-devel-requ...@r-project.org?subject=help
List-Subscribe: https://stat.ethz.ch/mailman/listinfo/r-devel,
mailto:r-devel-requ...@r-project.org?subject=subscribe
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset=iso-8859-1; Format=flowed
Sender: r-devel-boun...@r-project.org
Errors-To: r-devel-boun...@r-project.org
X-VT-Original-To: mailingli...@versanet.de
X-Anti-Virus: Kaspersky Anti-Virus 

[R] R book for economists

2009-08-01 Thread Thiemo Fetzer
Dear Group,

I am an economics student starting PhD work in London. As preparation I
would like to get to know R a little bit better. For Stata there are tons of
books; can you recommend a book for R?

I have substantial econometrics knowledge, so it should be more of a
how-to book.

Best regards
Thiemo

---
Thiemo Fetzer, Economist
http://freigeist.devmag.net
http://www.devmag.net



Re: [R] R User Group listings

2009-08-01 Thread Prof. John C Nash
Better ideas should prevail. There is now a wiki page at 
http://wiki.r-project.org/rwiki/doku.php?id=rugs:r_user_groups.


It is not yet fully populated. (David Smith's blog at REvolution 
Computing mentions more groups.)


JN



Re: [R] about the summary(cph.object)

2009-08-01 Thread David Winsemius


On Jul 31, 2009, at 11:24 PM, zhu yao wrote:


Could someone explain the summary(cph.object)?

The example is in the help file of cph.

n <- 1000
set.seed(731)
age <- 50 + 12*rnorm(n)
label(age) <- "Age"
sex <- factor(sample(c('Male','Female'), n,
 rep=TRUE, prob=c(.6, .4)))
cens <- 15*runif(n)
h <- .02*exp(.04*(age-50)+.8*(sex=='Female'))
dt <- -log(runif(n))/h
label(dt) <- 'Follow-up Time'
e <- ifelse(dt <= cens,1,0)
dt <- pmin(dt, cens)
units(dt) <- "Year"
dd <- datadist(age, sex)
options(datadist='dd')


This is the process for setting the range for the display of effects in
Design regression objects. See:


?datadist

q.effect
set of two quantiles for computing the range of continuous variables  
to use in estimating regression effects. Defaults are c(.25,.75),  
which yields inter-quartile-range odds ratios, etc.


?summary.Design
#---
 By default, inter-quartile range effects (odds ratios, hazards  
ratios, etc.) are printed for continuous factors, ... 

#---
Value
For summary.Design, a matrix of class summary.Design with rows  
corresponding to factors in the model and columns containing the low  
and high values for the effects, the range for the effects, the effect  
point estimates (difference in predicted values for high and low  
factor values), the standard error of this effect estimate, and the  
lower and upper confidence limits.


#---



Srv <- Surv(dt,e)

f <- cph(Srv ~ rcs(age,4) + sex, x=TRUE, y=TRUE)
summary(f)

             Effects              Response : Srv

 Factor             Low    High   Diff.  Effect S.E. Lower 0.95 Upper 0.95
 age                40.872 57.385 16.513 1.21   0.21 0.80       1.62
  Hazard Ratio      40.872 57.385 16.513 3.35     NA 2.22       5.06

In this case, with a 4 df regression spline, you need to look at the
effect across the range of the variable. You ought to plot the age
effect and examine anova(f). In the untransformed situation the plot
is on the log hazard scale for cph. So the effect for age in this
case should be the difference in log hazard between ages 57.385 and 40.872.
S.E. is the standard error of that estimate, and the Lower and Upper
numbers are the confidence bounds on the effect estimate. The Hazard
Ratio row gives you exponentiated results, so a difference in log
hazards becomes a hazard ratio. {exp(1.21) = 3.35}
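
As a quick check of that last step, here is a small sketch in R. The three numbers are copied from the age row of the summary(f) output; because the printed values are themselves rounded, the exponentiated limits may differ from the Hazard Ratio row in the last digit.

```r
# Values copied from the "age" row of the summary(f) output
effect <- 1.21   # difference in log hazard between the high and low age
lower  <- 0.80   # lower 0.95 confidence limit (log hazard scale)
upper  <- 1.62   # upper 0.95 confidence limit (log hazard scale)

# Exponentiating turns a difference in log hazards into a hazard ratio
hr <- exp(c(HR = effect, Lower = lower, Upper = upper))
print(round(hr, 2))
```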



sex - Female:Male  2.000  1.000 NA 0.64   0.15 0.35   0.94
 Hazard Ratio  2.000  1.000 NA 1.91 NA 1.42   2.55


What's the meaning of Effect, S.E., Lower, Upper?


You probably ought to read a bit more basic material. If you are
asking this question, Harrell's Regression Modeling Strategies might
be over your head, but it would probably be a good investment anyway.
Venables and Ripley's Modern Applied Statistics has a chapter on
survival analysis. Also consider Kalbfleisch and Prentice, Statistical
Analysis of Failure Time Data. I'm sure there are others; those are
the ones I have on my shelf.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem with RGtk2 Rattle

2009-08-01 Thread Felix Andrews
Um, it sounds as if you are trying to install glade-3.4.3-win32.zip
into R as a package... but it is not an R package!!

GTK+/Glade is a system library to be installed into Windows.
You should download the .exe (not the .zip)
http://downloads.sourceforge.net/gladewin32/gtk-dev-2.12.9-win32-2.exe
and run it to install it.

By the way, problems with rattle are best sent to the rattle-users mailing list:
http://groups.google.com/group/rattle-users

-Felix

2009/8/1 Wayne Murray wayne.mur...@medicareaustralia.gov.au:

 HI

 Thanks for all the advice, unfortunately I am unable to install the
 suggested fix - error message as follows:

 Error in gzfile(file, "r") : cannot open the connection
 In addition: Warning message:
 In gzfile(file, "r") :
  cannot open compressed file 'glade-3.4.3-win32-1/DESCRIPTION', probable
 reason 'No such file or directory'

 Sorry but nothing seems to work

 Regards

 Wayne



 Felix Andrews wrote:

 This error comes from using an old version of the GTK+ libraries.

 Download the latest version for Windows from
 http://gladewin32.sourceforge.net/

 -Felix

 2009/7/31 Graham Williams graham.willi...@togaware.com:
 Hi Wayne - but what version of the other tools have you installed?

 Regards,
 Graham


 2009/7/30 Wayne Murray wayne.mur...@medicareaustralia.gov.au


 HI Graham

 Thanks for responding so promptly - unfortunately downloading and
 running
 this new version of Rattle did not alter the outcome - I am however
 running
 on Windows XP

 Regards

 Wayne



 Wayne Murray wrote:
 
  HI
 
  Apologies for previously trying to post this question onto the Dev
 forum.
 
  I have recently updated my versions of R and related packages. When I
 try
  to use rattle the following message appears
 
  Error in .RGtkCall("R_setGObjectProps", obj, value, PACKAGE =
 "RGtk2") :
    Invalid property tooltip-text!
 
  I have downloaded and installed the latest available version of RGtk2,
 so
  I am at a loss to explain this error, or more importantly what I need
 to
  do to overcome it
 
  Thanks for any suggestions
 
  Regards
 
  Wayne
 


 -
 Dr D. W. Murray
 Canberra, Australia
 --
 View this message in context:
 http://www.nabble.com/Problem-with-RGtk2---Rattle-tp24734447p24736985.html
 Sent from the R help mailing list archive at Nabble.com.



        [[alternative HTML version deleted]]





 --
 Felix Andrews / 安福立
 Postdoctoral Fellow
 Integrated Catchment Assessment and Management (iCAM) Centre
 Fenner School of Environment and Society [Bldg 48a]
 The Australian National University
 Canberra ACT 0200 Australia
 M: +61 410 400 963
 T: + 61 2 6125 1670
 E: felix.andr...@anu.edu.au
 CRICOS Provider No. 00120C
 --
 http://www.neurofractal.org/felix/





 -
 Dr D. W. Murray
 Canberra, Australia
 --
 View this message in context: 
 http://www.nabble.com/Problem-with-RGtk2---Rattle-tp24734447p24768229.html
 Sent from the R help mailing list archive at Nabble.com.





-- 
Felix Andrews / 安福立
Postdoctoral Fellow
Integrated Catchment Assessment and Management (iCAM) Centre
Fenner School of Environment and Society [Bldg 48a]
The Australian National University
Canberra ACT 0200 Australia
M: +61 410 400 963
T: + 61 2 6125 1670
E: felix.andr...@anu.edu.au
CRICOS Provider No. 00120C
-- 
http://www.neurofractal.org/felix/



Re: [R] R book for economists

2009-08-01 Thread Ronggui Huang
How about Kleiber, C. & Zeileis, A., Applied Econometrics with R, Springer, 2008?

Ronggui

2009/8/1 Thiemo Fetzer t...@devmag.net:
 Dear Group,

 I am an economics student starting with PhD work in London. As preparation I
 would like to get to know R a little bit better. For Stata there are tons of
 books, however, can you recommend a book for R?

 I have some substantiated econometrics knowledge, so it should be more a
 how-to book.

 Best regards
 Thiemo

 ---
 Thiemo Fetzer, Economist
 http://freigeist.devmag.net
 http://www.devmag.net





-- 
HUANG Ronggui, Wincent
PhD Candidate
Dept of Public and Social Administration
City University of Hong Kong
Home page: http://asrr.r-forge.r-project.org/rghuang.html



Re: [R] truncating values into separate categories

2009-08-01 Thread David Winsemius


On Jul 31, 2009, at 2:55 PM, PDXRugger wrote:



I must apologize, as I wasn't clear about what I wanted to occur.  I don't
want to count the occurrences but rather recode them.  I am trying to
replace all of the values with the new coded values in Person_CAT.  So

NP <- c(1,  1,  2,  1, 1,  2,  2,  1,  4,  1,  0,  5,
 3,  3,  1,  5,  3, 5, 1, 6, 1, 2, 2, 2,
 4, 4, 1, 2, 1, 3, 3, 1,  2,  2,  1,  2, 1, 2,
 2, 3, 1, 1, 4, 4, 1, 1, 1, 2, 2, 2)




and Person_CAT: 1, 1, 2, 1, 1, 2, 2, 1, 4, 1, NA, 4, and so on.  This
task would easily be done in SPSS but I am trying to automate it using R.
I hope this is more clear.


Perhaps:

?cut  # with special attention to the "right" parameter, which is set to TRUE by default.


 per_Cat <- cut(NP, breaks= c(1:4, Inf), right= FALSE)
 per_Cat
 [1] [1,2)   [1,2)   [2,3)   [1,2)   [1,2)   [2,3)   [2,3)   [1,2)   [4,Inf) [1,2)   <NA>    [4,Inf)
[13] [3,4)   [3,4)   [1,2)   [4,Inf) [3,4)   [4,Inf) [1,2)   [4,Inf) [1,2)   [2,3)   [2,3)   [2,3)
[25] [4,Inf) [4,Inf) [1,2)   [2,3)   [1,2)   [3,4)   [3,4)   [1,2)   [2,3)   [2,3)   [1,2)   [2,3)
[37] [1,2)   [2,3)   [2,3)   [3,4)   [1,2)   [1,2)   [4,Inf) [4,Inf) [1,2)   [1,2)   [1,2)   [2,3)
[49] [2,3)   [2,3)
Levels: [1,2) [2,3) [3,4) [4,Inf)
 Per <- c( 1, 2, 3,4)
 levels(per_Cat) <- Per
 per_Cat
 [1] 1    1    2    1    1    2    2    1    4    1    <NA> 4    3    3    1    4    3    4    1    4
[21] 1    2    2    2    4    4    1    2    1    3    3    1    2    2    1    2    1    2    2    3
[41] 1    1    4    4    1    1    1    2    2    2
Levels: 1 2 3 4
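
The two steps above (cut, then relabel) could probably be collapsed by passing labels directly to cut(). A minimal sketch, assuming the same NP vector as in the thread; the names Person_CAT, recoded and capped are only illustrative. It also hints at why the original breaks=c(0,1,2,3,4:max(NP)) attempt failed: 4:max(NP) expands to 4, 5, 6, which gives seven breaks (six intervals) for five labels.

```r
NP <- c(1, 1, 2, 1, 1, 2, 2, 1, 4, 1, 0, 5,
        3, 3, 1, 5, 3, 5, 1, 6, 1, 2, 2, 2,
        4, 4, 1, 2, 1, 3, 3, 1, 2, 2, 1, 2, 1, 2,
        2, 3, 1, 1, 4, 4, 1, 1, 1, 2, 2, 2)

# 4:max(NP) is the sequence 4, 5, 6, so the original call had 7 breaks
n_breaks <- length(c(0, 1, 2, 3, 4:max(NP)))
print(n_breaks)

# One-step recode: [1,2) -> 1, [2,3) -> 2, [3,4) -> 3, [4,Inf) -> 4.
# Values below 1 (here the single 0) fall outside every interval and become NA.
Person_CAT <- cut(NP, breaks = c(1:4, Inf), right = FALSE, labels = 1:4)

# Back to plain integers if a factor is not wanted
recoded <- as.integer(as.character(Person_CAT))

# Equivalent without cut(); valid only because NP holds whole numbers
capped <- ifelse(NP < 1, NA, pmin(NP, 4))

print(head(recoded, 12))
print(table(Person_CAT, useNA = "ifany"))
```

Whether to keep the result as a factor or as integers depends on what the recoded variable is used for later.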





Bill.Venables wrote:


Here is a suggestion:


Per <- c(NA, 1, 2, 3,4)
NP <- c(1,  1,  2,  1, 1,  2,  2,  1,  4,  1,  0,  5,
 3,  3,  1,  5,  3, 5, 1, 6, 1, 2, 2, 2,
 4, 4, 1, 2, 1, 3, 3, 1,  2,  2,  1,  2, 1, 2,
 2, 3, 1, 1, 4, 4, 1, 1, 1, 2, 2, 2)

Person_CAT <- cut(NP, breaks = c(0:4, Inf)-0.5, labels = Per)
table(Person_CAT)

Person_CAT
<NA>    1    2    3    4
   1   19   15    6    9




You should be aware, though, that items corresponding to the level NA
will NOT be treated as missing.


Bill Venables
http://www.cmis.csiro.au/bill.venables/


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org 
]

On Behalf Of PDXRugger
Sent: Friday, 31 July 2009 9:54 AM
To: r-help@r-project.org
Subject: [R] truncating values into separate categories


Hi all,
 Simple question which I thought I had the answer to, but it isn't so simple
for some reason.  I am sure someone can easily help.  I would like to
categorize the values in NP into 1 of the five values in Per, with the last
category (4) representing values >=4 (hence 4:max(NP)).  The problem is that
R is reading max(NP) as multiple values instead of a range, so the lengths of
the labels and the breaks are not matching.  Suggestions?

Per <- c(NA, 1, 2, 3,4)

NP=c(1 ,1 ,2 ,1, 1 ,2 ,2 ,1 ,4 ,1 ,0 ,5 ,3 ,3 ,1 ,5 ,3, 5, 1, 6, 1, 2, 2, 2,
4, 4, 1, 2, 1, 3, 3, 1 ,2 ,2 ,1 ,2, 1, 2,
2, 3, 1, 1, 4, 4, 1, 1, 1, 2, 2, 2)

Person_CAT <- cut(NP, breaks=c(0,1,2,3,4:max(NP)), labels=Per)

--


David Winsemius, MD
Heritage Laboratories
West Hartford, CT



Re: [R] R book for economists

2009-08-01 Thread Achim Zeileis

On Sat, 1 Aug 2009, Thiemo Fetzer wrote:


Dear Group,

I am an economics student starting with PhD work in London. As preparation I
would like to get to know R a little bit better. For Stata there are tons of
books, however, can you recommend a book for R?


Of course, I have to recommend our book

  Kleiber & Zeileis, Applied Econometrics with R, Springer.
  http://www.springer.com/978-0-387-77316-2
  http://CRAN.R-project.org/package=AER

You can grab the preface and intro chapters in the Sample pages on 
Springer's page to get an impression.


There is also Rick Vinod's book

  Vinod, Hands-On Intermediate Econometrics Using R, World Scientific.
  http://www.worldscibooks.com/economics/6895.html

And somewhat more specialized is Bernhard Pfaff's

  Pfaff, Analysis of Integrated and Cointegrated Time Series with R,
  Springer.
  http://www.springer.com/978-0-387-75966-1

You might find further useful information on the econometrics task view:
  http://CRAN.R-project.org/view=Econometrics

And finally there was also a JSS special volume on Econometrics in R 
last year:

  http://www.jstatsoft.org/v27/

Best,
Z



Re: [R] SVG output on Windows OS

2009-08-01 Thread David Winsemius


On Jul 31, 2009, at 6:41 PM, Michael Roessler wrote:


How may one save a graphic as svg on Windows? The svg() command is
recognized and functions well on Linux, etc., but not on Windows, it seems.
I'm trying to use Hadley Wickham's ggplot2 and I would like to be able to
save created charts as svg for later input into Illustrator. I am able to
accomplish this workflow under Linux, but I don't know how to get R to
recognize the svg() command under Windows. I have loaded RsvgDevice, Cairo,
and cairoDevice in my attempts. The problem seems to me to be directly
related to enabling R to produce svg output on Windows, rather than related
to ggplot2.



What does capabilities() return?
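
For reference, a sketch of what such a check might look like. It assumes an R build in which svg() is provided by grDevices and depends on cairo support, which may not hold for the Windows build discussed here; the output file name is a placeholder.

```r
# TRUE if this R build reports cairo support, which svg() relies on
has_cairo <- unname(capabilities("cairo"))
print(has_cairo)

if (has_cairo) {
  out <- file.path(tempdir(), "example.svg")  # placeholder output path
  svg(out, width = 7, height = 5)
  plot(1:10, main = "SVG test")
  dev.off()
  print(file.exists(out))
}
```

If the check returns FALSE, a package-supplied device would have to be tried instead.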
--

David Winsemius, MD
Heritage Laboratories
West Hartford, CT



Re: [R] Problem with RGtk2 Rattle

2009-08-01 Thread Graham Williams
2009/8/1 Felix Andrews fe...@nfrac.org

 Um, it sounds as if you are trying to install glade-3.4.3-win32.zip
 into R as a package... but it is not an R package!!

 GTK+/Glade is a system library to be installed into Windows.
 You should download the .exe (not the .zip)
 http://downloads.sourceforge.net/gladewin32/gtk-dev-2.12.9-win32-2.exe
 and run it to install it.

 By the way, problems with rattle are best sent to the rattle-users mailing
 list:
 http://groups.google.com/group/rattle-users

 -Felix


Also Wayne, I hope you are following the instructions at

http://datamining.togaware.com/survivor/Install_MS_Windows.html

If there is any ambiguity there please let me know so I can make it clearer.
I know of many who have installed Rattle following these, so they should
work.

(And thanks for the help Felix.)

Regards,
Graham





 --
 Felix Andrews / 安福立
 Postdoctoral Fellow
 Integrated Catchment Assessment and Management (iCAM) Centre
 Fenner School of Environment and Society [Bldg 48a]
 The Australian National University
 Canberra ACT 0200 Australia
 M: +61 410 400 963
 T: + 61 2 6125 1670
 E: felix.andr...@anu.edu.au
 CRICOS Provider No. 00120C
 --
 http://www.neurofractal.org/felix/



[[alternative HTML version deleted]]



[R] Add columns in a dataframe and fill them from another table according to a criteria

2009-08-01 Thread Meenu Sahi
Dear R users

I am new to R.
What I want to do is explained below;-
I have table called States.Prob which is given below:-
This table gives the probabilities of the changes in the swap curve
depending on the state of the swap curve. I want to put these probabilities
in my dataframe mydata(given after the prob table).
         Prob of States
Changes  State1  State2  State3  State4
a        Pa1     Pa2     Pa3     Pa4
b        Pb1     Pb2     Pb3     Pb4
c        Pc1     Pc2     Pc3     Pc4
d        Pd1     Pd2     Pd3     Pd4

and I have a dataframe(with 93 rows) called mydata part of which(6 rows) is
given below where I want to fill in the last four columns with probabilities
taken from States.Prob according to the change and state in mydata4:-
   Change  State   PState1  PState2  PState3  PState4
1  b       State1  Pb1
2  a       State4                             Pa4
3  b       State2           Pb2
4  c       State3                    Pc3
5  d       State1  Pd1
6  a       State3                    Pa3

What I want to do is highlighted in Red.
How can I do this easily?

Many thanks for your time.

kind regards
Meenu
P.S. Thanks for your reply John. I've tried to put only the relevant columns
of the dataframe. Hope its more clear now.

[[alternative HTML version deleted]]



[R] xyplot: superpose 2 time series with different time intervals

2009-08-01 Thread Gary Lewis
I could use some advice regarding xyplot.

I've got 2 time series. Both cover approximately the same period of
time (ie, 1940 to 2009). But one series has annual data and the other
has monthly data. One refers to university enrollment; the other to
unemployment rates. Both are currently in the same data frame.

I'd like to use the monthly times series as a light grayscale
background for a plot of the annual time series, showing both series
as type "l" (line). Naturally with all the NA's in the annual series,
that plot disappears because points are not connected across missing
values.

I suppose I could make both series annual, but a lot of interesting
detail would get lost this way. Or I guess I could interpolate values
in the annual series with monthly approximations, but this means 11
out of every 12 values is an approximation. Or I suppose I could plot
each series separately and then print them with position information,
which I'm reluctant to do because panel.superpose so nicely handles
the alignment of the 2 panels.

What I'd really like to do is plot each independently but still
superposed. Effectively this seems to mean monthly data intervals but
line connections across the NA's in the series with annual intervals.

Any suggestions would be appreciated. Thanks.

Gary Lewis



Re: [R] xyplot: superpose 2 time series with different time intervals

2009-08-01 Thread Gabor Grothendieck
Try this using the same ts.sim and ts.sim2 from my previous post.
https://stat.ethz.ch/pipermail/r-help/2009-August/206697.html

library(zoo)
library(lattice)
plot(na.approx(cbind(as.zoo(ts.sim), as.zoo(ts.sim2))), screen = 1,
col = c("black", grey(0.5)))


On Sat, Aug 1, 2009 at 10:26 AM, Gary Lewisgary.m.le...@gmail.com wrote:
 I could use some advice regarding xyplot.

 I've got 2 time series. Both cover approximately the same period of
 time (ie, 1940 to 2009). But one series has annual data and the other
 has monthly data. One refers to university enrollment; the other to
 unemployment rates. Both are currently in the same data frame.

 I'd like to use the monthly times series as a light grayscale
 background for a plot of the annual time series, showing both series
 as type l (line). Naturally with all the NA's in the annual series,
 that plot disappears because points are not connected across missing
 values.

 I suppose I could make both series annual, but a lot of interesting
 detail would get lost this way. Or I guess I could interpolate values
 in the annual series with monthly approximations, but this means 11
 out of every 12 values is an approximation. Or I suppose I could plot
 each series separately and then print them with position information,
 which I'm reluctant to do because panel.superpose so nicely handles
 the alignment of the 2 panels.

 What I'd really like to do is plot each independently but still
 superposed. Effectively this seems to mean monthly data intervals but
 line connections across the NA's in the series with annual intervals.

 Any suggestions would be appreciated. Thanks.

 Gary Lewis





Re: [R] Determine the dimension-names of an element in an array in R

2009-08-01 Thread Sauvik De
Hi Christian:

Many thanks for the code.

But I am afraid that your code still has a problem in terms of providing
the correct correlation. For example, if you look at the correlation between
DataArray_1["A2","B1","D1",] and DataArray_2["A2","C1","D1",] after running
your code, you will notice that this is actually the correlation between
DataArray_1["A2","B1","D1",] and DataArray_2["A1","C1","D1",] and so on.

The code gives the correct result only in the case where elements corresponding
to A1 & D1 are involved in DataArray_1 & DataArray_2.

The problem is in

Correl<-Correl[1:length(c),,,]

We need to select elements of Correl more carefully to reach a proper
solution.

Thanks,
Sauvik


On Wed, Jul 29, 2009 at 11:41 PM, Poersching poerschin...@web.de wrote:

 Hey,
 i have forgotten to generalize the code so

 Correl<-Correl[1:4,,,]

 must be

 Correl<-Correl[1:length(c),,,]

 it's because the comparison levels. I think you don't want the
 correlation betweeen A1, B1, D1 and A2, C1, D1 ,
 but between A1, B1, D1 and A1, C1, D1 or between A1, B1, D1 and A1, C2, D1.
 So the 1:length(c) writes only the correlation between the B and C out
 of the whole correlation array.
 That's also why the sequence in the second apply function is changed.

 Regards Christian.

 Poersching schrieb:
  Hey,
  I think I have a solution for your problem:
 
  Correl<-apply(DataArray_1,1:3, function(d1)
    apply(DataArray_2,c(2,1,3), function(d) cor(d1,d))
  )
  Correl<-Correl[1:4,,,]
  dimnames(Correl)[[1]]<-c
  Correl<-aperm(Correl,c(2,3,1,4))
 
  This one should work. :-)
 
  Best Regards,
  Christian
 
  Sauvik De schrieb:
 
  Hi there,
 
  Thanks again for your reply. I know for-loop is always a solution to
  my problem and I had already coded using for-loop. But the number of
  levels for each dimension is large enough in actual problem and hence
  it was time-consuming.
  So, I was just wondering if there are any other alternative way-outs
  to solving my problem. That's why I tried with apply functions
  (sapply)assuming that this might work out faster even fractionally as
  compared to for-loop.
 
  Cheers,
  Sauvik
 
  On Mon, Jul 27, 2009 at 12:28 AM, Poersching poerschin...@web.de
  mailto:poerschin...@web.de wrote:
 
  Sauvik De schrieb:
 
  Hi:
  Lots of thanks for your valuable time!
 
  But I am not sure how you would like to use the function in this
  situation.
 
  As I had mentioned that the first element of my output array
  should be like:
 
 
 cor(DataArray_1[dimnames(Correl)[[1]][1],dimnames(Correl)[[2]][1],dimnames(Correl)[[4]][1],],DataArray_2[dimnames(Correl)[[1]][1],dimnames(Correl)[[3]][1],dimnames(Correl)[[4]][1],],use="pairwise.complete.obs")
 
  in my below code.
 
  and
 
  the output array of correlation I wish to get using sapply as
  follows:
 
  Correl = sapply(Correl,function(d)
  cor(DataArray_1[...],DataArray_2[...],
  use="pairwise.complete.obs"))
 
  So it would be of great help if you could kindly specify how to
  utilise your function findIndex in ...
 
  Apologies for all this!
 
  Thanks  Regards,
  Sauvik
 
 
  Hey,
  sorry, I haven't understood your problem last time, but now this
  solution should solve your problem, so I hope. :-)
  It's only a for to loop, but an apply function may work too. I
  will think about this, but for now...  ;-)
 
  la<-length(a)
  lb<-length(b)
  lc<-length(c)
  ld<-length(d)
  for (ia in 1:la) {
    for (ib in 1:lb) {
      for (ic in 1:lc) {
        for (id in 1:ld) {
          Correl[ia,ib,ic,id]<-cor(
            DataArray_1[dimnames(Correl)[[1]][ia],
                        dimnames(Correl)[[2]][ib],
                        dimnames(Correl)[[4]][id],]
            ,
            DataArray_2[dimnames(Correl)[[1]][ia],
                        dimnames(Correl)[[3]][ic],
                        dimnames(Correl)[[4]][id],]
            ,
            use="pairwise.complete.obs")
        }
      }
    }
  }
  ## with function findIndex you can find the dimensions with
  ## i.e. cor values greater 0.5 or smaller -0.5, like:
  findIndex(Correl,Correl[Correl>0.5])
  findIndex(Correl,Correl[Correl<(-0.5)])
 
  I have changed the code of the function findIndex in line which
  contents: el[j]-which(is.element(data,element[j]))
 
  Regards,
  Christian
 
 
  On Sun, Jul 26, 2009 at 3:54 PM, Poerschingpoerschin...@web.de
  mailto:poerschin...@web.de wrote:
   Sauvik De schrieb:
  
   Hi Gabor:
   Many thanks for your prompt reply!
   The code is fine. But I need it in more general form as I had
  mentioned that
   I need to input any 0 to find its dimension-names.
  
   Actually, I was using sapply to calculate correlation and
  this idea was
   required in the middle of correlation calculation.
   I am providing the way I tried my calculation.
  
    a= c("A1","A2","A3","A4","A5")
    b= c("B1","B2","B3")
   

Re: [R] xyplot: superpose 2 time series with different time intervals

2009-08-01 Thread Gabor Grothendieck
In the last statement you can replace plot with xyplot (although both work).

On Sat, Aug 1, 2009 at 10:37 AM, Gabor
Grothendieckggrothendi...@gmail.com wrote:
 Try this using the same ts.sim and ts.sim2 from my previous post.
 https://stat.ethz.ch/pipermail/r-help/2009-August/206697.html

 library(zoo)
 library(lattice)
 plot(na.approx(cbind(as.zoo(ts.sim), as.zoo(ts.sim2))), screen = 1,
        col = c("black", grey(0.5)))


 On Sat, Aug 1, 2009 at 10:26 AM, Gary Lewisgary.m.le...@gmail.com wrote:
 I could use some advice regarding xyplot.

 I've got 2 time series. Both cover approximately the same period of
 time (ie, 1940 to 2009). But one series has annual data and the other
 has monthly data. One refers to university enrollment; the other to
 unemployment rates. Both are currently in the same data frame.

 I'd like to use the monthly times series as a light grayscale
 background for a plot of the annual time series, showing both series
 as type l (line). Naturally with all the NA's in the annual series,
 that plot disappears because points are not connected across missing
 values.

 I suppose I could make both series annual, but a lot of interesting
 detail would get lost this way. Or I guess I could interpolate values
 in the annual series with monthly approximations, but this means 11
 out of every 12 values is an approximation. Or I suppose I could plot
 each series separately and then print them with position information,
 which I'm reluctant to do because panel.superpose so nicely handles
 the alignment of the 2 panels.

 What I'd really like to do is plot each independently but still
 superposed. Effectively this seems to mean monthly data intervals but
 line connections across the NA's in the series with annual intervals.

 Any suggestions would be appreciated. Thanks.

 Gary Lewis






[R] Variable alias

2009-08-01 Thread Daniel Haase

Hi Everyone,

is there the possibility in R to assign a variable to be an alias of  
another one?

Example:

x <- 17
# assign y to be an alias of x
y # returns 17
x <- 4
y # returns 4

Daniel



Re: [R] Variable alias

2009-08-01 Thread Henrique Dallazuanna
You can assign the x variable like this:

y <- x <- 17
y <- x <- 4



On Sat, Aug 1, 2009 at 11:55 AM, Daniel Haase d...@haase-zm.de wrote:

 Hi Everyone,

 is there the possibility in R to assign a variable to be an alias of
 another one?
 Example:

 x <- 17
 # assign y to be an alias of x
 y # returns 17
 x <- 4
 y # returns 4

 Daniel





-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O

[[alternative HTML version deleted]]



[R] zoo plot warning messages - I don't know what they mean or how to inspect the data to figure this out

2009-08-01 Thread stephen sefick
I have a time series from 1933-2005 of precipitation at Fayetteville
NC.  I get the following error messages when I plot the zoo series.
Any help would be appreciated.  If you need the data I can dput it or
send the csv.  I didn't include it here because I didn't want to clog
up anybody's email account.  I know that this is not reproducible, and
I will send along the file if needed.

Warning messages:
1: In xy.coords(x, y, xlabel, ylabel, log) : NAs introduced by coercion
2: In lines.times(x.index, y[, i], col = col[[i]], pch = pch[[i]],  :
  NAs introduced by coercion
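
One common source of "NAs introduced by coercion" is a column (or index) that was read as character or factor because a few entries are not numeric. That is only a guess about this particular data set; the sketch below, using a made-up vector precip_raw, shows one way such entries might be located.

```r
# Hypothetical raw column as it might come out of read.csv();
# "T" and "" stand in for non-numeric entries
precip_raw <- c("0.00", "0.52", "T", "1.10", "", "0.03")

# Coerce quietly, then find entries that became NA without being NA before
precip_num <- suppressWarnings(as.numeric(precip_raw))
bad <- which(is.na(precip_num) & !is.na(precip_raw))

print(bad)               # positions of the offending entries
print(precip_raw[bad])   # the offending values themselves
```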

-- 
Stephen Sefick

Let's not spend our time and resources thinking about things that are
so little or so large that all they really do for us is puff us up and
make us feel like gods.  We are mammals, and have not exhausted the
annoying little problems of being mammals.

-K. Mullis



Re: [R] Add columns in a dataframe and fill them from another table according to a criteria

2009-08-01 Thread David Winsemius


On Aug 1, 2009, at 9:52 AM, Meenu Sahi wrote:


Deare R users

I am new to R.
What I want to do is explained below;-
I have table called States.Prob which is given below:-
This table gives the probabilities of the changes in the swap curve
depending on the state of the swap curve. I want to put these  
probabilities

in my dataframe mydata(given after the prob table).
Prob of States
Changes  State1  State2 State3 State4
a Pa1  Pa2 Pa3 Pa4
b Pb1  Pb2 Pb3 Pb4
c Pc1  Pc2 Pc3 Pc4
d Pd1  Pd2 Pd3 Pd4

and I have a dataframe(with 93 rows) called mydata part of which(6  
rows) is
given below where I want to fill in the last four columns with  
probabilities

taken from States.Prob according to the change and state in mydata4:-
Change  State  PState1  PState2  PState3  PState4
1 b   State1  Pb1
2 a   State4   Pa4
3 b   State2Pb2
4 c   State3 Pc3
5 d   State1  Pd1
6 a   State3 Pa3

What I want to do is highlighted in Red.
How can I do this easily?

You may have seen it in red, but we don't, and I, at least, cannot  
figure out what you intend.   (Per the Posting Guide, which you have  
obviously not yet read, you need to compose your question in plain old  
monochromatic text and change your mail client so it posts in plain  
text.)


If looking at the help pages for stack() and reshape() does not offer  
useful information and worked examples that meet your needs then:


An approach that would make you more popular in these parts would be
to make a simpler example, composed in syntactically correct R, that
is complete in itself and can be pasted into an R session. Indicate what
you intend as output from this simpler input.


Perhaps

 pstate <- read.table(textConnection("Changes  State1  State2 State3 State4
a Pa1  Pa2 Pa3 Pa4
b Pb1  Pb2 Pb3 Pb4
c Pc1  Pc2 Pc3 Pc4
d Pd1  Pd2 Pd3 Pd4"),  header=TRUE, as.is=TRUE)

?stack

 data.frame(Change=pstate[,1],
  prstate =stack(pstate[2:5])$values,
  state=stack(pstate[2:5])$ind )

# first column is only 4 elements long, but will get recycled
# second retrieves the probabilities and may need to have as.numeric( )
#   wrapped around it if they really are numeric.
# third returns what started out as column names.

   Change prstate  state
1   a Pa1 State1
2   b Pb1 State1
3   c Pc1 State1
4   d Pd1 State1
5   a Pa2 State2
6   b Pb2 State2
7   c Pc2 State2
8   d Pd2 State2
9   a Pa3 State3
10  b Pb3 State3
11  c Pc3 State3
12  d Pd3 State3
13  a Pa4 State4
14  b Pb4 State4
15  c Pc4 State4
16  d Pd4 State4
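
If the goal is instead to pick, for each row of mydata, the entry of States.Prob matching that row's Change and State, matrix indexing by name may be enough. A minimal sketch, under the assumption that the table can be held as a matrix with changes as row names and states as column names; the entries are the placeholder labels from the question, and the names picked, changes and states are illustrative only.

```r
# Lookup table as in the post: rows are changes, columns are states.
# The entries are the placeholder labels "Pa1" ... "Pd4" from the question.
changes <- c("a", "b", "c", "d")
states  <- paste0("State", 1:4)
States.Prob <- outer(changes, 1:4, function(ch, s) paste0("P", ch, s))
dimnames(States.Prob) <- list(changes, states)

# The six example rows of mydata
mydata <- data.frame(
  Change = c("b", "a", "b", "c", "d", "a"),
  State  = c("State1", "State4", "State2", "State3", "State1", "State3"),
  stringsAsFactors = FALSE
)

# Index the matrix with a two-column (row name, column name) character matrix
picked <- States.Prob[cbind(mydata$Change, mydata$State)]

# Spread into PState1..PState4, filling only the column matching each row's state
for (s in states) {
  mydata[[paste0("P", s)]] <- ifelse(mydata$State == s, picked, NA)
}
print(mydata)
```

With real probabilities the matrix would be numeric and the same indexing applies.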


Many thanks for your time.

kind regards
Meenu
P.S. Thanks for your reply John. I've tried to put only the relevant
columns of the dataframe. Hope it's clearer now.


\\//

[[alternative HTML version deleted]]


   ^^Note: ^^

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Variable alias

2009-08-01 Thread Gabor Grothendieck
If you only have to use y once then you can use delayedAssign. This
will assign a promise and the promise will not be evaluated until it is
used:

 x
Error: object 'x' not found
 y
Error: object 'y' not found
 x - y
 x <- 1
 delayedAssign("y", x)
 x <- 2
 y
[1] 2

If that's not good enough you can use makeActiveBinding:

 x <- 1
 makeActiveBinding("y", function() x, .GlobalEnv)
 y
[1] 1
 x <- 2
 y
[1] 2
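
To make the difference between the two approaches concrete, here is a small sketch (an illustration only; the environment env and the variable names are assumptions, chosen to keep the example out of the global workspace). A promise is evaluated once and then frozen, whereas an active binding re-runs its function on every access.

```r
env <- new.env()

# delayedAssign: y is a promise for x, forced on first use only.
assign("x", 1, envir = env)
delayedAssign("y", x, eval.env = env, assign.env = env)
assign("x", 2, envir = env)
first_read <- get("y", envir = env)    # forces the promise now
assign("x", 3, envir = env)
second_read <- get("y", envir = env)   # promise already forced, value unchanged

# makeActiveBinding: z calls the function each time it is read.
makeActiveBinding("z", function() get("x", envir = env), env)
z_before <- get("z", envir = env)
assign("x", 4, envir = env)
z_after <- get("z", envir = env)

c(first_read, second_read, z_before, z_after)
```

So delayedAssign only behaves like an alias up to the first read, which is why it suits the "use y once" case.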



On Sat, Aug 1, 2009 at 10:55 AM, Daniel Haase <d...@haase-zm.de> wrote:
 Hi Everyone,

 is there the possibility in R to assign a variable to be an alias of another
 one?
 Example:

 x <- 17
 # assign y to be an alias of x
 y # returns 17
 x <- 4
 y # returns 4

 Daniel



Re: [R] Displaying function arguments using a Windows R console

2009-08-01 Thread David Winsemius


On Jul 31, 2009, at 2:35 PM, Laura S. wrote:

I am relatively new to R, and would appreciate any suggestions you  
may have.


I noticed on a Mac the functions' arguments are listed at the bottom  
of the R console.


Is it possible to add such a feature to a Windows R console? I have  
Windows XP if that helps.


I know function arguments can be found using args(...), but I was  
wanting to have something more automatic, like what I saw on the Mac  
computer.


I went into GUI preferences, but was not sure what to do. I noticed
for graphics, the playwith package can be used... I was wondering if
there was something similar to this for the arguments of functions
in R.




I believe the feature to which you are referring is the one called
"function hints". This is a fairly recent answer to this question from
a source that is generally authoritative:


http://finzi.psych.upenn.edu/Rhelp08/2009-July/203189.html

The follow-up exchange also appears worth reading for Windows users.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT



Re: [R] Determine the dimension-names of an element in an array in R

2009-08-01 Thread Poersching
Hey,
oh yes, but now I really have the ultimate solution... ;-)
Here it comes:

a = c("A1","A2","A3","A4","A5")
b = c("B1","B2","B3")
c = c("C1","C2","C3","C4")
d = c("D1","D2")
e = c("E1","E2","E3","E4","E5","E6","E7","E8")

DataArray_1 = array(c(rnorm(240)), dim=c(length(a),length(b),
  length(d),length(e)), dimnames=list(a,b,d,e))
DataArray_2 = array(c(rnorm(320)), dim=c(length(a),length(c),
  length(d),length(e)), dimnames=list(a,c,d,e))

# For every level of a and d, correlate each B row with each C row over e
z <- apply(as.matrix(a), c(1,2), function(f1)
  apply(as.matrix(d), c(1,2), function(f2)
    apply(DataArray_1[dimnames(DataArray_1)[[1]]==f1,,dimnames(DataArray_1)[[3]]==f2,], 1,
      function(d1)
        apply(DataArray_2[dimnames(DataArray_2)[[1]]==f1,,dimnames(DataArray_2)[[3]]==f2,], 1,
          function(d2)
            cor(d1,d2))
)))
# z runs over c fastest, then b, d and a; reshape and reorder to (a, b, c, d)
Correl = array(z, dim=c(length(c),length(b),
  length(d),length(a)), dimnames=list(c,b,d,a))
Correl <- aperm(Correl, c(4,2,1,3))
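
As an alternative sketch, not taken from the thread: cor() applied to two matrices returns the correlations between their columns, so one call per (a, d) pair fills a whole B x C slice. The name cc (used instead of c) and the seed are assumptions for illustration. The last lines spot-check one element against a direct cor() call.

```r
set.seed(1)
a  <- paste0("A", 1:5)
b  <- paste0("B", 1:3)
cc <- paste0("C", 1:4)   # 'cc' rather than 'c' to avoid masking base::c
d  <- paste0("D", 1:2)
e  <- paste0("E", 1:8)

DataArray_1 <- array(rnorm(240), dim = c(5, 3, 2, 8),
                     dimnames = list(a, b, d, e))
DataArray_2 <- array(rnorm(320), dim = c(5, 4, 2, 8),
                     dimnames = list(a, cc, d, e))

Correl <- array(NA_real_, dim = c(5, 3, 4, 2),
                dimnames = list(a, b, cc, d))

for (ai in a) {
  for (di in d) {
    # t(...) puts the E observations in rows, so cor() yields a B x C matrix
    Correl[ai, , , di] <- cor(t(DataArray_1[ai, , di, ]),
                              t(DataArray_2[ai, , di, ]))
  }
}

# Spot check against the direct pairwise computation
check <- cor(DataArray_1["A2", "B1", "D1", ], DataArray_2["A2", "C1", "D1", ])
isTRUE(all.equal(Correl["A2", "B1", "C1", "D1"], check))
```

This keeps only the loop over a and d, leaving the B-by-C work to a single vectorised cor() call.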

So, best Regards,
Christian

Sauvik De wrote:
 Hi Christian:

 Many thanks for the code.

 But I am afraid that your code still has a problem in terms of
 providing correct correlation. For example, if you look at the
 correlation between DataArray_1["A2","B1","D1",] and
 DataArray_2["A2","C1","D1",] after running your code, you will notice
 that this is actually the correlation between
 DataArray_1["A2","B1","D1",] and DataArray_2["A1","C1","D1",] and so on.

 The code gives the correct result only in case where elements
 corresponding to A1 & D1 are involved in DataArray_1 & DataArray_2.

 The problem is in

 Correl<-Correl[1:length(c),,,]

 We need to select elements of Correl more carefully to reach a proper
 solution.

 Thanks,
 Sauvik


 On Wed, Jul 29, 2009 at 11:41 PM, Poersching <poerschin...@web.de> wrote:

 Hey,
 I forgot to generalize the code, so

 Correl<-Correl[1:4,,,]

 must be

 Correl<-Correl[1:length(c),,,]

 It's because of the comparison levels. I think you don't want the
 correlation between A1, B1, D1 and A2, C1, D1,
 but between A1, B1, D1 and A1, C1, D1 or between A1, B1, D1 and
 A1, C2, D1.
 So the 1:length(c) writes only the correlation between the B and
 C out of the whole correlation array.
 That's also why the sequence in the second apply function is changed.

 Regards Christian.

 Poersching wrote:
  Hey,
  I think I have a solution for your problem:
 
  Correl<-apply(DataArray_1,1:3, function(d1)
    apply(DataArray_2,c(2,1,3), function(d) cor(d1,d))
  )
  Correl<-Correl[1:4,,,]
  dimnames(Correl)[[1]]<-c
  Correl<-aperm(Correl,c(2,3,1,4))
 
  This one should work. :-)
 
  Best Regards,
  Christian
 
  Sauvik De wrote:
 
  Hi there,
 
  Thanks again for your reply. I know a for-loop is always a solution to
  my problem and I had already coded it using a for-loop. But the number of
  levels for each dimension is large in the actual problem and hence
  it was time-consuming.
  So, I was just wondering if there is any alternative way
  of solving my problem. That's why I tried with apply functions
  (sapply), assuming that this might work out faster, even fractionally,
  compared to a for-loop.
 
  Cheers,
  Sauvik
 
  On Mon, Jul 27, 2009 at 12:28 AM, Poersching <poerschin...@web.de> wrote:
 
  Sauvik De wrote:
 
  Hi:
  Lots of thanks for your valuable time!
 
  But I am not sure how you would like to use the function in this
  situation.
 
  As I had mentioned, the first element of my output array
  should be like:
 
  cor(DataArray_1[dimnames(Correl)[[1]][1],dimnames(Correl)[[2]][1],dimnames(Correl)[[4]][1],],
      DataArray_2[dimnames(Correl)[[1]][1],dimnames(Correl)[[3]][1],dimnames(Correl)[[4]][1],],
      use="pairwise.complete.obs")
 
  in my code below,
 
  and
 
  the output array of correlation I wish to get using sapply is as
  follows:
 
  Correl = sapply(Correl,function(d)
  cor(DataArray_1[...],DataArray_2[...],
  use="pairwise.complete.obs"))
 
  So it would be of great help if you could kindly specify how to
  utilise your function findIndex in ...
 
  Apologies for all this!
 
  Thanks & Regards,
  Sauvik
 
 
  Hey,
  sorry, I didn't understand your problem last time, but now this
  solution should solve your problem, so I hope. :-)
  It's only a for loop, but an apply function may work too. I
  will think about this, but for now...  ;-)
 
  la<-length(a)
  lb<-length(b)
  lc<-length(c)
  ld<-length(d)
  for (ia in 1:la) {
for (ib in 1:lb) {
  for (ic in 1:lc) {
for (id in 1:ld) {

Re: [R] write matrix M including names(dimnames(M))

2009-08-01 Thread David Winsemius


On Jul 30, 2009, at 11:50 PM, Steve Jaffe wrote:



I can do this by writing (and reading) the file according to some  
format of
my own devising, but I'm wondering if there is a built-in way to  
write and

then restore a matrix with not only the dimnames (which
write.table/read.table can preserve) but also the names(dimnames)?

Example:

M <- matrix(1:4, 2, 2)
dimnames(M) <- list(xdim=c("a", "b"), ydim=c("u", "v"))
M

   ydim
xdim u v
  a 1 3
  b 2 4


There are two such matched combinations for saving R objects complete  
with attributes:


dput/dget  # will be more readable with a text editor than the next  
option


save/load  # not very readable
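
A minimal round-trip sketch of both pairs, using temporary files (the file names and the scratch environment are assumptions for illustration):

```r
M <- matrix(1:4, 2, 2)
dimnames(M) <- list(xdim = c("a", "b"), ydim = c("u", "v"))

# Text representation: readable in an editor
f_txt <- tempfile(fileext = ".R")
dput(M, file = f_txt)
M_txt <- dget(f_txt)

# Binary representation: load() restores the object under its saved name,
# so load into a scratch environment to avoid overwriting M
f_bin <- tempfile(fileext = ".RData")
save(M, file = f_bin)
scratch <- new.env()
load(f_bin, envir = scratch)
M_bin <- get("M", envir = scratch)

names(dimnames(M_txt))
identical(M, M_txt) && identical(M, M_bin)
```

Both routes keep attributes, including names(dimnames(M)), which a plain write.table/read.table round trip does not.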





David Winsemius, MD
Heritage Laboratories
West Hartford, CT



Re: [R] xyplot: superpose 2 time series with different time intervals

2009-08-01 Thread Deepayan Sarkar
On Sat, Aug 1, 2009 at 7:26 AM, Gary Lewis <gary.m.le...@gmail.com> wrote:
 I could use some advice regarding xyplot.

 I've got 2 time series. Both cover approximately the same period of
 time (ie, 1940 to 2009). But one series has annual data and the other
 has monthly data. One refers to university enrollment; the other to
 unemployment rates. Both are currently in the same data frame.

 I'd like to use the monthly time series as a light grayscale
 background for a plot of the annual time series, showing both series
 as type "l" (line). Naturally with all the NA's in the annual series,
 that plot disappears because points are not connected across missing
 values.

You could define a small wrapper function that discards NA's before
drawing lines:

my.panel.lines <- function(x, y, ...) {
    keep <- !is.na(y)
    panel.lines(x[keep], y[keep], ...)
}

and use it as a custom panel.groups function:

xyplot(<whatever you had before>,
       panel = panel.superpose, panel.groups = my.panel.lines)
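
For a self-contained illustration, here is a sketch with made-up data (the data frame dat, its column names and the colours are assumptions, not the poster's data). Note that type = "l" is passed explicitly, since panel.superpose hands its type argument on to the panel.groups function.

```r
library(lattice)

# Hypothetical data: 5 years of monthly values plus one annual value per year
set.seed(42)
dat <- data.frame(
  time    = seq(2000, by = 1/12, length.out = 60),
  monthly = cumsum(rnorm(60)),
  annual  = NA_real_
)
dat$annual[seq(1, 60, by = 12)] <- c(3, 4, 2, 5, 6)

# Drop missing values before drawing, so the annual points get connected
my.panel.lines <- function(x, y, ...) {
  keep <- !is.na(y)
  panel.lines(x[keep], y[keep], ...)
}

# The extended formula creates one group per series
p <- xyplot(monthly + annual ~ time, data = dat, type = "l",
            panel = panel.superpose, panel.groups = my.panel.lines,
            col = c("grey70", "black"))
# print(p)  # draws the plot on the current device
```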

-Deepayan

 I suppose I could make both series annual, but a lot of interesting
 detail would get lost this way. Or I guess I could interpolate values
 in the annual series with monthly approximations, but this means 11
 out of every 12 values is an approximation. Or I suppose I could plot
 each series separately and then print them with position information,
 which I'm reluctant to do because panel.superpose so nicely handles
 the alignment of the 2 panels.

 What I'd really like to do is plot each independently but still
 superposed. Effectively this seems to mean monthly data intervals but
 line connections across the NA's in the series with annual intervals.

 Any suggestions would be appreciated. Thanks.

 Gary Lewis



Re: [R] Parameters of Logistic Distribution and (3 Parameter) Log Logistic Distribution

2009-08-01 Thread David Winsemius


On Jul 31, 2009, at 3:33 AM, Madhavi Bhave wrote:


Dear R Helpers

Please guide me how one can estimate the parameters of Logistic  
Distribution and 3 Parameter Log-logistic distribution for a given  
data.


data <- c(
2987.43,2990.12,3023.52,2964.79,3019.60,3051.07,3080.16,2944.15,3035.19,3023.46,
2985.05,2970.95,3192.36,3084.39,2926.23,2952.15,3064.15,3003.20,2980.35,2980.45,
3043.12,3115.53,3006.90,2946.03,3039.97,3064.01,3000.56,3049.57,3042.54,3037.63,
2982.03,2889.74,3043.83,2930.95,3020.65,3009.21,3084.16,2954.05,2991.04,3083.10,
3007.26,2949.58,2995.65,3078.36,3031.64,3001.28,3103.32,3015.04,2994.45,2963.71,
2932.90,3021.31,3074.72,2980.15,3002.29,3088.18,2991.39,2942.90,3057.91,3023.25,
3192.67,2966.49,3049.31,2915.38,3045.27,2852.72,2999.25,2978.52,3040.07,2945.50,
3047.47,2915.95,3012.24,2985.80,2971.04,3035.72,3025.40,3014.76,2979.62,3029.20,
2938.38,2966.47,3017.81,3016.43,2989.60,2941.22,3038.30,3033.44,3003.77,2950.02,
3053.19,3011.69,2916.34,2918.10,3049.98,3062.46,2948.55,3072.90,3113.52,2987.61)



require(MASS)
?fitdistr
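
A minimal sketch of the logistic case, using simulated values in place of the posted data (the sample size, seed and true parameters are assumptions for illustration). For the 3-parameter log-logistic, fitdistr() has no built-in name as far as I know; it also accepts a user-written density function together with a named list of start values, which is not shown here.

```r
library(MASS)

# Simulated stand-in for the posted data
set.seed(123)
x <- rlogis(200, location = 3000, scale = 40)

# Maximum-likelihood fit of the two-parameter logistic distribution
fit <- fitdistr(x, "logistic")
est <- fit$estimate   # named vector: location, scale
se  <- fit$sd         # estimated standard errors
est
```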




David Winsemius, MD
Heritage Laboratories
West Hartford, CT



Re: [R] Add columns in a dataframe and fill them from another table according to a criteria

2009-08-01 Thread Meenu Sahi
Dear R users
My apologies for not writing in the correct format due to my ignorance. In
the future I will write more clearly. I hope to contribute to the R
community in the process of picking up the language professionally.
I have now written the R code, which is attached in a notepad file. I've
simplified my problem into an example with a table pstate, which contains
the probabilities of getting certain changes in the four different states,
and a dataframe mydata4, which contains all the changes connected to the
four different states. I would like to add the probabilities into mydata4
after matching for the change and the state.
Everything before the #OUTPUT marker can be copied and pasted into the R
window. The desired output is described after #OUTPUT.
Must I write an if/else or can I do it in an easier way?
Your help is greatly appreciated! Many thanks for your patience.

Regards
Meenu

On Sat, Aug 1, 2009 at 9:43 PM, David Winsemius <dwinsem...@comcast.net> wrote:


 On Aug 1, 2009, at 9:52 AM, Meenu Sahi wrote:

 Dear R users

 I am new to R.
 What I want to do is explained below;-
 I have table called States.Prob which is given below:-
 This table gives the probabilities of the changes in the swap curve
 depending on the state of the swap curve. I want to put these
 probabilities
 in my dataframe mydata(given after the prob table).
Prob of States
 Changes  State1  State2 State3 State4
 a Pa1  Pa2 Pa3 Pa4
 b Pb1  Pb2 Pb3 Pb4
 c Pc1  Pc2 Pc3 Pc4
 d Pd1  Pd2 Pd3 Pd4

 and I have a dataframe (with 93 rows) called mydata, part of which (6 rows)
 is given below, where I want to fill in the last four columns with
 probabilities taken from States.Prob according to the change and state in
 mydata4:-
   Change  State   PState1  PState2  PState3  PState4
 1 b       State1  Pb1
 2 a       State4                             Pa4
 3 b       State2           Pb2
 4 c       State3                    Pc3
 5 d       State1  Pd1
 6 a       State3                    Pa3

 What I want to do is highlighted in Red.
 How can I do this easily?

 You may have seen it in red, but we don't, and I, at least, cannot
 figure out what you intend.   (Per the Posting Guide, which you have
 obviously not yet read, you need to compose your question in plain old
 monochromatic text and change your mail client so it posts in plain text.)

 If looking at the help pages for stack() and reshape() does not offer
 useful information and worked examples that meet your needs then:

 An approach that would make you more popular in these parts would be to
 make a simpler example, composed in syntactically correct R, that is
 complete in itself, and can be pasted into an R session. Indicate what you
 intend as output from this simpler input.

 Perhaps

 pstate <- read.table(textConnection("Changes  State1  State2 State3 State4
 a Pa1  Pa2 Pa3 Pa4
 b Pb1  Pb2 Pb3 Pb4
 c Pc1  Pc2 Pc3 Pc4
 d Pd1  Pd2 Pd3 Pd4"),  header=TRUE, as.is=TRUE)

 ?stack

  data.frame(Change=pstate[,1],
  prstate =stack(pstate[2:5])$values,
  state=stack(pstate[2:5])$ind )

 # first column is only 4 elements long, but will get recycled
 # second retrieves the probabilities and may need to have as.numeric( )
 #   wrapped around it if they really are numeric.
 # third returns what started out as column names.

   Change prstate  state
 1   a Pa1 State1
 2   b Pb1 State1
 3   c Pc1 State1
 4   d Pd1 State1
 5   a Pa2 State2
 6   b Pb2 State2
 7   c Pc2 State2
 8   d Pd2 State2
 9   a Pa3 State3
 10  b Pb3 State3
 11  c Pc3 State3
 12  d Pd3 State3
 13  a Pa4 State4
 14  b Pb4 State4
 15  c Pc4 State4
 16  d Pd4 State4

 Many thanks for your time.

 kind regards
 Meenu
 P.S. Thanks for your reply John. I've tried to put only the relevant
 columns
 of the dataframe. Hope its more clear now.

\\//

[[alternative HTML version deleted]]


   ^^Note: ^^

 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT


pstate <- read.table(textConnection("Changes PState1 PState2 PState3 PState4
a Pa1 Pa2 Pa3 Pa4
b Pb1 Pb2 Pb3 Pb4
c Pc1 Pc2 Pc3 Pc4
d Pd1 Pd2 Pd3 Pd4"), header=TRUE, as.is=TRUE)

Change <- c("b","a","b","c","d","a")
State <- c("State1","State4","State2","State3","State1","State3")

mydata4 <- data.frame(Change, State)


mydata4 <- within(mydata4, {
PState1 <- NA
PState2 <- NA
PState3 <- NA
PState4 <- NA
})
#OUTPUT
# I would like to see my output of mydata4 with NA in the last 4 columns
# replaced by matching probabilities from table pstate in whichever of the
# 4 columns are applicable depending on the State and Change. e.g. Row1
# of mydata4 has 
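
One way to do this without if/else, offered as a sketch and not as the list's answer: index the probability table with a two-column (row, column) matrix, then write each looked-up value into the column named after its state. The helper names target, idx and looked_up are invented for this example, and the "probabilities" are the placeholder labels from the post.

```r
pstate <- read.table(text = "Changes PState1 PState2 PState3 PState4
a Pa1 Pa2 Pa3 Pa4
b Pb1 Pb2 Pb3 Pb4
c Pc1 Pc2 Pc3 Pc4
d Pd1 Pd2 Pd3 Pd4", header = TRUE, as.is = TRUE)

mydata4 <- data.frame(Change = c("b", "a", "b", "c", "d", "a"),
                      State  = c("State1", "State4", "State2",
                                 "State3", "State1", "State3"),
                      stringsAsFactors = FALSE)

# Name of the probability column each row should receive, e.g. "PState1"
target <- paste0("P", mydata4$State)

# (row, column) positions of the wanted entries in pstate
idx <- cbind(match(mydata4$Change, pstate$Changes),
             match(target, names(pstate)))
looked_up <- as.matrix(pstate)[idx]

# Fill the applicable column in each row and leave the others NA
for (nm in names(pstate)[-1]) {
  mydata4[[nm]] <- ifelse(target == nm, looked_up, NA)
}
mydata4
```

With real numeric probabilities, selecting only the probability columns before as.matrix() would keep the result numeric; with idx adjusted accordingly, the rest stays the same.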

Re: [R] Determine the dimension-names of an element in an array in R

2009-08-01 Thread Sauvik De
Hi Christian:

Thanks a lot for your continuous help. This time you got the code right !
That's what I wanted :)
Great job!

Thanks & Regards,
Sauvik

On Sat, Aug 1, 2009 at 10:30 PM, Poersching <poerschin...@web.de> wrote:

 Hey,
 oh yes, but now I really have the ultimate solution... ;-)
 Here it comes:

 a = c("A1","A2","A3","A4","A5")
 b = c("B1","B2","B3")
 c = c("C1","C2","C3","C4")
 d = c("D1","D2")
 e = c("E1","E2","E3","E4","E5","E6","E7","E8")

 DataArray_1 = array(c(rnorm(240)), dim=c(length(a),length(b),
   length(d),length(e)), dimnames=list(a,b,d,e))
 DataArray_2 = array(c(rnorm(320)), dim=c(length(a),length(c),
   length(d),length(e)), dimnames=list(a,c,d,e))

 z <- apply(as.matrix(a), c(1,2), function(f1)
   apply(as.matrix(d), c(1,2), function(f2)
     apply(DataArray_1[dimnames(DataArray_1)[[1]]==f1,,dimnames(DataArray_1)[[3]]==f2,], 1,
       function(d1)
         apply(DataArray_2[dimnames(DataArray_2)[[1]]==f1,,dimnames(DataArray_2)[[3]]==f2,], 1,
           function(d2)
             cor(d1,d2))
 )))
 Correl = array(z, dim=c(length(c),length(b),
   length(d),length(a)), dimnames=list(c,b,d,a))
 Correl <- aperm(Correl, c(4,2,1,3))

 So, best Regards,
 Christian


Re: [R] zoo plot warning messages - I don't know what they mean or how to inspect the data to figure this out

2009-08-01 Thread Gabor Grothendieck
So as not to leave this thread dangling the problem was
character fields where numeric fields had been expected.
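
A quick way to spot this kind of problem before building the zoo object, sketched with a made-up data frame (the column names and the stray "T" entry are assumptions, not the poster's data):

```r
# Hypothetical precipitation data in which one entry is not a number
precip <- data.frame(date = c("1933-01-01", "1933-01-02", "1933-01-03"),
                     rain = c("0.10", "T", "0.25"),
                     stringsAsFactors = FALSE)

# Which columns were read as character instead of numeric?
col_classes <- sapply(precip, class)

# Which values would become NA on coercion?
bad <- is.na(suppressWarnings(as.numeric(precip$rain)))
offending <- precip$rain[bad]

col_classes
offending
```

Inspecting the offending values shows what needs cleaning before the "NAs introduced by coercion" warning can go away.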

On Sat, Aug 1, 2009 at 11:32 AM, stephen sefick <ssef...@gmail.com> wrote:
 I have a time series from 1933-2005 of precipitation at Fayetteville
 NC.  I get the following warning messages when I plot the zoo series.
 Any help would be appreciated.  If you need the data I can dput it or
 send the csv.  I didn't include it here because I didn't want to clog
 up anybody's email account.  I know that this is not reproducible, and
 I will send along the file if needed.

 Warning messages:
 1: In xy.coords(x, y, xlabel, ylabel, log) : NAs introduced by coercion
 2: In lines.times(x.index, y[, i], col = col[[i]], pch = pch[[i]],  :
  NAs introduced by coercion

 --
 Stephen Sefick

 Let's not spend our time and resources thinking about things that are
 so little or so large that all they really do for us is puff us up and
 make us feel like gods.  We are mammals, and have not exhausted the
 annoying little problems of being mammals.

                                                                -K. Mullis



Re: [R] Automatic datasets creation from multiple data sheets in a single excel file

2009-08-01 Thread Erich Neuwirth
If you have RExcel (and the necessary infrastructure, i.e.
statconnDCOM and possibly rcom) installed, the following VBA
macro will do the trick.

-=-=-=-=-=
Option Explicit

Sub TransferAllSheetsAsDataframes(wb As Workbook)
Dim ws As Worksheet
RInterface.StartRServer
For Each ws In wb.Sheets
RInterface.PutDataframe ws.Name, ws.Cells(1, 1).CurrentRegion
Next ws
RInterface.StopRServer
End Sub

Sub TransferSheetsInThisWorkbook()
TransferAllSheetsAsDataframes ThisWorkbook
End Sub


-=-=-=-=-

You have to establish a reference to RExcelVBALib in your workbook.

The names of the sheets will be used as the names of the dataframes.



Dieter Menne wrote:
 
 
 rajclinasia wrote:
 Please let us know how to create automatic datasets from multiple data
 sheets in a single excel file...

 For example if there are 10 sheets in a single excel file, automatically
 10 datasets need to be created at a time when i read an excel file as a
 whole at once.


 
 The critical part is getting the names of the worksheets. 
 
 http://tolstoy.newcastle.edu.au/R/e6/help/09/03/7736.html
 
 For reading individual worksheets, there is lots of code around.
 
 And a site search for "read excel worksheet" returns quite a few references.
 
 Dieter
 
 
 
 

-- 
Erich Neuwirth, University of Vienna
Faculty of Computer Science
Computer Supported Didactics Working Group
Visit our SunSITE at http://sunsite.univie.ac.at
Phone: +43-1-4277-39464 Fax: +43-1-4277-39459



[R] diagonal lda and qda

2009-08-01 Thread cindy Guo
Hi, all,

I am wondering if there is any package doing lda and qda which allows
assuming diagonal covariance matrices. I checked the lda function in MASS,
and it seems it does not support this.

Thanks,

Cindy

[[alternative HTML version deleted]]



Re: [R] Question about rpart decision trees (being used to predict customer churn)

2009-08-01 Thread Carlos J. Gil Bellosta
Hello,

If you do

my.tree <- rpart(cancel ~ experience)

and then you check

my.tree$frame

you will note that the complexity parameter there is 0. 

Check ?rpart.object to get a description of what this output means. But
essentially, you will not be able to break the leaf unless you set a
complexity parameter below that value, that is, never.
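
One plausible reading of that zero, offered as my own interpretation and not confirmed in this thread: splitting on experience does not reduce the number of misclassified cases, because both children would still lose 5 cases each. The arithmetic can be checked in base R with the data from the original post:

```r
experience <- factor(c(rep("good", 90), rep("bad", 10)))
cancel <- factor(c(rep("no", 85), rep("yes", 5), rep("no", 5), rep("yes", 5)))

tab <- table(experience, cancel)

# Misclassified cases if the root predicts the majority class
root_loss <- length(cancel) - max(table(cancel))

# Misclassified cases if each experience level predicts its own majority class
child_loss <- sum(rowSums(tab) - apply(tab, 1, max))

c(root = root_loss, after_split = child_loss)
```

Both totals come out equal, so under a plain 0/1 loss the split buys no reduction in misclassification, even though the cancellation rate differs sharply between the two groups.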

You may need to go into the internals of the function (and the C code)
in order to understand how this parameter is calculated. It looks to me
as an oddity and it is worth trying to understand why. 

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com


P.S.: Note that there is a bug in your submitted code that requires some
hand fixing.



On Sun, 2009-07-26 at 11:37 -0700, Robert Smith wrote:
 Hi,
 
 I am using rpart decision trees to analyze customer churn. I am finding that
 the decision trees created are not effective because they are not able to
 recognize factors that influence churn. I have created an example situation
 below. What do I need to do to for rpart to build a tree with the variable
 experience? My guess is that this would happen if rpart used the loss matrix
 while creating the tree.
 
  experience <- as.factor(c(rep("good",90), rep("bad",10)))
  cancel <- as.factor(c(rep("no",85), rep("yes",5), rep("no",5),
 rep("yes",5)))
  table(experience, cancel)
   cancel
 experience no yes
   bad   5   5
   good 85   5
  rpart(cancel ~ experience)
 n= 100
 node), split, n, loss, yval, (yprob)
   * denotes terminal node
 1) root 100 10 no (0.900 0.100) *
 
 I tried the following commands with no success.
 rpart(cancel ~ experience, control=rpart.control(cp=.0001))
 rpart(cancel ~ experience, parms=list(split='information'))
 rpart(cancel ~ experience, parms=list(split='information'),
 control=rpart.control(cp=.0001))
 rpart(cancel ~ experience, parms=list(loss=matrix(c(0,1,1,0), nrow=2,
 ncol=2)))
 
 Thanks a lot for your help.
 
 Best regards,
 Robert
 
   [[alternative HTML version deleted]]
 


[R] Transparency and trellis device

2009-08-01 Thread pomchip
Dear R-users,

I am trying to produce trellis (png or jpeg) graphs with a transparent
background, but I cannot manage to make that happen. I tried to play around
with themes but to no avail. Any advice on the following example will be
greatly appreciated:

Thank you

Sebastien


library(lattice)

df <- data.frame(a=rep(1:4,4), b=rep(1:4,4), c=rep(1:4,each=4))

settings <- standard.theme()
settings <- modifyList(settings,
   list(background=list(alpha=1,
                        col="transparent")))
str(settings)

trellis.device("png",
   file="test.png",
   theme=settings)

myplot <- xyplot(b~a|c, data=df)

print(myplot)

dev.off()





Re: [R] How to stop an R script when running JGR on a Linux/SuSE system

2009-08-01 Thread Liviu Andronic
Hello,

On 7/31/09, mau...@alice.it mau...@alice.it wrote:
 When I need to stop a running R script  on Windows or Mac I just use the 
 esc key which kills the current script and returns the control to R 
 interpreter.
  But when I run R from JGR the esc is useless as well as the other 
 available keyboard keys.

This issue was addressed in a recent discussion [1].
Liviu

[1] 
http://mailman.rz.uni-augsburg.de/pipermail/stats-rosuda-devel/2009q2/001106.html



Re: [R] SVG output on Windows OS

2009-08-01 Thread Michael Roessler
Thank you, David.
Cairo is installed and loaded.
cairoDevice is installed and loaded.
RGtk2 is installed and loaded (which installed GTK+)

yet I still get false for cairo:

 capabilities()
    jpeg      png     tiff    tcltk      X11     aqua http/ftp  sockets
    TRUE     TRUE     TRUE     TRUE    FALSE    FALSE     TRUE     TRUE
  libxml     fifo   cledit    iconv      NLS  profmem    cairo
    TRUE    FALSE     TRUE     TRUE     TRUE    FALSE    FALSE


 Cairo.capabilities()
  png  jpeg  tiff   pdf   svg    ps   x11   win
 TRUE FALSE FALSE  TRUE  TRUE  TRUE FALSE  TRUE

 ggsave(file="chart.svg")
Saving 6" x 6" image


I'm lost as to how to produce the svg output on windows. All works suitably
on Linux.


Michael Roessler, CFA
michael.roes...@keyevent.com



On Sat, Aug 1, 2009 at 6:15 AM, David Winsemius <dwinsem...@comcast.net> wrote:


 On Jul 31, 2009, at 6:41 PM, Michael Roessler wrote:

  How may one save a graphic as svg on Windows? The svg() command is
 recognized and functions well on Linux, etc., but not on Windows, it
 seems.
 I'm trying to use Hadley Wickham's ggplot2 and I would like to be able to
 save created charts as svg for later input into Illustrator. I am able to
 accomplish this workflow under Linux, but I don't know how to get R to
 recognize the svg() command under Windows. I have loaded RsvgDevice,
 Cairo,
 and cairoDevice in my attempts. The problem seems to me to be directly
 related to enabling R to produce svg output on Windows, rather than
 related
 to ggplot2.


 What does capabilities() return?
 --

 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT





[R] Cox ridge regression

2009-08-01 Thread Ljubomir Buturovic

Hello,

I have questions regarding penalized Cox regression using survival
package (functions coxph() and ridge()). I am using R 2.8.0 on Ubuntu
Linux and survival package version 2.35-4.

Question 1. Consider the following example from help(ridge):

> fit1 <- coxph(Surv(futime, fustat) ~ rx + ridge(age, ecog.ps, theta=1),
+               ovarian)

As I understand it, this builds a model in which `rx' is the predictor,
whereas the ridge penalty term contains the variables `age' and
`ecog.ps'. Could someone explain what it means to regularize on
parameters which are not part of the model?  Based on the definition of
Cox ridge regression (see for example [1]), or any other regularized
regression, the penalty term is a function of the coefficients
corresponding to the predictor variables, and nothing else.

Question 2. Consider a similar example:

> library(survival)
> lfit2 <- coxph(Surv(time, status) ~ age + ph.ecog + ridge(age, ph.ecog,
+                theta=1), cancer)
> print(lfit2)
Call:
coxph(formula = Surv(time, status) ~ age + ph.ecog + ridge(age,
    ph.ecog, theta = 1), data = cancer)

               coef     se(coef) se2      Chisq DF p
age            1.13e-02 0.111    9.32e-03 0.01  1  0.92
ph.ecog        4.43e-01 1.398    1.16e-01 0.10  1  0.75
ridge(age)     2.60e-21 0.110    4.85e-17 0.00  1  1.00
ridge(ph.ecog) 5.14e-22 1.393             0.00  1  1.00

Iterations: 1 outer, 3 Newton-Raphson
Degrees of freedom for terms= 0 0 0 
Likelihood ratio test=19.1  on 0.01 df, p=3.54e-08
  n=227 (1 observation deleted due to missingness)
Warning message:
In sqrt((diag(x$var2))[kk]) : NaNs produced

What is the meaning of the ridge(age) and ridge(ph.ecog) coefficients?
Again, based on the definition of Cox ridge regression, it simply adds
a penalty term to the standard Cox regression function, and doesn't
introduce any new predictors. What to make of the ridge(age) and
ridge(ph.ecog) rows in the output?
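
One way to probe Questions 1 and 2 empirically, offered as a sketch and not
as an authoritative description of the survival package: fit a model in
which the variables appear only inside ridge() and count the coefficients.
The sketch assumes the `lung' data set shipped with survival (the same data
as `cancer'); the object names fit_ridge and fit_mixed are made up for
illustration. If the ridge() term contributes coefficients of its own, that
would suggest ridge() both adds its variables to the model and penalizes
them, so listing them again outside ridge() enters each variable twice.

```r
library(survival)

# Variables appear only inside ridge(): do they still get coefficients?
fit_ridge <- coxph(Surv(time, status) ~ ridge(age, ph.ecog, theta = 1),
                   data = lung)

# One unpenalized predictor (sex) plus two penalized ones.
fit_mixed <- coxph(Surv(time, status) ~ sex + ridge(age, ph.ecog, theta = 1),
                   data = lung)

n_ridge <- length(coef(fit_ridge))
n_mixed <- length(coef(fit_mixed))
cat("coefficients, ridge-only model:", n_ridge, "\n")
cat("coefficients, sex + ridge model:", n_mixed, "\n")
```

Comparing these counts with the four rows printed for lfit2 may help decide
whether the ridge(age) and ridge(ph.ecog) rows are duplicates of age and
ph.ecog.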

Question 3. What is the origin and significance of the warning in the
previous example:

Warning message:
In sqrt((diag(x$var2))[kk]) : NaNs produced

Thank you very much for your help,

Ljubomir

[1] Bovelstad et al., Predicting survival from microarray data - a
comparative study (Bioinformatics, Vol. 23, no. 16, 2007,
pp. 2080-2087).



[R] odfWeave : sudden and unexplained error

2009-08-01 Thread Emmanuel Charpentier
Dear list, dear Max,

I am currently working on a report. I'm writing it with OpenOffice.org
and odfWeave. I'm working incrementally: I write a bit, test
(interactively) some ideas, cutting-and-pasting code to the OOo report
when satisfied with it. In the process, I tend to recompile the .odt
source a *lot*.

Suddenly, odfWeave started to give me an incomprehensible error even
before starting the compilation itself (InFile is my input .odt file,
OutFile is the resultant .odt file):

> odfWeave(InFile, OutFile)
  Copying  SrcAnalyse1.odt 
  Setting wd to  /tmp/RtmphCUkSf/odfWeave01225949667 
  Unzipping ODF file using unzip -o SrcAnalyse1.odt 
Archive:  SrcAnalyse1.odt
 extracting: mimetype
  inflating: content.xml 
  inflating: layout-cache
  inflating: styles.xml  
 extracting: meta.xml
  inflating: Thumbnails/thumbnail.png  
  inflating: Configurations2/accelerator/current.xml  
   creating: Configurations2/progressbar/
   creating: Configurations2/floater/
   creating: Configurations2/popupmenu/
   creating: Configurations2/menubar/
   creating: Configurations2/toolbar/
   creating: Configurations2/images/Bitmaps/
   creating: Configurations2/statusbar/
  inflating: settings.xml
  inflating: META-INF/manifest.xml   

  Removing  SrcAnalyse1.odt 
  Creating a Pictures directory

  Pre-processing the contents
Erreur : cc$parentId == parentId is not TRUE
 
Perusing the documentation and the r-help list archives didn't turn up
anything relevant. This error survived restarting OOo, restarting R,
restarting its enclosing Emacs session and even rebooting the damn
hardware...

Any idea ?

Emmanuel Charpentier



Re: [R] R book for economists

2009-08-01 Thread Alain Zuur



Thiemo Fetzer wrote:
 
 Dear Group,
 
 I am an economics student starting with PhD work in London. As preparation
 I would like to get to know R a little bit better. For Stata there are tons
 of books, however, can you recommend a book for R?
 
 I have some substantiated econometrics knowledge, so it should be more a
 how-to book.
 
 Best regards
 Thiemo
 
 ---
 Thiemo Fetzer, Economist
 http://freigeist.devmag.net
 http://www.devmag.net
 
 
 

Besides the econometrics references already mentioned, if you are
willing to read a book with life science data, then try:

A Beginner's Guide to R (2009).
Zuur, AF, Ieno, EN, Meesters, EHWG. Springer
http://www.springer.com/statistics/computational/book/978-0-387-93836-3


Alain



-

Dr. Alain F. Zuur
First author of:

1. Analysing Ecological Data (2007).
Zuur, AF, Ieno, EN and Smith, GM. Springer. 680 p.

2. Mixed effects models and extensions in ecology with R. (2009).
Zuur, AF, Ieno, EN, Walker, N, Saveliev, AA, and Smith, GM. Springer.

3. A Beginner's Guide to R (2009).
Zuur, AF, Ieno, EN, Meesters, EHWG. Springer


Statistical consultancy, courses, data analysis and software
Highland Statistics Ltd.
6 Laverock road
UK - AB41 6FN Newburgh
Email: highs...@highstat.com
URL: www.highstat.com



-- 
View this message in context: 
http://www.nabble.com/R-book-for-economists-tp24768682p24772774.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Getting file name from pdf device?

2009-08-01 Thread Tom Short
On Fri, Jul 31, 2009 at 8:49 AM, Rainer M Krug r.m.k...@gmail.com wrote:
 My question: how can I get the filename of the pdf from the device
 before it is closed?

I've also looked for this and couldn't find a way. I had a similar
use, where I wanted to get an R transcript with embedded plots in
emacs (see prettyR for another transcript-with-plots option). What I
did was use dev2bitmap to write out a PNG file. You could do something
similar with dev.copy2pdf to create the pdf after you do the plotting.
You could also use dev2bitmap in this manner to drive ghostscript to
create pdf's for you (I don't know if it'll compress like you want).
Here's what I did:

show <- function(file = paste(tempfile(), ".png", sep = "")) {
    dev2bitmap(file)
    # I do some post-processing in emacs to see the embedded graphic
    cat("[[", file, "]]\n", sep = "")
}

My use case was that plots would be inserted where I used show as follows:

plot(sin)
show()           # plot inserted into transcript here
plot(cos)
show("cos.png")  # this time, a named local file instead of a temp file
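
For the original question (finding the file name of an open pdf device),
one workaround is to record the name at the moment the device is opened.
The following is a minimal sketch under that assumption; open_pdf(),
current_pdf_file() and pdf_registry are hypothetical helpers, not functions
from grDevices.

```r
# Hypothetical registry: file name of each pdf device, keyed by device number.
pdf_registry <- new.env()

open_pdf <- function(file = paste(tempfile(), ".pdf", sep = ""), ...) {
    pdf(file, ...)
    # dev.cur() is now the device just opened; remember its file name.
    assign(as.character(dev.cur()), file, envir = pdf_registry)
    invisible(file)
}

current_pdf_file <- function() {
    key <- as.character(dev.cur())
    if (exists(key, envir = pdf_registry, inherits = FALSE)) {
        get(key, envir = pdf_registry)
    } else {
        NA_character_  # current device was not opened through open_pdf()
    }
}

f <- open_pdf()
plot(sin)
name_while_open <- current_pdf_file()
invisible(dev.off())
cat("plot written to", name_while_open, "\n")
```

Note that the registry is never cleaned up in this sketch, so an entry can
go stale if a device number is later reused by a device opened some other
way.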

- Tom



Re: [R] Compare lm() to glm(family=poisson)

2009-08-01 Thread Alain Zuur



Mark Na wrote:
 
 Dear R-helpers,
 I would like to compare the fit of two models, one of which I fit using
 lm() and the other using glm(family=poisson). The latter doesn't provide
 r-squared, so I wonder how to go about comparing these models (they have
 the same formula).
 
 Thanks very much,
 
 Mark Na
 
 
 


The decision about which distribution to use (Normal versus Poisson) should
be an a priori choice. If you really want to compare them, then inspect the
residuals of both models and see which model doesn't have any residual
patterns.
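
A minimal sketch of such a residual check, using simulated count data since
the original data were not posted (the variables x and y and the simulation
settings are assumptions made for illustration):

```r
set.seed(1)
x <- runif(200, 0, 3)
y <- rpois(200, lambda = exp(0.3 + 0.8 * x))   # simulated counts (assumption)

fit_lm  <- lm(y ~ x)
fit_glm <- glm(y ~ x, family = poisson)

# Residuals versus fitted values for each model; look for fans or curves.
op <- par(mfrow = c(1, 2))
plot(fitted(fit_lm), resid(fit_lm),
     xlab = "Fitted", ylab = "Residuals", main = "lm()")
abline(h = 0, lty = 2)
plot(fitted(fit_glm), resid(fit_glm, type = "pearson"),
     xlab = "Fitted", ylab = "Pearson residuals", main = "glm(poisson)")
abline(h = 0, lty = 2)
par(op)
```

Pearson residuals are used for the Poisson fit so that the spread is
roughly comparable across fitted values; other residual types could be
substituted.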

Alain



-- 
View this message in context: 
http://www.nabble.com/Compare-lm%28%29-to-glm%28family%3Dpoisson%29-tp24764558p24772802.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] A hiccup when using anova on gam() fits.

2009-08-01 Thread Rolf Turner


Thank you.  That clarified a great many things.

cheers,

Rolf



Re: [R] Question about rpart decision trees (being used to predict customer churn)

2009-08-01 Thread Graham Williams
2009/7/27 Robert Smith robertpsmith2...@gmail.com

 Hi,

 I am using rpart decision trees to analyze customer churn. I am finding
 that the decision trees created are not effective because they are not
 able to recognize factors that influence churn. I have created an example
 situation below. What do I need to do for rpart to build a tree with the
 variable experience? My guess is that this would happen if rpart used the
 loss matrix while creating the tree.

 > experience <- as.factor(c(rep("good",90), rep("bad",10)))
 > cancel <- as.factor(c(rep("no",85), rep("yes",5), rep("no",5),
 + rep("yes",5)))
 > table(experience, cancel)
           cancel
 experience no yes
       bad   5   5
       good 85   5
 > rpart(cancel ~ experience)
 n= 100
 node), split, n, loss, yval, (yprob)
       * denotes terminal node
 1) root 100 10 no (0.900 0.100) *

 I tried the following commands with no success.
 rpart(cancel ~ experience, control=rpart.control(cp=.0001))
 rpart(cancel ~ experience, parms=list(split='information'))
 rpart(cancel ~ experience, parms=list(split='information'),
 control=rpart.control(cp=.0001))
 rpart(cancel ~ experience, parms=list(loss=matrix(c(0,1,1,0), nrow=2,
 ncol=2)))

 Thanks a lot for your help.

 Best regards,
 Robert


Hi Robert,

Perhaps try a less extreme loss matrix:

rpart(cancel ~ experience, parms=list(loss=matrix(c(0,5,1,0), byrow=TRUE,
nrow=2)))

Output from Rattle:

Summary of the Tree model for Classification (built using rpart):

n= 100

node), split, n, loss, yval, (yprob)
  * denotes terminal node

1) root 100 50 no (0.9000 0.1000)
  2) experience=good 90 25 no (0.9444 0.0556) *
  3) experience=bad 10  5 yes (0.5000 0.5000) *

Classification tree:
rpart(formula = cancel ~ ., data = crs$dataset, method = "class",
    parms = list(loss = matrix(c(0, 5, 1, 0), byrow = TRUE, nrow = 2)),
    control = rpart.control(cp = 0.0001, usesurrogate = 0, maxsurrogate = 0))

Variables actually used in tree construction:
[1] experience

Root node error: 50/100 = 0.5

n= 100

      CP nsplit rel error xerror xstd
1 0.4000      0       1.0    1.0 0.30
2 0.0001      1       0.6    0.6 0.22

TRAINING DATA Error Matrix - Counts

 Actual
Predicted no yes
  no  85   5
  yes  5   5


TRAINING DATA Error Matrix - Percentages

 Actual
Predicted no yes
  no  85   5
  yes  5   5

Time taken: 0.01 secs

Generated by Rattle 2009-08-02 08:24:50 gjw
==
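
Because the effect of a loss matrix depends on which cell holds the larger
cost, it may be worth checking the orientation on the toy data before
relying on it. The sketch below is an assumption-laden check, not part of
the reply above: it fits the tree twice, once with the four numbers filled
by row and once by column, and reports which fit produces a split. The
names churn, loss_by_row and loss_by_col are invented here; ?rpart gives
the documented meaning of the rows and columns.

```r
library(rpart)

churn <- data.frame(
    experience = factor(c(rep("good", 90), rep("bad", 10))),
    cancel     = factor(c(rep("no", 85), rep("yes", 5),
                          rep("no", 5),  rep("yes", 5)))
)

# The same four numbers, filled by row and by column.
loss_by_row <- matrix(c(0, 5, 1, 0), nrow = 2, byrow = TRUE)
loss_by_col <- matrix(c(0, 5, 1, 0), nrow = 2)

fit_by_row <- rpart(cancel ~ experience, data = churn, method = "class",
                    parms = list(loss = loss_by_row))
fit_by_col <- rpart(cancel ~ experience, data = churn, method = "class",
                    parms = list(loss = loss_by_col))

# A tree with more than one row in $frame has at least one split.
has_split <- c(by_row = nrow(fit_by_row$frame) > 1,
               by_col = nrow(fit_by_col$frame) > 1)
print(has_split)
```

Whichever variant splits on experience is the one that makes a missed
cancellation the expensive error on this data.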



Re: [R] odfWeave : sudden and unexplained error

2009-08-01 Thread Max Kuhn
Sending me a reproducible example and the results of sessionInfo() would help.
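
For pasting into a plain-text mail, the session information can be captured
as text. A small sketch (the variable name session_text is arbitrary):

```r
# Capture the printed session information as plain text lines.
session_text <- capture.output(print(sessionInfo()))
writeLines(session_text)
```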

Max

-- 

Max



[R] R Package That Contains International Geomagnetic Reference Field (IGRF)

2009-08-01 Thread Jason Rupert
By any chance is anyone aware of an R package that contains a representation of 
the International Geomagnetic Reference Field (IGRF)?  
http://www.ngdc.noaa.gov/IAGA/vmod/igrf.html

I've tracked down some Fortran and C code for the IGRF-10, and possibly 
IGRF-11, and was hoping to avoid an awkward port.  

Thanks again for any feedback and leads provided.



[R] vglm{VGAM} output to Latex ?

2009-08-01 Thread Ugur Ozdemir

Hi all,

I am trying to put the summary output of vglm{VGAM} into a LaTeX table using
mtable{memisc}. I think I solved the problem that vglm produces a vglm
object, which mtable does not accept by default, by defining a
getSummary.vglm function.

However, when the multinomial model is used, summary.vglm appends ":"
followed by the factor level to the coefficient names, so they look like
"(Intercept):1", "(Intercept):2", etc. This creates a problem in mtable:

Error in strsplit(coefnames, ":", fixed = TRUE) : non-character argument

Any suggestions?

Or are there any other general suggestions about putting vglm summary
output into a LaTeX table using another method?
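
As one possible "other method", a coefficient matrix can be turned into
LaTeX rows with base R alone. The following is only a sketch: the matrix cm
is an invented stand-in (it is an assumption that a similar matrix can be
extracted from the vglm summary, for example with coef(summary(fit)), which
should be verified), and coef_to_latex() is a hypothetical helper, not part
of VGAM or memisc. It also shows the ":" split applied to character names.

```r
# Hypothetical stand-in for a coefficient matrix from a multinomial fit;
# the numbers are invented for illustration.
cm <- matrix(c(0.5, -1.25, 0.1, 0.2),
             nrow = 2,
             dimnames = list(c("(Intercept):1", "(Intercept):2"),
                             c("Estimate", "Std. Error")))

# Split "term:level" names; strsplit() needs a character vector, so coerce.
parts <- strsplit(as.character(rownames(cm)), ":", fixed = TRUE)

coef_to_latex <- function(cm, digits = 3) {
    stopifnot(is.matrix(cm), !is.null(rownames(cm)), !is.null(colnames(cm)))
    header <- paste(c("", colnames(cm)), collapse = " & ")
    rows <- vapply(seq_len(nrow(cm)), function(i) {
        cells <- formatC(cm[i, ], digits = digits, format = "f")
        paste(c(rownames(cm)[i], cells), collapse = " & ")
    }, character(1))
    c(paste0("\\begin{tabular}{l", strrep("r", ncol(cm)), "}"),
      paste0(header, " \\\\"),
      "\\hline",
      paste0(rows, " \\\\"),
      "\\end{tabular}")
}

latex_lines <- coef_to_latex(cm)
writeLines(latex_lines)
```

No escaping of special LaTeX characters is attempted here, so names
containing "_" or "%" would need extra handling.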

All help is greatly appreciated.

Ugur   

Microsoft gives you windows, Linux gives you the whole house.



Re: [R] SVG output on Windows OS

2009-08-01 Thread David Winsemius
It says cairo is not available. (Once you read this in a proper  
monospaced font, anyway.)



On Aug 1, 2009, at 4:27 PM, Michael Roessler wrote:


Thank you, David.

Cairo is installed and loaded.
cairoDevice is installed and loaded.
RGtk2 is installed and loaded (which installed GTK+)

yet I still get false for cairo:

> capabilities()
    jpeg      png     tiff    tcltk      X11     aqua http/ftp  sockets
    TRUE     TRUE     TRUE     TRUE    FALSE    FALSE     TRUE     TRUE
  libxml     fifo   cledit    iconv      NLS  profmem    cairo
    TRUE    FALSE     TRUE     TRUE     TRUE    FALSE    FALSE
                                                          ^^^^^

You need to investigate why your Windows cairo installation is not  
available.
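
A quick way to see which SVG route is open in a given session is sketched
below. It is a hedged illustration, not a fix: choose_svg_route() is a
hypothetical helper, and the Cairo package's CairoSVG() device is named
only as a candidate because Cairo.capabilities() in the quoted output
reports svg as TRUE.

```r
# Hypothetical helper: choose an SVG route from two yes/no facts.
choose_svg_route <- function(base_cairo, cairo_pkg_svg) {
    if (isTRUE(base_cairo)) {
        "grDevices::svg"
    } else if (isTRUE(cairo_pkg_svg)) {
        "Cairo::CairoSVG"
    } else {
        "none"
    }
}

# Does this build of R itself have cairo support?
base_cairo <- unname(capabilities("cairo"))

# Is the Cairo package installed, and does it report svg support?
cairo_pkg_svg <- requireNamespace("Cairo", quietly = TRUE) &&
    isTRUE(unname(Cairo::Cairo.capabilities()["svg"]))

route <- choose_svg_route(base_cairo, cairo_pkg_svg)
cat("suggested SVG route:", route, "\n")
```

If the route is "Cairo::CairoSVG", one option would be to open that device
explicitly, print the plot, and close it with dev.off(), instead of relying
on ggsave() to find svg() in grDevices.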








David Winsemius, MD
Heritage Laboratories
West Hartford, CT



Re: [R] Add columns in a dataframe and fill them from another table according to a criteria

2009-08-01 Thread David Winsemius
Apologies to list: Should have replied to all.

--  
DW

Begin forwarded message:

 From: David Winsemius dwinsem...@comcast.net
 Date: August 1, 2009 3:02:58 PM EDT
 To: Meenu Sahi meenus...@gmail.com
 Subject: Re: [R] Add columns in a dataframe and fill them from  
 another table  according to a criteria


 On Aug 1, 2009, at 1:43 PM, Meenu Sahi wrote:

 Dear R users
 My apologies for not writing in the correct format due to my  
 ignorance. In the future I will write more clearly. I hope to  
 contribute to the R community in the process of picking up the  
 language professionally.
 I have now written the R code which is attached in a notepad file.  
 I've simplified my problem in an example of, table pstate which  
 contains the probabilities of getting certain changes in the four  
 different states and a dataframe mydata4 which contains all the  
 changes connected to the four different states. I would like to add  
 the probabilities into mydata4 after matching for the change and  
 the state.
 Everything before # output can be copy pasted in the R window.  
 The desired output is written after ## OUTPUT
 Must I write an if else or can I do it in an easier way?
 Your help is greatly appreciated ! Many thanks for your patience.

 You need to figure out how to send mail to the list with plain text.  
 But I suspect you did successfully get the attachment through to the  
 audience.

 I did not like the ordering of the PStates in your new target  
 dataframe so I changed it to fit my(and your) purposes.

 > Change <- c("b","a","b","c","d","a")
 > State <- c("State1","State4","State2","State3","State1","State3")
 >
 > mydata4 <- data.frame(Change, State)
 > mydata4 <- data.frame(mydata4,
 + PState1=NA,
 + PState2=NA,
 + PState3=NA,
 + PState4=NA
 + )
 > mydata4
  Change  State PState1 PState2 PState3 PState4
 1  b State1  NA  NA  NA  NA
 2  a State4  NA  NA  NA  NA
 3  b State2  NA  NA  NA  NA
 4  c State3  NA  NA  NA  NA
 5  d State1  NA  NA  NA  NA
 6  a State3  NA  NA  NA  NA

 Note that str(pstate) shows that State is a factor, which becomes
 important.

 This now effects the desired transformation:

 for (i in 1:nrow(mydata4)) {
     # assign to the i-th row, State + 2 column in mydata4 ...
     mydata4[i, as.numeric(mydata4[i, "State"]) + 2] <-
         # ... the value in the matching Change row, State + 1 column of pstate
         pstate[mydata4[i, "Change"], as.numeric(mydata4[i, "State"]) + 1]
 }

 > mydata4
   Change  State PState1 PState2 PState3 PState4
 1      b State1     Pb1      NA      NA      NA
 2      a State4      NA      NA      NA     Pa4
 3      b State2      NA     Pb2      NA      NA
 4      c State3      NA      NA     Pc3      NA
 5      d State1     Pd1      NA      NA      NA
 6      a State3      NA      NA     Pa3      NA

 The main non-obvious trick is the as.numeric(mydata4[i, "State"]) bit.
 as.numeric(), when applied to a factor, returns a numeric offset derived
 from the factor coding rather than using the level names. I suppose I
 could have left the PStaten's in the original order, but then I would
 have been subtracting them from 7 to get the proper column number, which
 seemed even less understandable.
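
 The same lookup can also be written without a loop by indexing pstate with
 a two-column matrix of row and column names. This is a sketch, not the
 code from the thread: it assumes pstate is a character matrix whose row
 names are the changes and whose column names are the states, and it uses
 character columns rather than factors.

```r
# Assumed layout of the probability table: changes in rows, states in columns.
pstate <- matrix(paste0("P", rep(letters[1:4], times = 4),
                        rep(1:4, each = 4)),
                 nrow = 4,
                 dimnames = list(letters[1:4], paste0("State", 1:4)))

mydata4 <- data.frame(
    Change = c("b", "a", "b", "c", "d", "a"),
    State  = c("State1", "State4", "State2", "State3", "State1", "State3"),
    stringsAsFactors = FALSE
)

# One lookup per row: element [Change, State] of pstate.
prob <- pstate[cbind(mydata4$Change, mydata4$State)]

# Spread the looked-up value into the column matching each row's state.
for (s in colnames(pstate)) {
    mydata4[[paste0("P", s)]] <- ifelse(mydata4$State == s, prob, NA)
}
print(mydata4)
```

 Indexing by names avoids depending on the order of factor levels, at the
 cost of requiring that every Change and State value appears in the
 dimnames of pstate.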


 Regards
 Meenu

 On Sat, Aug 1, 2009 at 9:43 PM, David Winsemius dwinsem...@comcast.net 
  wrote:

 On Aug 1, 2009, at 9:52 AM, Meenu Sahi wrote:

 Dear R users

 I am new to R.
 What I want to do is explained below;-
 I have table called States.Prob which is given below:-
 This table gives the probabilities of the changes in the swap curve
 depending on the state of the swap curve. I want to put these  
 probabilities
 in my dataframe mydata(given after the prob table).
   Prob of States
 Changes  State1  State2 State3 State4
 a Pa1  Pa2 Pa3 Pa4
 b Pb1  Pb2 Pb3 Pb4
 c Pc1  Pc2 Pc3 Pc4
 d Pd1  Pd2 Pd3 Pd4

 and I have a dataframe(with 93 rows) called mydata part of which(6  
 rows) is
 given below where I want to fill in the last four columns with  
 probabilities
 taken from States.Prob according to the change and state in mydata4:-
 Change  State  PState1  PState2  PState3  PState4
 1 b   State1  Pb1
 2 a   State4   Pa4
 3 b   State2Pb2
 4 c   State3 Pc3
 5 d   State1  Pd1
 6 a   State3 Pa3

 What I want to do is highlighted in Red.
 How can I do this easily?

 You may have seen it in red, but we don't, and I, at least,  
 cannot figure out what you intend.   (Per the Posting Guide, which  
 you have obviously not yet read, you need to compose your question  
 in plain old monochromatic text and change your mail client so it  
 posts in plain text.)

 If looking at the help pages for stack() and reshape() does not  
 offer useful information 

Re: [R] about the summary(cph.object)

2009-08-01 Thread zhu yao
Thanks for your reply.
In this example, age was transformed with rcs, so the output was different
between f and summary(f).
If I need to publish the results, how do I explain the hazard ratio of
age?

2009/8/1 David Winsemius dwinsem...@comcast.net


 On Jul 31, 2009, at 11:24 PM, zhu yao wrote:

  Could someone explain the summary(cph.object)?

 The example is in the help file of cph.

 n <- 1000
 set.seed(731)
 age <- 50 + 12*rnorm(n)
 label(age) <- "Age"
 sex <- factor(sample(c('Male','Female'), n,
               rep=TRUE, prob=c(.6, .4)))
 cens <- 15*runif(n)
 h <- .02*exp(.04*(age-50)+.8*(sex=='Female'))
 dt <- -log(runif(n))/h
 label(dt) <- 'Follow-up Time'
 e <- ifelse(dt <= cens,1,0)
 dt <- pmin(dt, cens)
 units(dt) <- "Year"
 dd <- datadist(age, sex)
 options(datadist='dd')


 This is the process for setting the range for the display of effects in
 Design regression objects. See:

 ?datadist

 q.effect
 set of two quantiles for computing the range of continuous variables to use
 in estimating regression effects. Defaults are c(.25,.75), which yields
 inter-quartile-range odds ratios, etc.

 ?summary.Design
 #---
  By default, inter-quartile range effects (odds ratios, hazards ratios,
 etc.) are printed for continuous factors, ... 
 #---
 Value
 For summary.Design, a matrix of class summary.Design with rows
 corresponding to factors in the model and columns containing the low and
 high values for the effects, the range for the effects, the effect point
 estimates (difference in predicted values for high and low factor values),
 the standard error of this effect estimate, and the lower and upper
 confidence limits.

 #---


 Srv <- Surv(dt,e)

 f <- cph(Srv ~ rcs(age,4) + sex, x=TRUE, y=TRUE)
 summary(f)

             Effects              Response : Srv

 Factor            Low    High   Diff.  Effect S.E. Lower 0.95 Upper 0.95
 age               40.872 57.385 16.513 1.21   0.21 0.80       1.62
  Hazard Ratio     40.872 57.385 16.513 3.35     NA 2.22       5.06


 In this case with a 4 df regression spline, you need to look at the
 effect across the range of the variable. You ought to plot the age effect
 and examine anova(f). In the untransformed situation the plot is on the
 log hazards scale for cph. So the effect for age in this case should be the
 difference in log hazard at ages 40.872 and 57.385. S.E. is the standard
 error of that estimate, and the Upper and Lower numbers are the confidence
 bounds on the effect estimate. The Hazard Ratio row gives you exponentiated
 results, so a difference in log hazards becomes a hazard ratio. {exp(1.21) =
 3.35}

 sex - Female:Male  2.000  1.000     NA 0.64   0.15 0.35       0.94
  Hazard Ratio      2.000  1.000     NA 1.91     NA 1.42       2.55


 What's the meaning of Effect, S.E., Lower, Upper?


 You probably ought to read a bit more basic material. If you are asking
 this question, Harrell's Regression Modeling Strategies might be over your
 head, but it would probably be a good investment anyway. Venables and
 Ripley's Modern Applied Statistics has a chapter on survival analysis.
 Also consider Kalbfleisch and Prentice, Statistical Analysis of Failure Time
 Data. I'm sure there are others;  those are the ones I have on my shelf.

 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT





Re: [R] about the summary(cph.object)

2009-08-01 Thread Frank E Harrell Jr

zhu yao wrote:

Thanks for your reply.
In this example, age was transformed with rcs, so the output was different
between f and summary(f).
If I need to publish the results, how do I explain the hazard ratio of
age?


David explained this.  Nonlinearity in age does not complicate the 
explanation.  The estimate is the ratio of the hazard rate at the upper 
quartile of age to the hazard rate at the lower quartile, with the ages 
corresponding to these 2 points shown in the output.
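
The arithmetic linking the two age rows of the summary can be reproduced
directly. This is only an illustrative sketch using the rounded numbers
quoted in this thread; age_effect and hazard_ratio are names chosen here.

```r
# Values copied from the age row of summary(f) quoted in this thread.
age_effect <- c(Effect = 1.21, Lower = 0.80, Upper = 1.62)

# Exponentiating a difference in log hazards gives a hazard ratio.
hazard_ratio <- exp(age_effect)
print(round(hazard_ratio, 2))
```

Because the inputs are already rounded to two decimals, the bounds computed
this way can differ in the last digit from the printed Hazard Ratio row.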


The output of f is not very useful for publication.  The output of 
summary, Function, and latex are.


Frank




--
Frank E Harrell Jr   Professor and Chair   School of Medicine
 Department of Biostatistics   Vanderbilt University



[R] What does .[foo] really mean?

2009-08-01 Thread Simon Tian
Hi R users,

I really want to know what exactly .[foo] means.

Thanks in advance. -Simon
