Re: [R] ON MAC, how to copy a plot on to Word document?

2008-09-07 Thread Nicky Chorley
2008/9/7 asdfjkl; [EMAIL PROTECTED]:

 Yes, I don't know how to copy the plot on Mac and paste on to Word because
 you can't right click on the graph and say copy as metafile.
 I'm so surprised I can't find any information about this anywhere on the
 Internet...

The obvious way would be to save it as a PNG/JPEG/BMP/TIFF and then
just import it.
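
A minimal sketch of that save-and-import route, assuming the png() device is available in your R build. The file name, the image size and the example plot (the built-in cars data) are placeholders and not from the original post:

```r
# Write the plot to a PNG file instead of the screen device.
# out_file is a hypothetical path; point it wherever Word can find it.
out_file <- file.path(tempdir(), "myplot.png")

png(filename = out_file, width = 1600, height = 1200, res = 200)
plot(cars, main = "Example plot")   # any plotting code goes here
dev.off()                           # close the device so the file is complete

out_file
```

The resulting file can then be inserted into the Word document as a picture.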

Regards,

Nicky Chorley

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Corrupted PDF files

2008-09-07 Thread Prof Brian Ripley
How are you executing the script?  Quite possibly FAQ Q7.22 applies (it
will if you use source(), for example).
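
For readers without the FAQ at hand: lattice functions return an object that is only drawn when it is printed, which happens automatically at the console but not inside source() or a function. A minimal sketch, assuming that this is the cause here; the data frame below is made up for illustration and uses fewer variables than the script quoted further down:

```r
library(lattice)

# Hypothetical stand-in for the poster's data2
set.seed(1)
data2 <- data.frame(CL = rnorm(60, 10, 2),
                    HT = rnorm(60, 170, 10),
                    WT = rnorm(60, 70, 12))

out_file <- file.path(tempdir(), "CLDiag2.pdf")
pdf(file = out_file)
p <- xyplot(CL ~ HT + WT, data = data2, outer = TRUE,
            scales = list(x = list(relation = "free")),
            panel = function(...) {
              panel.loess(..., col = "red")
              panel.xyplot(..., pch = ".")
            })
print(p)    # explicit print() so the plot is drawn even under source()
dev.off()
```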


On Sat, 6 Sep 2008, Nathan Teuscher wrote:


I have the following code that when executed from the command line
works properly and produces a proper PDF. When the script is executed,
the PDF produced is considered corrupt. I am using R 2.7.2 on Mac OSX
10.5.4. Thank you in advance for the help!

library(lattice)
pdf(file="CLDiag2.pdf")
xyplot(
  CL ~ HT + WT + AGE + CREA + SEX,
  data=data2,
  outer=TRUE,
  scales=list(x=list(relation="free")),
  panel=function(...){
    panel.loess(..., col="red")
    panel.xyplot(..., pch=".")
  }
)
dev.off()


Nathan Teuscher
[EMAIL PROTECTED]








--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



[R] XML - get node by name

2008-09-07 Thread Antje

Hi there,

I try to rewrite some Java-code with R. It deals with reading XML files. I 
started with the XML package. In Java, I had a very useful method which gave me 
a node by using:


name of the node
index of appearance
start point: global (false) / local (true)

So, I could do something like this.

setCurrentChildNode("data", 0);
getValueOfElement("val",1,true);
--> gives 45

setCurrentChildNode("data", 1);
getValueOfElement("val",1,true);
--> gives 11

getValueOfElement("val",1,false);
--> gives 45

<root>
  <data loc="1">
    <val i="t1"> 22 </val>
    <val i="t2"> 45 </val>
  </data>
  <data loc="2">
    <val i="t1"> 44 </val>
    <val i="t2"> 11 </val>
  </data>
</root>

Now, I'd like to do something like this in R. Most important would be to 
retrieve a node just by its name, not by the whole path. How is it possible?


Can anybody help me with this issue?

Antje



Re: [R] Test for equality of complicatedly related average correlations

2008-09-07 Thread Ralph79

Thank you very much, Adam. 

I have to get a bit more familiar with the model you propose in order to
understand if it applies to my problem as well. 

My question is not really "does time show a different effect?" but "which one
of two measures is more reliable?": My respondents have completed exactly the
same questionnaire twice (t=1 and t=2). The questionnaire consisted of two
ways of measuring attribute importance, and the better method of measuring
these importances is the one that gives the same importances for each
respondent in t=1 and t=2. In other words: I want to examine the test-retest
reliability of the two measures. Naturally, if the X(t=1,t=2)-correlation is
higher for a specific respondent than the Y(t=1,t=2)-correlation, then for
this respondent the method that yields the X-importances is more reliable.
All I want to do is to see if this holds for the whole sample as well...

Anyway, thank you again, I will think of your approach.

Ralph



Adam D. I. Kramer-3 wrote:
 
 Hi Ralph,
 
   I had the same problem you do a few months ago, and realized that
 the question I had (does time show a different effect for X than Y) was
 not best modeled as differences between correlations across individuals,
 but as whether time interacts with condition.
 
   I answered this question with
 library(nlme)
 lme(obs ~ cond*time, random=~cond*time|subj)
 
 ...where obs is the responses on the X or Y variable, cond is a factor of
 either X or Y, and subj is your subject variable. This fits a hierarchical
 linear model to the data. The relationship between X and time is
 significantly different from the relationship between Y and time if the
 cond:time fixed effect is significant.
 
 This approach makes better use of your data, because when you correlate
 the observations, you're effectively losing variability (because
 correlations are doubly standardized) as well as degrees of freedom (you
 have 9 df within each individual, but each correlation is only one number).
 
 --Adam
 
 On Sat, 6 Sep 2008, Ralph79 wrote:
 

 Dear R-Users,

 I am currently looking for a way to test the equality of two correlations
 that are related in a very special way. Let me describe the situation with
 an example.

 - There are 100 respondents, and there are 2 points in time, t=1 and t=2.

 - For each of the respondents and at each of the time points, I have
 information on 10 X-variables and on 10 Y-variables.

 - Based on this information, I calculate two correlations for each
 respondent: cor(X[t=1],X[t=2]) and cor(Y[t=1],Y[t=2]), with X and Y being
 the vectors of the corresponding 10 variables.

 - Now I get the average correlations over the whole sample using Fisher's
 Z-transformation, i.e. I have mean(cor(X[t=1],X[t=2])) and
 mean(cor(Y[t=1],Y[t=2])) and want to know if the mean correlations are
 significantly different!


 I haven't found any test that deals with exactly my situation. Therefore, I
 simply apply a paired t-test based on the individual z-correlations. From
 my point of view this should be ok, because of the z's normality. However, I
 am unsure if there is a better way to test the hypothesis that I am
 interested in?
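
The procedure described above (per-respondent correlations, Fisher z-transformation, paired t-test) can be sketched in base R as follows. The data are simulated purely for illustration, and the object names (X1, X2, Y1, Y2 with one row per respondent and one column per variable) are assumptions, not from the original post:

```r
set.seed(1)
n <- 100   # respondents
k <- 10    # variables per measure

# Simulated importances at t=1 and t=2 for both measures (illustration only)
X1 <- matrix(rnorm(n * k), n, k)
X2 <- X1 + matrix(rnorm(n * k, sd = 0.5), n, k)
Y1 <- matrix(rnorm(n * k), n, k)
Y2 <- Y1 + matrix(rnorm(n * k, sd = 1.0), n, k)

# One test-retest correlation per respondent and measure
rX <- sapply(seq_len(n), function(i) cor(X1[i, ], X2[i, ]))
rY <- sapply(seq_len(n), function(i) cor(Y1[i, ], Y2[i, ]))

# Fisher z-transformation is atanh(); tanh() transforms back
zX <- atanh(rX)
zY <- atanh(rY)

mean_rX <- tanh(mean(zX))   # average correlation for X
mean_rY <- tanh(mean(zY))   # average correlation for Y

# Paired t-test on the z-values, as described in the post
tt <- t.test(zX, zY, paired = TRUE)
print(tt)
```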

 I'd be grateful for any comment or hint.

 Thank you very much,

 Ralph

 -
 Ralph Wirth
 University Erlangen-Nuremberg, Chair of Statistics
 GfK Group, Department of Methods and Product Development

 -- 
 View this message in context:
 http://www.nabble.com/Test-for-equality-of-complicatedly-related-average-correlations-tp19346312p19346312.html
 Sent from the R help mailing list archive at Nabble.com.


 
 
 


-
Ralph Wirth
University Erlangen-Nuremberg, Chair of Statistics
GfK Group, Department of Methods and Product Development

-- 
View this message in context: 
http://www.nabble.com/Test-for-equality-of-complicatedly-related-average-correlations-tp19346312p19355825.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] loop

2008-09-07 Thread Davide Crapis

It's exactly what I was looking for.
Thanks a lot
-- 
View this message in context: 
http://www.nabble.com/loop-tp19346683p19356409.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Mode value

2008-09-07 Thread Jim Lemon

Carlos Morales wrote:

Hello everyone,


I would like to know if there is any function to calculate the mode value, or I 
have to build one to do it.

  

Hi Carlos,
If you mean the mode of a sample from a discrete distribution, try Mode 
in the prettyR package.
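
If installing a package is not an option, a base R sketch of the same idea is short. The function name sample_mode is made up for this illustration; it returns all tied modes, as character strings because they are taken from the names of a table:

```r
# Mode of a discrete sample: the value(s) with the highest frequency
sample_mode <- function(x) {
  counts <- table(x)
  names(counts)[counts == max(counts)]
}

sample_mode(c(1, 2, 2, 3))        # one mode
sample_mode(c("a", "b", "a", "b")) # two tied modes
```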


Jim



[R] R_USER - in which file should I include it?

2008-09-07 Thread Eduardo M. A. M.Mendes
Hello

I am a newbie.  I had my R upgraded from 2.7.1 to 2.7.2 and in doing so I
decided to install all 2.7 versions under c:\program files\R\2.7 from now on
(2.7.1 is located under .\2.7.1) 

Although I don't like the idea (I am running Vista), I have edited
etc\Renviron.site to contain:


R_USER=c:/Users/eduardo/Documents/R
R_LIBS_USER=c:/Users/eduardo/Documents/R/win-library/2.7

However, etc\Renviron.site didn't help to make R always start from the same
location, that is, c:/Users/eduardo/Documents/R.  So I wonder
whether someone from the list could help me to:

a) force R to start always from the same location
b) force R to install all new packages in the same location
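
As a first diagnostic, one can check from within a running session which of these settings R actually picked up. This is only a sketch for inspection; it does not change any setting:

```r
# Values of the two environment variables as seen by this R session
startup <- Sys.getenv(c("R_USER", "R_LIBS_USER"))
print(startup)

# Library trees R will search and install into (first entry is the default)
print(.libPaths())

# Directory the session started in
print(getwd())
```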


Many thanks

Ed

PS. Before sending this email, I read windows FAQ and browsed the archives
(too many posts in the subject!).



[R] Problem with starting and using R

2008-09-07 Thread Thomas Lo
Dear all,

I encountered a problem on starting and using the R v  2.7.2 installation on
my PC running Windows Vista and would appreciate your help.

When R was first started, the Rgui returned several error messages:

Error in structure(.Internal(Sys.getenv(as.character(x),
as.character(unset)$   unsupported conversion
Error in file.exists(name) : unsupported conversion in 'filenameToWchar'

In addition, a dialog box called 'Information' popped up with the following
message:

Fatal error: unable to restore saved data in .RData

On clicking 'OK', R closed immediately and the same thing occurs on
restarting R.

After checking for previous related messages online, I followed one of the
recommendations from before and appended --no-restore-data to the R shortcut
target line.  After that, R could start without the 'fatal error'.  However,
some functions such as 'help' and 'setwd' do not work:

e.g.
> help()
Error: could not find function "help"

> setwd(DirName)
Error in setwd(DirName) : unsupported conversion in 'filenameToWchar'

I then typed 'Sys.getlocale()' and got this:

> Sys.getlocale()
[1] "LC_COLLATE=Chinese (Traditional)_Hong Kong S.A.R..950;LC_CTYPE=Chinese
(Traditional)_Hong Kong S.A.R..950;LC_MONETARY=Chinese (Traditional)_Hong
Kong S.A.R..950;LC_NUMERIC=C;LC_TIME=Chinese (Traditional)_Hong Kong
S.A.R..950"

Setting LC_ALL=en in the shortcut target does not appear to work in this
case as I got

During startup - Warning message:
Setting LC_CTYPE=en failed

Furthermore, I tried the patched version of R 2.7.2 and the same problem
occurs.

I would be very grateful if anybody could help.  Many thx.

Thomas



Re: [R] Label 2 groups in PCA different colours

2008-09-07 Thread Pedro Mardones
Here is a simple approach to, for instance, plot the scores for PC1 and
PC2 using different colors:

scores <- prcomp(yourdata)$x
plot(scores[1:100,1], scores[1:100,2], pch = 20, col = "blue")
points(scores[101:200,1], scores[101:200,2], pch = 20, col = "red")
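
A variant of the same idea, sketched with simulated data: passing a vector of colours to a single plot() call lets the axis limits cover both groups. The data, the group labels A and B, and the colour choice are assumptions for illustration only:

```r
set.seed(42)

# Simulated stand-in for yourdata: 200 objects, 4 variables, two groups
yourdata <- rbind(matrix(rnorm(100 * 4), 100, 4),
                  matrix(rnorm(100 * 4, mean = 2), 100, 4))
group <- rep(c("A", "B"), each = 100)   # objects 1-100 and 101-200

scores <- prcomp(yourdata)$x

# Look up one colour per object from its group label
cols <- unname(c(A = "red", B = "black")[group])

plot(scores[, 1], scores[, 2], pch = 20, col = cols,
     xlab = "PC1", ylab = "PC2")
legend("topright", legend = c("A", "B"), pch = 20, col = c("red", "black"))
```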

PM

On Sat, Sep 6, 2008 at 11:44 PM, pgseye [EMAIL PROTECTED] wrote:

 Hi,

 I'm wanting to do a PCA on some data which is comprised of two different
 groups (to see how well the groups are discriminated). Is there a way to
 change the colour of the datapoints in a biplot so that I can easily see
 which group is which (eg objects 1-100, red, 101-200, black).

 Might be simple, but I'm new to R and can't seem to find how to do this.

 Thanks.

 Paul
 --
 View this message in context: 
 http://www.nabble.com/Label-2-groups-in-PCA-different-colours-tp19354077p19354077.html
 Sent from the R help mailing list archive at Nabble.com.





[R] cohen's kappa

2008-09-07 Thread Weiwei Shi
Dear all,

I have a question on Cohen's kappa:

Assume I have two datasets, one has 500 objects, 10 methods and the other,
1000 different objects, 20 different methods. Could I compare between the
two datasets to conclude the 10 methods are more concordant than the 20
ones by looking at some output, for example, cohen.kappa{concord} ?

One more, could anyone explain in brief, what's the difference between
kappa(Cohen) and kappa(Siegel)?

Thanks,


-- 
Weiwei Shi, Ph.D
Research Scientist
GeneGO, Inc.

Did you always know?
No, I did not. But I believed...
---Matrix III



Re: [R] ON MAC, how to copy a plot on to Word document?

2008-09-07 Thread John Kane
I think you need to save the plot and import it into Word.  AFAIK you can only 
copy and paste a plot in Windows.

Have a look at ?png  (There are other formats available)


--- On Sun, 9/7/08, asdfjkl; [EMAIL PROTECTED] wrote:

 From: asdfjkl; [EMAIL PROTECTED]
 Subject: [R]  ON MAC, how to copy a plot on to Word document?
 To: r-help@r-project.org
 Received: Sunday, September 7, 2008, 1:46 AM
 Yes, I don't know how to copy the plot on Mac and paste
 on to Word because
 you can't right click on the graph and say copy
 as metafile. 
 I'm so surprised I can't find any information about
 this anywhere on the
 Internet... 
 -- 
 View this message in context:
 http://www.nabble.com/ON-MAC%2C-how-to-copy-a-plot-on-to-Word-document--tp19354558p19354558.html
 Sent from the R help mailing list archive at Nabble.com.
 



Re: [R] XML - get node by name

2008-09-07 Thread Ajay ohri
Well, I'm not sure how it's done in R, but here's a way to do it in plain Excel.
http://decisionstats.com/2008/parsing-xml-files-easily/

Parsing XML files easily

To parse an XML (or KML or PMML) file easily without using any
complicated software, here is a piece of code that fits right into your
Excel sheet.

Just import this file using Excel, and then use the function
getElement after pasting the XML code into one cell.

xml-getelement

It simply reads the XML/KML code as a text string. Paste all the XML
code into one cell and pass the start and end markers (for example
start="<constraints>" and end="</constraints>" to get the value of
constraints in the XML code).

Then read the value into another cell using the getElement function.

Here's the code if you ever need it. Just paste it into the VB editor of
Excel to create the getElement function (if it is not there already), or
simply import the file from the link above.

Attribute VB_Name = "Module1"
Public Function getElement(xml As String, start As String, finish As String)
  For i = 1 To Len(xml)
    If Mid(xml, i, Len(start)) = start Then
      For j = i + Len(start) To Len(xml)
        If Mid(xml, j, Len(finish)) = finish Then
          getElement = Mid(xml, i + Len(start), j - i - Len(start))
          Exit Function
        End If
      Next j
    End If
  Next i
End Function

On Sun, Sep 7, 2008 at 1:52 PM, Antje [EMAIL PROTECTED] wrote:

 Hi there,

 I try to rewrite some Java-code with R. It deals with reading XML files. I 
 started with the XML package. In Java, I had a very useful method which gave 
 me a node by using:

 name of the node
 index of appearance
 start point: global (false) / local (true)

 So, I could do something like this.

 setCurrentChildNode("data", 0);
 getValueOfElement("val",1,true);
 --> gives 45

 setCurrentChildNode("data", 1);
 getValueOfElement("val",1,true);
 --> gives 11

 getValueOfElement("val",1,false);
 --> gives 45

 <root>
   <data loc="1">
     <val i="t1"> 22 </val>
     <val i="t2"> 45 </val>
   </data>
   <data loc="2">
     <val i="t1"> 44 </val>
     <val i="t2"> 11 </val>
   </data>
 </root>

 Now, I'd like to do something like this in R. Most important would be to 
 retrieve a node just by its name, not by the whole path. How is it possible?

 Can anybody help me with this issue?

 Antje




--
Regards,

Ajay Ohri
http://tinyurl.com/liajayohri


Re: [R] XML - get node by name

2008-09-07 Thread Dirk Eddelbuettel

On 7 September 2008 at 10:22, Antje wrote:
| I try to rewrite some Java-code with R. It deals with reading XML files. I 
[...]
| Now, I'd like to do something like this in R. Most important would be to 
| retrieve a node just by its name, not by the whole path. How is it possible?
| 
| Can anybody help me with this issue?

Have you looked at the XML package for R ?

Dirk

-- 
Three out of two people have difficulties with fractions.



[R] Fwd: request: most repeated sequnce

2008-09-07 Thread jim holtman
-- Forwarded message --
From: jim holtman [EMAIL PROTECTED]
Date: Sun, Sep 7, 2008 at 11:42 AM
Subject: Re: [R] request: most repeated sequnce
To: Muhammad Azam [EMAIL PROTECTED]


This should do it for you:

> x=c(1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,2,2,2,3,3,0,0,0,0,0,0,0,0,0,0,1,1,1,2,2,3,3,3,4,4,4,0,0,0,0,0,0,0,1,1,1,2,2,2,3,3,3,4,4,4,
+ 0,0,0,0,0,0,1,2,2,2,2,2,0,3,3,0,4,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0)
> x=array(x,dim=c(3,6,7))
> apply(x,3,function(.mat){
+
+ rows <- table(apply(.mat,1,function(z){
+ # remove the zeros
+ z <- z[z != 0]
+
+ paste(z,collapse=' ')
+ }))
+ # remove empty strings
+ rows <- rows[names(rows) != ""]
+
+ if (!is.null(rows)){
+ return(names(rows)[which.max(rows)])
+ } else return(NULL)
+  })
[[1]]
[1] "1"

[[2]]
[1] "1 2 3"

[[3]]
[1] "1 2 3 4"

[[4]]
[1] "1 2 3 4"

[[5]]
[1] "2 2 3 4"

[[6]]
character(0)

[[7]]
[1] "1"
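
The same idea can be written as a small standalone function, which is easier to paste into a script than the console transcript above. This is a sketch only: the name most_repeated_row, the small test array, and the choice to return NA for an all-zero slice are assumptions for illustration:

```r
# Most frequent row of a matrix after dropping zeros from each row.
# Rows that consist only of zeros are ignored.
most_repeated_row <- function(mat) {
  keys <- apply(mat, 1, function(z) paste(z[z != 0], collapse = " "))
  keys <- keys[keys != ""]                 # drop all-zero rows
  if (length(keys) == 0) return(NA_character_)
  counts <- table(keys)
  names(counts)[which.max(counts)]
}

# Tiny example: slice 1 has two "4 4" rows, slice 2 is all zeros
x <- array(0, dim = c(3, 4, 2))
x[1, 1:2, 1] <- c(4, 4)
x[2, 1:2, 1] <- c(4, 4)
x[3, 1, 1]   <- 1

res <- apply(x, 3, most_repeated_row)
print(res)
```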




On Sun, Sep 7, 2008 at 8:08 AM, Muhammad Azam [EMAIL PROTECTED] wrote:
 Dear Jim Holtman
 Thanks a lot for your help. The problem is still there. Please consider this
 set of values

 x=c(1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,2,2,2,3,3,0,0,0,0,0,0,0,0,0,0,1,1,1,2,2,3,3,3,4,4,4,0,0,0,0,0,0,0,1,1,1,2,2,2,3,3,3,4,4,4,

 0,0,0,0,0,0,1,2,2,2,2,2,0,3,3,0,4,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0)
  x=array(x,dim=c(3,6,7))
   apply(x,3,function(.mat){

 rows <- table(apply(.mat,1,function(z){
 # remove the zeros
 z <- z[z != 0]
 if (length(z) == 0) return(NULL)
 paste(z,collapse=' ')
 }))
 names(rows[which.max(rows)])
  })

 output is:
 Error in as.vector(x, mode) : invalid argument 'mode'


 Note: the obtained rows consist of all zeros should not take part in most
 repeated sequence process.

 best regards
 Muhammad Azam

 - Original Message 
 From: jim holtman [EMAIL PROTECTED]
 To: Muhammad Azam [EMAIL PROTECTED]
 Cc: R-help request [EMAIL PROTECTED]; R Help
 r-help@r-project.org
 Sent: Sunday, September 7, 2008 12:36:18 AM
 Subject: Re: [R] request: most repeated sequnce

 This may come closer since it removes the zeros before comparison:


 x=c(1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,2,2,2,3,3,0,0,0,0,0,0,0,0,0,0,1,1,1,2,2,3,3,3,4,4,4,0,0,0,0,0,0,0,1,1,1,2,2,2,3,3,3,4,4,4,
 + 0,0,0,0,0,0,1,2,2,2,2,2,0,3,3,0,4,4,0,0,0,0,0,0)
 x=array(x,dim=c(3,6,5))
 apply(x,3,function(.mat){
 +rows <- table(apply(.mat,1,function(z){
 +# remove the zeros
 +z <- z[z != 0]
 +if (length(z) == 0) return(NULL)
 +paste(z,collapse=' ')
 +}))
 +names(rows[which.max(rows)])
 + })
 [1] "1"       "1 2 3"   "1 2 3 4" "1 2 3 4" "2 2 3 4"





 On Sat, Sep 6, 2008 at 12:48 PM, Muhammad Azam [EMAIL PROTECTED] wrote:
 Dear R community
 Initially i thought my problem has been solved but one thing which i found
 e.g. if
 1. All the elements of a sector are zero e.g
 , , 7

      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    0    0    0    0    0    0    0    0    0     0
 [2,]    0    0    0    0    0    0    0    0    0     0
 [3,]    0    0    0    0    0    0    0    0    0     0
 [4,]    0    0    0    0    0    0    0    0    0     0
 [5,]    0    0    0    0    0    0    0    0    0     0

 2. Majority of the rows consist of zeros e.g.
 , , 5

      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    4    4    0    0    0    0    0    0    0     0
 [2,]    4    4    0    0    0    0    0    0    0     0
 [3,]    0    0    0    0    0    0    0    0    0     0
 [4,]    0    0    0    0    0    0    0    0    0     0
 [5,]    0    0    0    0    0    0    0    0    0     0

 Actually, zeros are not my values. I get values and fill the remaining parts
 with zeros, like x=array(0,dim=c(3,6,5)). Now, according to the first
 strategy, 0 0 0 0 0 0 0 0 0 0 is the most repeated sequence of rows in both
 of the above cases. But I don't want to consider rows where all elements are
 zeros, and I am interested in getting 4 4 0 0 0 0 0 0 0 0 or just 4 4 in
 case 2.
 Thanks and best regards

 Muhammad Azam





 - Original Message 
 From: jim holtman [EMAIL PROTECTED]
 To: Muhammad Azam [EMAIL PROTECTED]
 Cc: R Help r-help@r-project.org; R-help request
 [EMAIL PROTECTED]
 Sent: Saturday, September 6, 2008 2:39:19 PM
 Subject: Re: [R] request: most repeated sequnce

 Here is a start.  You can delete the zeros:


 x=c(1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,2,2,2,3,3,0,0,0,0,0,0,0,0,0,0,1,1,1,2,2,3,3,3,4,4,4,0,0,0,0,0,0,0,1,1,1,2,2,2,3,3,3,4,4,4,
 + 0,0,0,0,0,0,1,2,2,2,2,2,0,3,3,0,4,4,0,0,0,0,0,0)
 x=array(x,dim=c(3,6,5))
 apply(x,3,function(.mat){
 +rows <- table(apply(.mat,1,function(z){
 +paste(z,collapse=' ')
 +}))
 +names(rows[which.max(rows)])
 + })
 [1] "1 0 0 0 0 0" "1 2 3 0 0 0" "1 2 3 4 0 0" "1 2 3 4 0 0" "2 2 3 4 0 0"


 On Sat, Sep 6, 2008 at 4:54 AM, Muhammad Azam [EMAIL PROTECTED] wrote:
 

Re: [R] XML - get node by name

2008-09-07 Thread Gabor Grothendieck
In particular try this:

> Lines <- '
+ <root>
+  <data loc="1">
+    <val i="t1"> 22 </val>
+    <val i="t2"> 45 </val>
+  </data>
+  <data loc="2">
+    <val i="t1"> 44 </val>
+    <val i="t2"> 11 </val>
+  </data>
+ </root>
+ '
>
> library(XML)
> doc <- xmlTreeParse(Lines, asText = TRUE, trim = TRUE, useInternalNodes = TRUE)
> root <- xmlRoot(doc)
>
> data1 <- getNodeSet(root, "//data")[[1]]
> xmlValue(getNodeSet(data1, "//val")[[1]])
[1] " 22 "



On Sun, Sep 7, 2008 at 11:42 AM, Dirk Eddelbuettel [EMAIL PROTECTED] wrote:

 On 7 September 2008 at 10:22, Antje wrote:
 | I try to rewrite some Java-code with R. It deals with reading XML files. I
 [...]
 | Now, I'd like to do something like this in R. Most important would be to
 | retrieve a node just by its name, not by the whole path. How is it possible?
 |
 | Can anybody help me with this issue?

 Have you looked at the XML package for R ?

 Dirk

 --
 Three out of two people have difficulties with fractions.





Re: [R] ON MAC, how to copy a plot on to Word document?

2008-09-07 Thread Dr Eberhard W Lisse

I just CMD-C'd it and pasted it into OpenOffice with CMD-V.

el

On 07 Sep 2008, at 16:57 , John Kane wrote:

I think you need to save the plot and import it into Word.  AFAIK  
you can only copy and paste a plot in Windows.


Have a look at ?png  (There are other formats available)




Re: [R] Axis tick label format and rotation

2008-09-07 Thread hadley wickham
Hi Kurt,

 Please tell me how to format data in a data frame so when currency amount is 
 displayed in a chart the axis tick labels contain leading $ signs.

The easiest way is to add a custom scale:
vals <- seq(0, 100, by = 10)
qplot(...) + scale_x_continuous(breaks = vals, labels = paste("$",
vals, sep = ""))

 Please also tell me if it is possible to rotate x axis labels using ggplot2.

Not easily, but there will be in the next version.

Hadley

-- 
http://had.co.nz/



Re: [R] using nls to fit a curve to data

2008-09-07 Thread jpl

Thanks Ben!

Switching over to the gamma pdf and using algorithm="plinear" did the
trick.

jpl

-- 
View this message in context: 
http://www.nabble.com/using-nls-to-fit-a-curve-to-data-tp19332210p19360761.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] XML - get node by name

2008-09-07 Thread Duncan Temple Lang
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Hi Antje

Well, the XML package gives you a variety of ways to parse
an XML document and manipulate it in R.
Perhaps the approach that best matches the Java-style you
outline is to use XPath to access nodes.
To do this, you use
  doc = xmlTreeParse("filename.xml", useInternalNodes = TRUE)

and then access the elements of interest with XPath queries, e.g.
to get the value of the second val element within each data
element, use

  xpathApply(doc, "//data", function(n) xmlValue(n[[2]]))

To get the first val node in the first data you could use

  doc[ "//data/val" ] [[1]]

or

  doc[[ "//data[1]/val[1]" ]]


(Note the indexing/subsetting is being done in different languages.)


Being able to access a node by just its name is convenient,
but it may not be adequate. You may pick up too many matching nodes.
So XPath is a powerful way to be able to use simplicity when it is
adequate and more explicit constraints on the path when more
specificity is necessary.  And XPath is a widespread standard
mechanism for XML rather than specific to R or Java.
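
To connect this with the "global" versus "local" lookup in the original question, here is a sketch using the XML package on the sample document from the thread. The helper names local_val and global_val are made up for this illustration. The point it relies on is standard XPath: a path starting with "//" searches from the document root even when it is evaluated on a node, while ".//" searches below that node only. Indices are 1-based in R, so index 2 here corresponds to index 1 in the Java calls.

```r
library(XML)

txt <- '<root>
  <data loc="1">
    <val i="t1"> 22 </val>
    <val i="t2"> 45 </val>
  </data>
  <data loc="2">
    <val i="t1"> 44 </val>
    <val i="t2"> 11 </val>
  </data>
</root>'

doc <- xmlParse(txt, asText = TRUE)
data_nodes <- getNodeSet(doc, "//data")

# "Local" lookup: the index-th element called `name` below a given node
local_val <- function(node, name, index) {
  hits <- getNodeSet(node, paste0(".//", name))
  trimws(xmlValue(hits[[index]]))
}

# "Global" lookup: the index-th element called `name` in the whole document
global_val <- function(doc, name, index) {
  hits <- getNodeSet(doc, paste0("//", name))
  trimws(xmlValue(hits[[index]]))
}

local_val(data_nodes[[1]], "val", 2)   # second val inside the first data
local_val(data_nodes[[2]], "val", 2)   # second val inside the second data
global_val(doc, "val", 2)              # second val in the document
```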

HTH,

  D.


Antje wrote:
 Hi there,
 
 I try to rewrite some Java-code with R. It deals with reading XML files.
 I started with the XML package. In Java, I had a very useful method
 which gave me a node by using:
 
 name of the node
 index of appearance
 start point: global (false) / local (true)
 
 So, I could do something like this.
 
 setCurrentChildNode("data", 0);
 getValueOfElement("val",1,true);
 --> gives 45
 
 setCurrentChildNode("data", 1);
 getValueOfElement("val",1,true);
 --> gives 11
 
 getValueOfElement("val",1,false);
 --> gives 45
 
 <root>
   <data loc="1">
     <val i="t1"> 22 </val>
     <val i="t2"> 45 </val>
   </data>
   <data loc="2">
     <val i="t1"> 44 </val>
     <val i="t2"> 11 </val>
   </data>
 </root>
 
 Now, I'd like to do something like this in R. Most important would be to
 retrieve a node just by its name, not by the whole path. How is it
 possible?
 
 Can anybody help me with this issue?
 
 Antje
 
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.9 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkjD4osACgkQ9p/Jzwa2QP7ZUACfYpsezY4T2AeKb3G7Jo6Vr0N0
RmwAnAtKCY5s8vBoDx7C1DFP24eveCtk
=XWJ8
-END PGP SIGNATURE-



[R] an error to call 'gee' function in R

2008-09-07 Thread Qinglin Wu
Dear List:

I get an error when I call the 'gee' function, and I can neither solve nor
explain it.  There are no errors when I use the 'geeglm' function.  Both
functions fit the GEE model.  The project supervisor recommends that I use the
'gee' function, but I cannot explain to him why this error happens.  Would you
help me solve this problem?  I appreciate your help.

In this project I will use 'gee' or 'geeglm' and 'glmer' to fit the
simulated multivariate count responses.  I generated the data like this:

Set β0 = β1 = 1, μ0 = 3, and n = 50.
For each 1 ≤ i ≤ n,
Simulate xi from N (1, 1).
Simulate zi0 and zit from
zi0 follows i.d. Poisson (μ0) ,
zit | xi follows i.d. Poisson (μit) , 1 ≤ t ≤ 3,
log (μit) = log(E (zit | xi)) = β0t + xiβ1t = 1+xi.
Let yit = zi0 + zit, 1 ≤ t ≤ 3.

So my data frame, let me call it 'simdata', the first 10 rows look like this:
  id y.1 y.2 y.3   x
   1   3   5   6 -0.06588626
   2   6   7   6 -0.08265981
   3   6   8  13  0.58307719
   4  22  21  28  2.21099940
   5   5  12   8  1.06299869
   6   8  21  24  1.47615784
   7  11   8   9  0.83748390
   8  16  15  16  1.67011313
   9   9   7   7 -0.14181264
  10  31  37  40  2.56751453


This is the longitudinal data.  I will change its shape to analyze it.  The 
changed 'newdata' looks like this:
  id           x time  y
   1 -0.06588626    1  3
   2 -0.08265981    1  6
   3  0.58307719    1  6
   4  2.21099940    1 22
   5  1.06299869    1  5
   6  1.47615784    1  8
   7  0.83748390    1 11
   8  1.67011313    1 16
   9 -0.14181264    1  9
  10  2.56751453    1 31
 ...
   1 -0.06588626    2  5
   2 -0.08265981    2  7
   3  0.58307719    2  8
   4  2.21099940    2 21
   5  1.06299869    2 12
   6  1.47615784    2 21
   7  0.83748390    2  8
   8  1.67011313    2 15
   9 -0.14181264    2  7
  10  2.56751453    2 37
 ...
   1 -0.06588626    3  6
   2 -0.08265981    3  6
   3  0.58307719    3 13
   4  2.21099940    3 28
   5  1.06299869    3  8
   6  1.47615784    3 24
   7  0.83748390    3  9
   8  1.67011313    3 16
   9 -0.14181264    3  7
  10  2.56751453    3 40
 ...


My response 'y' depends on x, so the observations are not independent.  What 
does the argument 'corstr' of the function mean?  I tried all of its 
choices, but the error was still there.  Here is the call I used in my 
program:

 mfit1 <- 
gee(y~x,data=newdata,family=poisson(link=log),id=id,corstr="exchangeable")

 GEE:  GENERALIZED LINEAR MODELS FOR DEPENDENT DATA
 gee S-function, version 4.13 modified 98/01/27 (1998)

 Model:
 Link:  Logarithm
 Variance to Mean Relation: Poisson
 Correlation Structure: Exchangeable

 Call:
 gee(formula = y ~ x, id = id, data = newdata, family = poisson(link = log),
    corstr = "exchangeable")

 Number of observations :  150

 Maximum cluster size   :  1

 Coefficients:
 (Intercept)   x
  1.5849653   0.7937203

 Estimated Scale Parameter:  1.162505
 Number of Iterations:  1

 Working Correlation[1:4,1:4]
 Error in print(x$working.correlation[1:4, 1:4], digits = digits) :
  subscript out of bounds



Is this kind of data not suitable for the function 'gee'?  When I tested 
these two functions on the R data set 'warpbreaks' they both worked perfectly, 
although some of the returned objects were different. I used the following:
 1.  summary(gee(breaks ~ tension, id=wool, data=warpbreaks, 
corstr="exchangeable"))
 2.  summary(geeglm(breaks ~ tension, id=wool, data=warpbreaks, 
corstr="exchangeable"))

The first one is from the example in the ?gee help file.

I will attach part of my program as an .R file.  You can execute it in R.  
Thanks a lot.


 I appreciate your help.


 Best regards.

 Sincerely,
 Cynthia Wu





Re: [R] referring to a group of vectors without explicit enumeration

2008-09-07 Thread shalu

Thanks very much. This is what I implemented. I found a similar example
elsewhere too. 
It works fine now.
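
The list approach from the quoted reply can also carry the names y1 to y25, so each vector can be reached by name as well as by position. This is only a minimal sketch; the names and the values appended here are made up for illustration.

```r
# A list of 25 vectors, named y1 ... y25 (names are illustrative)
y <- vector("list", 25)
names(y) <- paste0("y", 1:25)

# Start each vector with a single value
for (i in seq_along(y)) y[[i]] <- i

# Later, grow an individual vector by name (or by index)
y[["y3"]] <- c(y[["y3"]], 10, 20)

lengths(y)[1:5]
```

Because every vector lives in one list, later steps can loop over `y` with lapply() or sapply() instead of enumerating y1, y2, ... by hand.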



jholtman wrote:
 
 I would suggest that you use a list to store the values since it is
 easier to create and reference:
 
 output <- list()
 for (i in 1:10) output[[i]] <- seq(i)

 output
 [[1]]
 [1] 1
 
 [[2]]
 [1] 1 2
 
 [[3]]
 [1] 1 2 3
 
 [[4]]
 [1] 1 2 3 4
 
 [[5]]
 [1] 1 2 3 4 5
 
 [[6]]
 [1] 1 2 3 4 5 6
 
 [[7]]
 [1] 1 2 3 4 5 6 7
 
 [[8]]
 [1] 1 2 3 4 5 6 7 8
 
 [[9]]
 [1] 1 2 3 4 5 6 7 8 9
 
 [[10]]
  [1]  1  2  3  4  5  6  7  8  9 10
 

 
 
 On Sat, Sep 6, 2008 at 5:33 PM, shalu [EMAIL PROTECTED] wrote:

 I am trying to define 25 vectors of varying lengths, say y1 to y25 in a
 loop,
 and then store the results of some computations in them. My problem is
 about
 using some sort of concatenation for names. For example, instead of
 initializing each of y1 through y25, I would like to do it in a loop.
 Similar to cat and paste for texts, is there anyway of using yi for the
 vector name where i ranges from 1 to 25, so ultimately it refers to the
 vector y1,..,y25?
 Varying lengths is not a problem. To start with each has only length 1
 and
 then I will be adding to each vector based on some results.

 
 
 
 -- 
 Jim Holtman
 Cincinnati, OH
 +1 513 646 9390
 
 What is the problem that you are trying to solve?
 
 
 



[R] Regression with nominal data

2008-09-07 Thread soeren . vogel

Hi,

y is nominal (3 categories), x1 to 3 is scale. What I want is a  
regression, showing the probability to fall in one of the three  
categories of y according to the x. How can I perform such a  
regression in R?


Thanks for your help

Sören



Re: [R] Regression with nominal data

2008-09-07 Thread Dimitris Rizopoulos

check:

help(multinom, package = nnet)
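
As a rough illustration of what that help page describes, the sketch below fits a multinomial logit model and returns one probability per category. The built-in iris data are used purely as a stand-in (a 3-level nominal response and three numeric predictors); they are not the poster's data.

```r
library(nnet)  # recommended package shipped with most R installations

# Species plays the role of the nominal y, the measurements the role of x1..x3
fit <- multinom(Species ~ Sepal.Length + Sepal.Width + Petal.Length,
                data = iris, trace = FALSE)

# One row per observation, one column per category; each row sums to 1
probs <- predict(fit, newdata = head(iris), type = "probs")
round(probs, 3)
```

The coefficients are log-odds relative to the first level of the response, so the probabilities from predict() are usually the easier thing to report.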


I hope it helps.

Best,
Dimitris


[EMAIL PROTECTED] wrote:

Hi,

y is nominal (3 categories), x1 to 3 is scale. What I want is a 
regression, showing the probability to fall in one of the three 
categories of y according to the x. How can I perform such a regression 
in R?


Thanks for your help

Sören




--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043399
Fax: +31/(0)10/7044657



Re: [R] XML - get node by name

2008-09-07 Thread Gabor Grothendieck
On Sun, Sep 7, 2008 at 12:10 PM, Gabor Grothendieck
[EMAIL PROTECTED] wrote:
 In particular try this:

 Lines <- '
 <root>
   <data loc="1">
     <val i="t1"> 22 </val>
     <val i="t2"> 45 </val>
   </data>
   <data loc="2">
     <val i="t1"> 44 </val>
     <val i="t2"> 11 </val>
   </data>
 </root>
 '

 library(XML)
 doc <- xmlTreeParse(Lines, asText = TRUE, trim = TRUE, useInternalNodes = TRUE)
 root <- xmlRoot(doc)

 data1 <- getNodeSet(root, "//data")[[1]]
 xmlValue(getNodeSet(data1, "//val")[[1]])
 [1] " 22 "


The last line should be the following (although in this case it
actually gives the same answer):

xmlValue(getNodeSet(data1, "val")[[1]])




 On Sun, Sep 7, 2008 at 11:42 AM, Dirk Eddelbuettel [EMAIL PROTECTED] wrote:

 On 7 September 2008 at 10:22, Antje wrote:
 | I try to rewrite some Java-code with R. It deals with reading XML files. I
 [...]
 | Now, I'd like to do something like this in R. Most important would be to
 | retrieve a node just by its name, not by the whole path. How is it 
 possible?
 |
 | Can anybody help me with this issue?

 Have you looked at the XML package for R ?

 Dirk

 --
 Three out of two people have difficulties with fractions.






Re: [R] XML - get node by name

2008-09-07 Thread Antje

Thanks a lot to Gabor and Duncan!

I didn't know that XPath is a standard. I'll give it a deeper look to better 
understand it.


Oh, I guess I understand a bit more

xpathApply(doc, "//val", function(n) xmlValue(n))

would search globally for all nodes named val and return their values :-)
So that's exactly what I was looking for: not caring about the exact location 
of a node.

I think in my case it should be okay to search for nodes just by their names.

Thanks again!

@ Ajay: Sorry, but I was looking for a solution with R
@ Dirk: I already used the XML package but didn't know the possibilities to 
access data as I was used to.




Antje schrieb:

Hi there,

I try to rewrite some Java-code with R. It deals with reading XML files. 
I started with the XML package. In Java, I had a very useful method 
which gave me a node by using:


name of the node
index of appearance
start point: global (false) / local (true)

So, I could do something like this.

setCurrentChildNode(data, 0);
getValueOfElement(val,1,true);
-- gives 45

setCurrentChildNode(data, 1);
getValueOfElement(val,1,true);
-- gives 11

getValueOfElement(val,1,false);
-- gives 45

<root>
  <data loc="1">
    <val i="t1"> 22 </val>
    <val i="t2"> 45 </val>
  </data>
  <data loc="2">
    <val i="t1"> 44 </val>
    <val i="t2"> 11 </val>
  </data>
</root>

Now, I'd like to do something like this in R. Most important would be to 
retrieve a node just by its name, not by the whole path. How is it 
possible?


Can anybody help me with this issue?

Antje






[R] Averaging 'blocks' of data

2008-09-07 Thread Steve Murray


Dear all,

I have a large dataset which I hope to reduce in size, to make it more useable. 
I hope to do this by taking an average of each 60 x 60 block of values and 
forming a new data frame out of the averaged values.

How would I go about taking averages of 60 x 60 'blocks' in R, and cycling 
through the whole dataset, recording each calculated value in a new table/data 
frame?

Many thanks for any advice offered.

Steve



Re: [R] XML - get node by name

2008-09-07 Thread Gabor Grothendieck
On Sun, Sep 7, 2008 at 2:56 PM, Antje [EMAIL PROTECTED] wrote:
 Thanks a lot to Gabor and Duncan!

 I didn't know that XPath is a standard. I'll give it a deeper look to better
 understand it.

 Oh, I guess I understand a bit more

 xpathApply(doc, "//val", function(n) xmlValue(n))

or just

xpathApply(doc, "//val", xmlValue)



 would search globally for all nodes named val and return its values :-)
 So that's excactly what I was looking for. Not caring about the exact
 location of a node.
 I think, in my case it should be okay, to parse for nodes just by their
 names.

 Thanks again!

 @ Ajay: Sorry, but I was looking for a solution with R
 @ Dirk: I already used the XML package but didn't know the possibilities to
 access data as I was used to.



 Antje schrieb:

 Hi there,

 I try to rewrite some Java-code with R. It deals with reading XML files. I
 started with the XML package. In Java, I had a very useful method which gave
 me a node by using:

 name of the node
 index of appearance
 start point: global (false) / local (true)

 So, I could do something like this.

 setCurrentChildNode(data, 0);
 getValueOfElement(val,1,true);
 -- gives 45

 setCurrentChildNode(data, 1);
 getValueOfElement(val,1,true);
 -- gives 11

 getValueOfElement(val,1,false);
 -- gives 45

 <root>
   <data loc="1">
     <val i="t1"> 22 </val>
     <val i="t2"> 45 </val>
   </data>
   <data loc="2">
     <val i="t1"> 44 </val>
     <val i="t2"> 11 </val>
   </data>
 </root>

 Now, I'd like to do something like this in R. Most important would be to
 retrieve a node just by its name, not by the whole path. How is it possible?

 Can anybody help me with this issue?

 Antje





Re: [R] Averaging 'blocks' of data

2008-09-07 Thread Gabor Grothendieck
This was answered last month:

http://tolstoy.newcastle.edu.au/R/e4/help/08/08/19091.html

On Sun, Sep 7, 2008 at 3:32 PM, Steve Murray [EMAIL PROTECTED] wrote:


 Dear all,

 I have a large dataset which I hope to reduce in size, to make it more 
 useable. I hope to do this by taking an average of each 60 x 60 block of 
 values and forming a new data frame out of the averaged values.

 How would I go about taking averages of 60 x 60 'blocks' in R, and cycling 
 through the whole dataset, recording each calculated value in a new 
 table/data frame?

 Many thanks for any advice offered.

 Steve





Re: [R] Averaging 'blocks' of data

2008-09-07 Thread Dylan Beaudette
On Sun, Sep 7, 2008 at 12:32 PM, Steve Murray [EMAIL PROTECTED] wrote:


 Dear all,

 I have a large dataset which I hope to reduce in size, to make it more 
 useable. I hope to do this by taking an average of each 60 x 60 block of 
 values and forming a new data frame out of the averaged values.

what does the data look like? vector / matrix / list ?


 How would I go about taking averages of 60 x 60 'blocks' in R, and cycling 
 through the whole dataset, recording each calculated value in a new 
 table/data frame?

some form of apply(), tapply(), mapply(), or lapply() would probably
do what you want

 Many thanks for any advice offered.

 Steve


Here is a start:

# step 1. too much data: 10x10 matrix
m <- matrix(runif(100), ncol=10)

# step 2. reduce down to a 10x1 vector, averaging-by-row:
apply(m, 1, mean)

# step 3 profit.

Dylan



Re: [R] Averaging 'blocks' of data

2008-09-07 Thread Steve Murray

Gabor - thanks for your suggestion... I had checked the previous post, but I 
found (as a new user of R) this approach to be too complicated and I had 
problems gaining the correct output values. If there is a simpler way of doing 
this, then please feel free to let me know.

Dylan - thanks, your approach is a good start. In answer to your questions, my 
data are 43200 columns and 16800 rows as a data frame - I will probably have to 
read the dataset in segments though, as it won't fit into the memory!  I've 
been able to follow your example - how would I be able to apply this technique 
for finding the average of each 60 x 60 block?

Any other suggestions are of course welcome!

Many thanks again,

Steve



[R] how to draw a vertical line from points to x-axis

2008-09-07 Thread Anny Huang
Hello,

I want to know how to draw a line connecting each point to the x-axis
perpendicularly (i.e. a vertical line).
abline(v=...) seems not to work for my purpose, because it runs over the
data point. Can anyone help? Thanks.

Anny



[R] creating a vignette

2008-09-07 Thread Edna Bell
Dear R Gurus:

How would I create a vignette, please?

Why would a vignette be better than examples, please?

Thanks,
Edna Bell



Re: [R] restructuring datset problem

2008-09-07 Thread jim holtman
This should do it for you:

> x
  CODE NAME
1    3  aaa
2    3  aab
3    3  aac
4    4  bba
5    4  bbb
6    4  bbc
7    4  bbd
8    5  cca
9    5  ccb
> x.s <- split(x$NAME, x$CODE)
> maxLine <- max(table(x$CODE))
> # pad out the lines
> x.pad <- lapply(x.s, function(line){
+     # convert to character
+     line <- as.character(line)
+     length(line) <- maxLine
+     line
+ })
> as.data.frame(do.call(rbind, x.pad))
   V1  V2   V3   V4
3 aaa aab  aac <NA>
4 bba bbb  bbc  bbd
5 cca ccb <NA> <NA>


On Sun, Sep 7, 2008 at 2:23 PM, Gellrich  Mario
[EMAIL PROTECTED] wrote:
 Hi,

 I've got a question regarding the restructuring of a data set. What I have 
 are municipality zip-codes and the names of 5'000 built-up areas within 
 municipalities. The following example shows, what I would like to do:

 Input (Zip-Codes and Names):

 # CODE NAME
 #1   3  aaa
 #2   3  aab
 #3   3  aac
 #4   4  bba
 #5   4  bbb
 #6   4  bbc
 #7   4  bbd
 #8   5  cca
 #9   5  ccb

 Desired Output (Zip-Codes and restructured names)

 #  CODE  V2    V3    V4    V5
 #1  3    aaa   aab   aac   NA
 #2  4    bba   bbb   bbc   bbd
 #3  5    cca   ccb   NA    NA

 I thought about this problem for several hours and tried functions like 
 aggregate() and t() in combination with for-loops but didn't arrive at the 
 output above. Can anybody help me?

 Best regards,

 Mario







-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] how to draw a vertical line from points to x-axis

2008-09-07 Thread Peter Alspach
Anny

Here's one way:

plot(0:10, 0:10, pch=16)
lines(rep(0:10, each=3), t(matrix(c(0:10, rep(c(0,NA), each=11)),
ncol=3))) 

HTH 

Peter Alspach


 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Anny Huang
 Sent: Monday, 8 September 2008 8:49 a.m.
 To: r-help@r-project.org
 Subject: [R] how to draw a vertical line from points to x-axis
 
 Hello,
 
 I want to know how to draw a line connecting each point to 
 the x-axis perpendicularly (i.e. a vertical line).
 abline(v=...) seems not to work for my purpose, because it 
 runs over the data point. Can anyone help? Thanks.
 
 Anny
 
 



Re: [R] Averaging 'blocks' of data

2008-09-07 Thread jim holtman
Here is a way to do it by reading in 60 lines at a time and computing the means:

# create some test data
n <- 360
x <- matrix(runif(360*16800), nrow=16800)
cat(x, file="/tempxx.txt")


# now process the data 60 lines at a time, averaging each 60x60 block
result <- matrix(0, nrow=6, ncol=280)
nextLine <- 1  # next output in the result
# create a list of indices to use to partition the input matrix
colIndex <- split(seq(16800), (seq(16800) - 1) %/% 60)
input <- file("/tempxx.txt", "r")
while (TRUE){
    # use 'scan' to read in 60 lines at a time
    block <- scan(input, what=0, n=60*16800)
    if (length(block) != 60 * 16800) break  # exit if done
    # convert to a matrix
    block <- matrix(block, nrow=60, byrow=TRUE)
    # compute the mean and store it
    result[nextLine,] <- sapply(colIndex, function(.blk){
        mean(block[, .blk])
    })
    nextLine <- nextLine + 1
}
close(input)  # release the file connection when done
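
For a matrix (or one segment of the file) that does fit in memory, the same block averaging can be sketched without an explicit loop. The helper name `block_mean` is hypothetical, and the sketch assumes the row and column counts are exact multiples of the block size and that there are no missing values.

```r
# Mean of each b x b block of a numeric matrix (hypothetical helper)
block_mean <- function(m, b) {
  stopifnot(nrow(m) %% b == 0, ncol(m) %% b == 0)
  row_grp <- (seq_len(nrow(m)) - 1) %/% b   # block index of every row
  col_grp <- (seq_len(ncol(m)) - 1) %/% b   # block index of every column
  sums <- rowsum(m, row_grp)                # add up rows within each row block
  sums <- t(rowsum(t(sums), col_grp))       # then columns within each column block
  sums / (b * b)                            # block sums -> block means
}

# Small check: a 4 x 4 matrix reduced to 2 x 2 block means
m <- matrix(1:16, nrow = 4)
block_mean(m, 2)
```

With the real data, `b` would be 60 and `m` would be whatever slab of 60 (or a multiple of 60) rows has just been read.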


On Sun, Sep 7, 2008 at 4:46 PM, Steve Murray [EMAIL PROTECTED] wrote:

 Gabor - thanks for your suggestion... I had checked the previous post, but I 
 found (as a new user of R) this approach to be too complicated and I had 
 problems gaining the correct output values. If there is a simpler way of 
 doing this, then please feel free to let me know.

 Dylan - thanks, your approach is a good start. In answer to your questions, 
 my data are 43200 columns and 16800 rows as a data frame - I will probably 
 have to read the dataset in segments though, as it won't fit into the memory! 
  I've been able to follow your example - how would I be able to apply this 
 technique for finding the average of each 60 x 60 block?

 Any other suggestions are of course welcome!

 Many thanks again,

 Steve





-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



[R] R_USER - in which file should I include it?

2008-09-07 Thread Eduardo M. A. M.Mendes
Hello

I am a newbie.  I had my R upgraded from 2.7.1 to 2.7.2 and in doing so I
decided to install all 2.7 versions under c:\program files\R\2.7 from now
on (2.7.1 is located under .\2.7.1) 

Although I don't like the idea (I am running Vista), I have edited
etc\Renviron.site to contain:


R_USER=c:/Users/eduardo/Documents/R
R_LIBS_USER=c:/Users/eduardo/Documents/R/win-library/2.7

As far as R starting always from the same location, that is,
c:/Users/eduardo/Documents/R, etc\Renviron.site didn't help.  So I wonder
whether someone from the list could help me to:

a) force R to start always from the same location
b) force R to install all new packages in the same location


Many thanks

Ed

PS. Before sending this email, I read windows FAQ and browsed the archives
(too many posts in the subject!).



Re: [R] how to draw a vertical line from points to x-axis

2008-09-07 Thread Barry Rowlingson
2008/9/7 Anny Huang [EMAIL PROTECTED]:
 Hello,

 I want to know how to draw a line connecting each point to the x-axis
 perpendicularly (i.e. a vertical line).
 abline(v=...) seems not to work for my purpose, because it runs over the
 data point. Can anyone help? Thanks.


 If your x-axis is at y=zero then plot with type='h' will do this:

   plot(1:10,runif(10),type='h',ylim=c(0,1))

 but it will draw lines *up* if the value is negative:

   plot(1:10,(1:10)-5,type='h')

 Or do you really want the lines to come right down to the axis line?
In which case a modified version of Peter Alspach's solution which
goes down to the limit of the plot instead of zero should work. See
help(par) for what par()$usr is all about.

 y= 6+0:10
 x=0:10
 plot(x,y,pch=16,ylim=c(-2,17))
 lines(rep(x,each=3),t(matrix(c(y,rep(c(par()$usr[3],NA),each=11)),ncol=3)))
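
An alternative sketch of the same idea uses segments(), which draws one line per point and avoids building the NA-separated matrix. The variable name `base` is made up here; it holds the bottom edge of the plot region taken from par("usr").

```r
x <- 0:10
y <- 6 + 0:10
plot(x, y, pch = 16, ylim = c(-2, 17))

# par("usr") is c(x1, x2, y1, y2) in user coordinates; element 3 is the bottom edge
base <- par("usr")[3]

# One vertical segment from the bottom of the plot up to each point
segments(x0 = x, y0 = base, x1 = x, y1 = y)
```

To stop the lines at y = 0 instead of the plot edge, `y0 = 0` can be used in place of `base`.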

Barry



Re: [R] restructuring datset problem

2008-09-07 Thread Gabor Grothendieck
Try this:

> # read in data ensuring NAME is character, not factor
> Lines <- " CODE NAME
+ 1    3  aaa
+ 2    3  aab
+ 3    3  aac
+ 4    4  bba
+ 5    4  bbb
+ 6    4  bbc
+ 7    4  bbd
+ 8    5  cca
+ 9    5  ccb
+ "
> DF <- read.table(textConnection(Lines), header = TRUE, as.is = TRUE)
>
> DF$seq = ave(DF$CODE, DF$CODE, FUN = seq_along)
> tapply(DF$NAME, DF[c("CODE", "seq")], c)
    seq
CODE 1     2     3     4
   3 "aaa" "aab" "aac" NA
   4 "bba" "bbb" "bbc" "bbd"
   5 "cca" "ccb" NA    NA


On Sun, Sep 7, 2008 at 2:23 PM, Gellrich  Mario
[EMAIL PROTECTED] wrote:
 Hi,

 I've got a question regarding the restructuring of a data set. What I have 
 are municipality zip-codes and the names of 5'000 built-up areas within 
 municipalities. The following example shows, what I would like to do:

 Input (Zip-Codes and Names):

 # CODE NAME
 #1   3  aaa
 #2   3  aab
 #3   3  aac
 #4   4  bba
 #5   4  bbb
 #6   4  bbc
 #7   4  bbd
 #8   5  cca
 #9   5  ccb

 Desired Output (Zip-Codes and restructured names)

 #  CODE  V2    V3    V4    V5
 #1  3    aaa   aab   aac   NA
 #2  4    bba   bbb   bbc   bbd
 #3  5    cca   ccb   NA    NA

 I thought about this problem for several hours and tried functions like 
 aggregate() and t() in combination with for-loops but didn't arrive at the 
 output above. Can anybody help me?

 Best regards,

 Mario







[R] run optim() on a list

2008-09-07 Thread Weidong Gu
Hi,

 

I am at my wit's end trying to figure out how to run the optim function on
a list.

Basically, I have a data set of three columns: Site, Pool and
Positivity (the full data set is copied at the end). I want to run
the maximum likelihood estimation separately on subsets split by Site.

data <- read.table(...)
sp <- split(data, data$Site)

# My likelihood function is
like <- function(p, ...) {
  for (i in 1:length(Pool)) {
    if (Positivity[i] == 1)
      log.l[i] <- log(1 - (1 - p)^Pool[i])
    else log.l[i] <- Pool[i] * log(1 - p)
  }
  return(sum(log.l))
}

# Then I run
lapply(sp, function(x) optim(0.1, like, control = list(fnscale = -1)))

# But it gives an estimate based on the full data, not separately on
sp[[1]], sp[[2]],... I tried do.call without success. So, your help
would be appreciated.
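
A likely cause, offered here as an assumption rather than a diagnosis, is that the likelihood function never looks at the subset handed over by lapply(): Pool and Positivity are found in the global environment, so every call sees the full data. The sketch below passes the subset's columns explicitly. The names `loglik`, `pool` and `positive` are made up, the data are the Cham_res and UBA_2 rows from the posted table, and method = "Brent" is swapped in because there is only one parameter, bounded between 0 and 1.

```r
# Log-likelihood for pooled samples; the data are explicit arguments
loglik <- function(p, pool, positive) {
  sum(ifelse(positive == 1,
             log(1 - (1 - p)^pool),   # pool tested positive
             pool * log(1 - p)))      # pool tested negative
}

# Two of the posted sites, typed in as a data frame
d <- data.frame(
  Site = rep(c("Cham_res", "UBA_2"), times = c(10, 13)),
  Pool = c(43, 45, 34, 24, 21, 16, 28, 50, 50, 39,
           16, 18, 42, 35, 50, 26, 20, 16, 19, 50, 26, 13, 30),
  Positivity = c(0, 0, 0, 0, 0, 0, 0, 0, 1, 1,
                 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1)
)
sp <- split(d, d$Site)

# One fit per site; extra arguments to optim() are forwarded to loglik()
fits <- lapply(sp, function(s)
  optim(0.1, loglik, pool = s$Pool, positive = s$Positivity,
        method = "Brent", lower = 1e-6, upper = 1 - 1e-6,
        control = list(fnscale = -1)))

est <- sapply(fits, function(f) f$par)
est
```

Sites where every pool is negative (UBA_1 and UBA_4 in the posted data) have a likelihood that keeps increasing as p approaches 0, so their estimate would simply sit at the lower bound.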

 

 

Weidong Gu

 

Department of Medicine
University of Alabama, Birmingham
1900 University Blvd., Birmingham, Alabama 35294
Email: [EMAIL PROTECTED]
PH: (205)-975-9053

 

Site      Pool  Positivity
UBA_1       22  0
UBA_1       50  0
UBA_1       23  0
UBA_1       25  0
UBA_1       35  0
UBA_1       24  0
UBA_1       26  0
Cham_res    43  0
Cham_res    45  0
Cham_res    34  0
Cham_res    24  0
Cham_res    21  0
Cham_res    16  0
Cham_res    28  0
Cham_res    50  0
Cham_res    50  1
Cham_res    39  1
UBA_2       16  0
UBA_2       18  1
UBA_2       42  1
UBA_2       35  1
UBA_2       50  1
UBA_2       26  0
UBA_2       20  0
UBA_2       16  0
UBA_2       19  0
UBA_2       50  0
UBA_2       26  0
UBA_2       13  1
UBA_2       30  1
UBA_3       17  0
UBA_3       20  0
UBA_3       19  0
UBA_3       50  0
UBA_3       24  1
UBA_3       18  1
UBA_3       16  1
UBA_3       14  0
UBA_3       12  0
UBA_3       15  0
UBA_3       11  0
UBA_3       20  1
UBA_3       19  1
UBA_3       31  1
UBA_4       12  0
UBA_4       11  0
UBA_4       12  0
UBA_4       21  0
UBA_4       33  0
UBA_4       15  0
UBA_4       10  0

 




[R] Request for advice on character set conversions (those damn Excel files, again ...)

2008-09-07 Thread Emmanuel Charpentier
Dear list,

I have to read a not-so-small bunch of not-so-small Excel files, which 
seem to have traversed Windows 3.1, Windows 95 and Windows NT versions of 
the thing (with maybe a Mac or two thrown in for good measure...).
The problem is that 1) I need to read strings, and 2) those 
strings may have various encodings. In the same sheet of the same file, 
some cells may be latin1, some UTF-8 and some CP437 (!).

read.xls() allows me to read those things into sets of data frames. My 
problem is to convert the encodings to UTF-8 without clobbering the strings 
that are already (looking like) UTF-8.

I came to the following solution :

foo <- function(d, from = "latin1", to = "UTF-8") {
  # Semi-smart conversion of a dataframe between charsets.
  # Needed to ease use of those [EMAIL PROTECTED] Excel files
  # that have survived the Win3.1 -- Win95 -- NT transition,
  # usually in poor shape..
  conv1 <- function(v, from, to) {
    condconv <- function(v, from, to) {
      cnv <- is.na(iconv(v, to, to))
      v[cnv] <- iconv(v[cnv], from, to)
      return(v)
    }
    if (is.factor(v)) {
      l <- condconv(levels(v), from, to)
      levels(v) <- l
      return(v)
    }
    else if (is.character(v)) return(condconv(v, from, to))
    else return(v)
  }
  for (i in names(d)) d[, i] <- conv1(d[, i], from, to)
  return(d)
}
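
The core test inside that function can be seen in isolation in the small sketch below: a string that is not valid in the target encoding comes back as NA from iconv(), and only those strings are converted. The two example strings are made up (the same word stored once as latin1 bytes and once as UTF-8 bytes), and the NA behaviour is what common iconv implementations do; it may differ on unusual platforms.

```r
# "café" stored as latin1 bytes and as UTF-8 bytes
x <- c("caf\xe9", "caf\xc3\xa9")

# Converting UTF-8 to UTF-8 yields NA for strings that are not valid UTF-8
already_ok <- !is.na(iconv(x, "UTF-8", "UTF-8"))

# Convert only the strings that failed the check
x[!already_ok] <- iconv(x[!already_ok], "latin1", "UTF-8")

# Both entries now hold the same bytes
charToRaw(x[1])
charToRaw(x[2])
```

One limitation worth keeping in mind: every byte sequence is valid latin1, so a CP437 cell that fails the UTF-8 check would be decoded as latin1 without complaint and come out with the wrong characters.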

Any advice for enhancement is welcome...

Sincerely yours,

Emmanuel Charpentier



Re: [R] an error to call 'gee' function in R

2008-09-07 Thread Thomas Lumley


From the gee() help page:

Data are assumed to be sorted so that observations
on a cluster are contiguous rows for all entities in the formula.

-thomas
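
To make that requirement concrete, here is a small sketch on made-up data shaped like the 'simdata' in the question. After reshape() the rows are stacked by time, so no two adjacent rows share an id, which is consistent with the "Maximum cluster size: 1" in the posted output. Sorting by id puts each subject's three rows next to each other. The variable names `runs_before`, `sorted` and `runs_after` are illustrative.

```r
set.seed(1)
# Made-up wide data with the same columns as 'simdata'
simdata <- data.frame(id = 1:5,
                      y.1 = rpois(5, 5), y.2 = rpois(5, 5), y.3 = rpois(5, 5),
                      x = rnorm(5, mean = 1, sd = 1))

# Wide to long: rows come out grouped by time, not by id
newdata <- reshape(simdata, direction = "long",
                   varying = c("y.1", "y.2", "y.3"), v.names = "y",
                   timevar = "time", times = 1:3, idvar = "id")
runs_before <- rle(newdata$id)$lengths   # lengths of runs of equal ids

# Sort so that each id occupies contiguous rows
sorted <- newdata[order(newdata$id, newdata$time), ]
runs_after <- rle(sorted$id)$lengths

max(runs_before)  # longest run of one id before sorting
runs_after        # run lengths after sorting

# The model would then be fitted on the sorted data, for example:
# gee(y ~ x, id = id, data = sorted, family = poisson(link = log),
#     corstr = "exchangeable")
```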


On Sun, 7 Sep 2008, Qinglin Wu wrote:


Dear List:

I found an error when I called the 'gee' function that I can neither solve nor
explain. There are no errors when I use the 'geeglm' function. Both functions fit
the GEE model. The project supervisor recommends that I use the 'gee' function,
but I cannot explain to him why this error happens. Would you help me solve
this problem? I appreciate your help.

In this project I will use 'gee' or 'geeglm', and 'glmer', to fit the
simulated multivariate count responses. I generated the data like this:

Set β0 = β1 = 1, μ0 = 3, and n = 50.
For each 1 ≤ i ≤ n,
Simulate xi from N (1, 1).
Simulate zi0 and zit from
zi0 follows i.d. Poisson (μ0) ,
zit | xi follows i.d. Poisson (μit) , 1 ≤ t ≤ 3,
log (μit) = log(E (zit | xi)) = β0t + xi β1t = 1+xi.
Let yit = zi0 + zit, 1 ≤ t ≤ 3.

So my data frame, let me call it 'simdata', the first 10 rows look like this:
 id y.1 y.2 y.3   x
  1   3   5   6 -0.06588626
  2   6   7   6 -0.08265981
  3   6   8  13  0.58307719
  4  22  21  28  2.21099940
  5   5  12   8  1.06299869
  6   8  21  24  1.47615784
  7  11   8   9  0.83748390
  8  16  15  16  1.67011313
  9   9   7   7 -0.14181264
 10  31  37  40  2.56751453


This is the longitudinal data.  I will change its shape to analyze it. 
The changed 'newdata' looks like this:



 id           x time  y
  1 -0.06588626    1  3
  2 -0.08265981    1  6
  3  0.58307719    1  6
  4  2.21099940    1 22
  5  1.06299869    1  5
  6  1.47615784    1  8
  7  0.83748390    1 11
  8  1.67011313    1 16
  9 -0.14181264    1  9
 10  2.56751453    1 31
...
  1 -0.06588626    2  5
  2 -0.08265981    2  7
  3  0.58307719    2  8
  4  2.21099940    2 21
  5  1.06299869    2 12
  6  1.47615784    2 21
  7  0.83748390    2  8
  8  1.67011313    2 15
  9 -0.14181264    2  7
 10  2.56751453    2 37
...
  1 -0.06588626    3  6
  2 -0.08265981    3  6
  3  0.58307719    3 13
  4  2.21099940    3 28
  5  1.06299869    3  8
  6  1.47615784    3 24
  7  0.83748390    3  9
  8  1.67011313    3 16
  9 -0.14181264    3  7
 10  2.56751453    3 40
...


My data 'y' are generated from x, so the observations are not independent. What
does the argument 'corstr' mean, as defined in the function? I tried all the
choices, but the error was still there. Here is the call I used in my
program:

mfit1 <- gee(y~x, data=newdata, family=poisson(link="log"), id=id, corstr="exchangeable")

GEE:  GENERALIZED LINEAR MODELS FOR DEPENDENT DATA
gee S-function, version 4.13 modified 98/01/27 (1998)

Model:
Link:  Logarithm
Variance to Mean Relation: Poisson
Correlation Structure: Exchangeable

Call:
gee(formula = y ~ x, id = id, data = newdata, family = poisson(link = log),
   corstr = exchangeable)

Number of observations :  150

Maximum cluster size   :  1

Coefficients:
(Intercept)   x
 1.5849653   0.7937203

Estimated Scale Parameter:  1.162505
Number of Iterations:  1

Working Correlation[1:4,1:4]
Error in print(x$working.correlation[1:4, 1:4], digits = digits) :
 subscript out of bounds



Is this kind of data not suitable for the function 'gee'? When I tested
these two functions using the R data set 'warpbreaks', they worked perfectly,
although some of the returned objects were different. I used the following:
1.  summary(gee(breaks ~ tension, id=wool, data=warpbreaks, corstr="exchangeable"))
2.  summary(geeglm(breaks ~ tension, id=wool, data=warpbreaks, corstr="exchangeable"))

The first one is from the examples in ?gee.

I will attach the relevant part of my program as an .R file, which you can
execute in R. Thanks a lot.


I appreciate your help.


Best regards.

Sincerely,
Cynthia Wu



Thomas Lumley   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]   University of Washington, Seattle


Re: [R] creating a vignette

2008-09-07 Thread Duncan Murdoch

On 07/09/2008 4:51 PM, Edna Bell wrote:

Dear R Gurus:

How would I create a vignette, please?

Why would a vignette be better than examples, please?


You can create a vignette any way you like, but the most common way is 
with Sweave:  write a document that is mainly LaTeX, but with R code 
inclusions that are executed and displayed.
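
A minimal sketch of the Sweave route described above, written as R code that creates a tiny .Rnw source in a temporary directory and weaves it into a .tex file. The file name minimal.Rnw, the chunk label and the index entry text are made up for illustration; in a real package the .Rnw file would live in the package's vignette directory rather than be generated like this.

```r
# A tiny Sweave document: LaTeX text plus one R code chunk
rnw <- c(
  "\\documentclass{article}",
  "%\\VignetteIndexEntry{A minimal example}",
  "\\begin{document}",
  "The mean of the integers 1 to 10 is computed below.",
  "<<mean-chunk>>=",
  "mean(1:10)",
  "@",
  "\\end{document}"
)

rnw_file <- file.path(tempdir(), "minimal.Rnw")
writeLines(rnw, rnw_file)

# Sweave() writes its output into the current working directory
old_wd <- setwd(tempdir())
utils::Sweave(rnw_file)
setwd(old_wd)

# The woven LaTeX file contains both the R code and its printed result
tex_file <- file.path(tempdir(), "minimal.tex")
tex_lines <- readLines(tex_file)
```

The resulting .tex file still has to be run through LaTeX to obtain a PDF.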


The main value of a vignette is that it documents more:  instead of 
documenting one or a few functions, it can document a whole package, or 
a whole type of analysis using several different packages.


Duncan Murdoch



[R] Help parametric boot

2008-09-07 Thread ctu

Hi R users
Is there any example of a nonlinear parametric bootstrap? I googled it but
couldn't find one. I am interested in the parameter estimators of a
nonlinear model, but I really don't know how to code the
ran.gen function (data set from ?nls).

fm1 <- nls(weight ~ Asym/(1+exp((xmid-Time)/scal)), data = ChickWeight,
           start=c(Asym=337, xmid=16, scal=8))

fm1.fun <- function(data){coef(update(fm1,data=data))}

ran.sim <- function(data,mle){out<-rnorm(n=nrow(data,mle));out}

fm1.boot <- boot(ChickWeight, statistic = fm1.fun, R=99, sim="parametric",
                 ran.gen=ran.sim, mle=coef(fm1))
Error in nrow(data, mle) :
  unused argument(s) (c(337.605336871528, 16.0688379710354, 8.00747460385483))
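
The error itself comes from the misplaced parenthesis in nrow(data, mle). Beyond that, a ran.gen function is expected to return a simulated data set of the same form as the original, not a bare vector of normal draws. The following is a hedged sketch of one way to do that, assuming additive normal errors with constant variance (an assumption, not something stated in the post). The bootstrap loop is written in base R so it runs without extra packages; the commented line shows how the same two functions would be handed to boot(). The names cw, f and mle are illustrative.

```r
# Plain data frame with just the two columns the model uses
cw <- data.frame(weight = ChickWeight$weight, Time = ChickWeight$Time)

f <- weight ~ Asym / (1 + exp((xmid - Time) / scal))
fm1 <- nls(f, data = cw, start = c(Asym = 337, xmid = 16, scal = 8))

# Everything the generator needs: fitted coefficients and residual SD
mle <- list(coef = coef(fm1), sigma = summary(fm1)$sigma)

# ran.gen-style generator: same shape as 'data', response re-simulated
# from the fitted curve plus normal noise
ran.sim <- function(data, mle) {
  p  <- as.list(mle$coef)
  mu <- p$Asym / (1 + exp((p$xmid - data$Time) / p$scal))
  data$weight <- mu + rnorm(nrow(data), mean = 0, sd = mle$sigma)
  data
}

# Statistic: refit on the simulated data; NA if a refit fails to converge
fm1.fun <- function(data) {
  tryCatch(coef(nls(f, data = data, start = coef(fm1))),
           error = function(e) rep(NA_real_, 3))
}

set.seed(1)
R <- 50
sims <- t(replicate(R, fm1.fun(ran.sim(cw, mle))))
colnames(sims) <- names(coef(fm1))

# Bootstrap standard errors of the three parameters
boot_se <- apply(sims, 2, sd, na.rm = TRUE)

# With the boot package the equivalent call would be:
# boot(cw, statistic = fm1.fun, R = 99, sim = "parametric",
#      ran.gen = ran.sim, mle = mle)
```

Whether constant-variance normal errors are reasonable for these growth data is a separate modelling question.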


Any suggestion would be very helpful.
many thanks in advance
Chunhao



Re: [R] an error to call 'gee' function in R

2008-09-07 Thread Qinglin Wu
Dear Thomas Lumley:

Following your suggestion, I sorted 'newdata' by id. When I print
the results, the error is still there. I also tried sorting 'newdata' by y and
x, although I don't think that makes sense, and the error was still there. Here
is the relevant part of the code:

data1=newdata[order(newdata$id),]
print("gee model:")
mfit1 <- gee(y~x, data=data1, family=poisson(link="log"), id=id, corstr="exchangeable")
print(mfit1)


the error is still in the results:
 GEE:  GENERALIZED LINEAR MODELS FOR DEPENDENT DATA
 gee S-function, version 4.13 modified 98/01/27 (1998) 

Model:
 Link:  Logarithm 
 Variance to Mean Relation: Poisson 
 Correlation Structure: Exchangeable 

Call:
gee(formula = y ~ x, id = id, data = data1, family = poisson(link = log), 
corstr = exchangeable)

Number of observations :  150 

Maximum cluster size   :  3 


Coefficients:
(Intercept)   x 
 1.6389  0.7619 

Estimated Scale Parameter:  1.012
Number of Iterations:  1

Working Correlation[1:4,1:4]
Error in print(x$working.correlation[1:4, 1:4], digits = digits) : 
  subscript out of bounds
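
A possible reading of this output, offered as a sketch rather than a diagnosis: the model now reports a maximum cluster size of 3, so the working correlation matrix would be 3 x 3, while the print call shown in the error message indexes rows and columns 1:4. The base R snippet below reproduces that kind of failure on a stand-in matrix; the value 0.4 is arbitrary and the name wc is illustrative.

```r
# Stand-in for a 3 x 3 exchangeable working correlation matrix
wc <- matrix(0.4, nrow = 3, ncol = 3)
diag(wc) <- 1

# Asking for a 4 x 4 corner of a 3 x 3 matrix fails
msg <- tryCatch({
  wc[1:4, 1:4]
  "no error"
}, error = function(e) conditionMessage(e))

# Indexing only up to the actual dimension works
k <- min(4, nrow(wc))
corner <- wc[1:k, 1:k]
```

If that reading is right, the fit itself may be fine and only the printing step fails; inspecting individual components of the fitted object, such as mfit1$working.correlation (the component named in the error message), would be one way to check.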


I don't know why this happens. Thanks for your help.

I have again attached the relevant part of my program, which you can run in R.
Thanks again.

Best regards.

Sincerely,
Cynthia Wu




--- On Sun, 9/7/08, Thomas Lumley [EMAIL PROTECTED] wrote:

 From: Thomas Lumley [EMAIL PROTECTED]
 Subject: Re: [R] an error to call 'gee' function in R
 To: Qinglin Wu [EMAIL PROTECTED]
 Cc: r-help@r-project.org
 Date: Sunday, September 7, 2008, 6:41 PM
 From the gee() help page:
 
 Data are assumed to be sorted so that observations
 on a cluster are contiguous rows for all entities in the
 formula.
 
   -thomas
 
 
 On Sun, 7 Sep 2008, Qinglin Wu wrote:
 
  Dear List:
 
  I found an error when I called the 'gee'
 function.  I cannot solve and explain it.  There are no
 errors when I used the 'geeglm' function.  Both
 functions fit the gee model.  The project supervisor
 recommends me to use the 'gee' function.  But I
 cannot explain to him why this error happens.   Would you
 help me solve this problem?  I appreciate your help.
 
  In this project I will use the 'gee' or
 'geeglm' and 'glmer' to fit the simulated
 multivariate count responses.  I generated the data like
 this:
 
  Set β0 = β1 = 1, μ0 = 3, and n = 50.
  For each 1 ≤ i ≤ n,
  Simulate xi from N (1, 1).
  Simulate zi0 and zit from
  zi0 follows i.d. Poisson (μ0) ,
  zit | xi follows i.d. Poisson (μit) , 1 ≤ t ≤ 3,
  log (μit) = log(E (zit | xi)) = β0t + xi β1t = 1+xi.
  Let yit = zi0 + zit, 1 ≤ t ≤ 3.
 
  So my data frame, let me call it 'simdata',
 the first 10 rows look like this:
   id y.1 y.2 y.3   x
1   3   5   6 -0.06588626
2   6   7   6 -0.08265981
3   6   8  13  0.58307719
4  22  21  28  2.21099940
5   5  12   8  1.06299869
6   8  21  24  1.47615784
7  11   8   9  0.83748390
8  16  15  16  1.67011313
9   9   7   7 -0.14181264
   10  31  37  40  2.56751453
 
 
  This is the longitudinal data.  I will change its
 shape to analyze it. 
  The changed 'newdata' looks like this:
 
   id           x time  y
    1 -0.06588626    1  3
    2 -0.08265981    1  6
    3  0.58307719    1  6
    4  2.21099940    1 22
    5  1.06299869    1  5
    6  1.47615784    1  8
    7  0.83748390    1 11
    8  1.67011313    1 16
    9 -0.14181264    1  9
   10  2.56751453    1 31
  ...
    1 -0.06588626    2  5
    2 -0.08265981    2  7
    3  0.58307719    2  8
    4  2.21099940    2 21
    5  1.06299869    2 12
    6  1.47615784    2 21
    7  0.83748390    2  8
    8  1.67011313    2 15
    9 -0.14181264    2  7
   10  2.56751453    2 37
  ...
    1 -0.06588626    3  6
    2 -0.08265981    3  6
    3  0.58307719    3 13
    4  2.21099940    3 28
    5  1.06299869    3  8
    6  1.47615784    3 24
    7  0.83748390    3  9
    8  1.67011313    3 16
    9 -0.14181264    3  7
   10  2.56751453    3 40
  ...
 
 
  My data 'y' comes from x.  So their
 correlations are not independent.  What does the argument
 'corstr' mean it defined in the function.  I tried
 all choices.  But the error was still there.  Here was the
 function I used in my programming:
 
  mfit1 <- gee(y~x, data=newdata, family=poisson(link="log"), id=id, corstr="exchangeable")
 
  GEE:  GENERALIZED LINEAR MODELS FOR DEPENDENT DATA
  gee S-function, version 4.13 modified 98/01/27 (1998)
 
  Model:
  Link:  Logarithm
  Variance to Mean Relation: Poisson
  Correlation Structure: Exchangeable
 
  Call:
  gee(formula = y ~ x, id = id, data = newdata, family =
 poisson(link = log),
 corstr = exchangeable)
 
  Number of observations :  150
 
  Maximum cluster size   :  1
 
  Coefficients:
  (Intercept)   x
   1.5849653   0.7937203
 
  Estimated Scale Parameter:  1.162505
  Number of Iterations:  1
 
  Working Correlation[1:4,1:4]
  Error in 

Re: [R] extracting max row from data matrix

2008-09-07 Thread Jorge Ivan Velez
Dear Srini,
Here is one way:

# Data set
x=read.table(textConnection("fruit weight
1  apple    1.3
2  apple    1.5
3  apple    1.6
4 orange    1.4
5 orange    1.6"),header=TRUE)

tapply(x$weight,x$fruit,max)
 apple orange
   1.6    1.6

Try also

x[cumsum(tapply(x$weight,x$fruit,which.max)),]
   fruit weight
3  apple    1.6
5 orange    1.6
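
Note that the cumsum() indexing relies on the heaviest item being the last row of each group except the final one, as it is in this example. A sketch of a variant that does not depend on row order is given below; the helper name max_rows and the reordered copy x2 are made up for illustration.

```r
x <- data.frame(fruit  = c("apple", "apple", "apple", "orange", "orange"),
                weight = c(1.3, 1.5, 1.6, 1.4, 1.6))

# For each fruit, pick the row number (within the whole data frame)
# of the heaviest item
max_rows <- function(d) {
  sapply(split(seq_len(nrow(d)), d$fruit),
         function(i) i[which.max(d$weight[i])])
}

idx <- max_rows(x)
x[idx, ]

# Same data with the rows shuffled, so the maxima are no longer last
x2 <- x[c(3, 1, 2, 5, 4), ]
rownames(x2) <- NULL
idx2 <- max_rows(x2)
x2[idx2, ]
```

On the shuffled copy, cumsum(tapply(x2$weight, x2$fruit, which.max)) would give rows 1 and 2, whereas the maxima sit in rows 1 and 4.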


HTH,

Jorge



On Sun, Sep 7, 2008 at 10:24 PM, Srinivas Iyyer
[EMAIL PROTECTED]wrote:

 dear group,
 i have a data matrix with some replicate items with different values. I
 want to extract the row with the max value.

 for example:
  x
    fruit weight
  1  apple    1.3
  2  apple    1.5
  3  apple    1.6
  4 orange    1.4
  5 orange    1.6


 x is a data frame.
 I want to extract, for each unique fruit, the row that has the max weight.

 that is:

 3  apple    1.6
 5 orange    1.6

 I want to be able to use apply functions. Could someone lend some help
 please.

 Thanks
 srini





Re: [R] Averaging 'blocks' of data

2008-09-07 Thread Robert A LaBudde
I'm not sure I exactly understand your problem, but if you are 
looking for a recursive algorithm for calculating the average by 
addition of one record only at a time, consider:


y[k] = y[k-1] + (x[k] - y[k-1])/k,  where y[0] = 0, k = 1, 2, ...

At each stage, y[k] = (x[1]+...+x[k])/k.
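
The recursion above can be sketched in a few lines of R. The function name running_mean is made up for illustration, and the comparison with cumsum() is only there as a check.

```r
# Running mean via y[k] = y[k-1] + (x[k] - y[k-1]) / k, starting from y[0] = 0
running_mean <- function(x) {
  y <- numeric(length(x))
  prev <- 0
  for (k in seq_along(x)) {
    prev <- prev + (x[k] - prev) / k
    y[k] <- prev
  }
  y
}

x <- c(2, 4, 6, 8)
rm_x <- running_mean(x)

# Direct computation of (x[1] + ... + x[k]) / k for comparison
direct <- cumsum(x) / seq_along(x)
```

Only the previous mean and the current record need to be held in memory, which is the point of the update when the data arrive one record at a time.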


At 04:46 PM 9/7/2008, Steve Murray wrote:

Gabor - thanks for your suggestion... I had checked the previous 
post, but I found (as a new user of R) this approach to be too 
complicated and I had problems gaining the correct output values. If 
there is a simpler way of doing this, then please feel free to let me know.


Dylan - thanks, your approach is a good start. In answer to your 
questions, my data are 43200 columns and 16800 rows as a data frame 
- I will probably have to read the dataset in segments though, as it 
won't fit into the memory!  I've been able to follow your example - 
how would I be able to apply this technique for finding the average 
of each 60 x 60 block?
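
For the block averages themselves, one possible base R sketch is shown below. It assumes the data are held as a numeric matrix (a data frame would first need as.matrix()), that both dimensions are exact multiples of the block size, and that there are no missing values. The function name block_mean is made up, and the 4 x 4 matrix with 2 x 2 blocks stands in for the real 60 x 60 case.

```r
# Mean of each b x b block of a matrix whose dimensions are multiples of b
block_mean <- function(m, b) {
  row_grp <- rep(seq_len(nrow(m) / b), each = b)
  col_grp <- rep(seq_len(ncol(m) / b), each = b)
  s <- rowsum(m, row_grp)           # add up rows within each row block
  s <- t(rowsum(t(s), col_grp))     # then columns within each column block
  s / (b * b)
}

m <- matrix(1:16, nrow = 4, ncol = 4)
bm <- block_mean(m, 2)
```

If the file is read in segments, reading a multiple of 60 rows at a time would let each segment be averaged independently with the same function.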


Any other suggestions are of course welcome!

Many thanks again,

Steve




Robert A. LaBudde, PhD, PAS, Dpl. ACAFS  e-mail: [EMAIL PROTECTED]
Least Cost Formulations, Ltd.URL: http://lcfltd.com/
824 Timberlake Drive Tel: 757-467-0954
Virginia Beach, VA 23464-3239Fax: 757-467-2947

Vere scire est per causas scire



Re: [R] cohen's kappa

2008-09-07 Thread Weiwei Shi
one more question,

The third value (kappa (2*PA-1)) is adjusted for prevalence using the
method proposed by Byrt, Bishop and Carlin (1993)  --- from ?cohen.kappa

What does the prevalence refer to?
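
For orientation, the quantities in the quoted help text can be computed by hand for two raters and two categories. This is a sketch with a made-up agreement table; PA is the observed proportion of agreement and 2*PA - 1 is the adjusted value mentioned in the quote. Prevalence here presumably refers to how common each category is among the rated objects, which affects the chance-agreement term PE but not 2*PA - 1.

```r
# Hypothetical 2 x 2 agreement table for two raters
tab <- matrix(c(40, 5, 5, 50), nrow = 2,
              dimnames = list(rater1 = c("yes", "no"),
                              rater2 = c("yes", "no")))
n <- sum(tab)

PA <- sum(diag(tab)) / n                        # observed agreement
PE <- sum(rowSums(tab) * colSums(tab)) / n^2    # agreement expected by chance

kappa_cohen <- (PA - PE) / (1 - PE)             # Cohen's kappa
kappa_adj   <- 2 * PA - 1                       # the "2*PA-1" value
```

With these counts PA is 0.9, PE is 0.505, Cohen's kappa is about 0.798 and the adjusted value is 0.8.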
On Sun, Sep 7, 2008 at 10:43 PM, Weiwei Shi [EMAIL PROTECTED] wrote:

 Dear all,

 I have a question on Cohen's kappa:

 Assume I have two datasets, one has 500 objects, 10 methods and the other,
 1000 different objects, 20 different methods. Could I compare between the
 two datasets to conclude the 10 methods are more concordant than the 20
 ones by looking at some output, for example, cohen.kappa{concord} ?

 One more, could anyone explain in brief, what's the difference between
 kappa(Cohen) and kappa(Siegel)?

 Thanks,


 --
 Weiwei Shi, Ph.D
 Research Scientist
 GeneGO, Inc.

 Did you always know?
 No, I did not. But I believed...
 ---Matrix III




-- 
Weiwei Shi, Ph.D
Research Scientist
GeneGO, Inc.

Did you always know?
No, I did not. But I believed...
---Matrix III



Re: [R] Regression with nominal data

2008-09-07 Thread paulandpen
Soren,

It sounds like you are new to R so I will refer you to some packages that I 
think some people would find more user friendly as beginners.  

Zelig is excellent.  You could run a series of logistic regressions coding your 
dependent variables as follows (a versus b, a versus c, b versus c)  
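
The pairwise approach can also be sketched with base R's glm(), without any extra package. The data below are simulated purely for illustration (the coefficients 0.8, -0.8 and 0.5 are arbitrary), and the names d, pairs_ab and fits are made up. Fitting all three categories jointly, for example with multinom() from the nnet package, would be an alternative.

```r
set.seed(42)
n <- 300
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))

# Simulated 3-category outcome whose probabilities depend on the x's
eta <- cbind(a = 0, b = 0.8 * d$x1, c = -0.8 * d$x2 + 0.5 * d$x3)
p <- exp(eta) / rowSums(exp(eta))
d$y <- factor(apply(p, 1, function(pr) sample(c("a", "b", "c"), 1, prob = pr)))

# One logistic regression per pair of categories: a vs b, a vs c, b vs c
pairs_ab <- combn(levels(d$y), 2, simplify = FALSE)
fits <- lapply(pairs_ab, function(pr) {
  d_pair <- droplevels(d[d$y %in% pr, ])
  glm(y ~ x1 + x2 + x3, family = binomial, data = d_pair)
})
names(fits) <- sapply(pairs_ab, paste, collapse = " vs ")

coefs <- lapply(fits, coef)
```

In each fit the first category of the pair is the reference level, so a positive coefficient raises the odds of the second category.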

See the website below

http://gking.harvard.edu/zelig/docs/index.html

Alternatively you could try Rattle

See the website below

http://rattle.togaware.com/rattle-features.html

Or you could try Rcmdr

HTH Paul


 [EMAIL PROTECTED] wrote:
 
 Hi,
 
 y is nominal (3 categories), x1 to 3 is scale. What I want is a  
 regression, showing the probability to fall in one of the three  
 categories of y according to the x. How can I perform such a  
 regression in R?
 
 Thanks for your help
 
 Sören
 



[R] Poisson Distribution - Chi Square Test for Goodness of Fit

2008-09-07 Thread saggak




Dear R-help,

Chi Square Test for Goodness of Fit

Problem faced:

I have discrete data as given below (R script):

No_of_Frauds <- c(1,1,1,1,1,1,1,1,1,2,1,1,1,1,1,1,2,1,2,2,2,1,1,2,1,1,1,1,4,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,5,1,2,1,1,1,1,1,1,1,3,2,1,1,1,2,1,1,2,1,1,1,1,1,2,1,3,1,2,1,2,14,2,1,1,38,3,3,2,44,1,4,1,4,1,2,2,1,3)

I am trying to fit a Poisson distribution to this data using R.

When I run this script in the R console, I get a Chi-Square statistic as
high as 6.95753e+37.

When I did the same calculations in Excel, I got a Chi-Square statistic
of 138.34.

Although it is clear that the sample data don't follow a Poisson
distribution, and I will have to look for another discrete distribution, my
problem is the HIGH value of the Chi-Square test statistic. When I analyzed
further, I understood the problem.

 

(A) By convention, if the expected frequency of a class is less than 5, we
pool such classes to form a new class whose expected frequency is greater
than 5, and adjust the observed frequencies accordingly.

    X        Oi     Ei    ((Oi - Ei)^2)/Ei
    0         0     10         9.96
    1        72     23       103.79
    2        17     27         3.54
    3         5     21        11.85
    4         3     12         6.71
    5         4      9         2.51
    Total   101    101       138.34

When I apply this logic in Excel, I get a reasonable result (i.e. 138.34);
however, in Excel too, if I don't apply this logic, the Chi-Square test
statistic is as high as 4.70043E+37.

My question is: how do I modify my R script so that the logic mentioned in
(A), i.e. pooling classes and adjusting the expected (and accordingly
observed) frequencies, is applied so that the expected frequency becomes
greater than 5 for every class, thereby giving a reasonable value of the
Chi-Square test statistic?
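
A minimal sketch of the pooling in (A), assuming the six classes used in the table (0, 1, 2, 3, 4 and "5 or more"). The data vector is rebuilt from the frequencies in the post (72 ones, 17 twos, 5 threes, 3 fours and the single values 5, 14, 38 and 44) rather than typed out again; the names obs, p, expct and chi_sq are illustrative. The degrees of freedom subtract one extra for the estimated Poisson mean, which is a common convention but still an assumption here.

```r
# Same 101 observations as No_of_Frauds, rebuilt from their frequencies
x <- rep(c(1, 2, 3, 4, 5, 14, 38, 44),
         times = c(72, 17, 5, 3, 1, 1, 1, 1))
N <- length(x)
lambda <- mean(x)

# Classes 0..4 individually, everything from 5 upwards pooled into one class
k <- 0:4
obs <- c(as.vector(table(factor(x, levels = k))), sum(x >= 5))
p   <- c(dpois(k, lambda), ppois(4, lambda, lower.tail = FALSE))
expct <- N * p

# Pearson chi-square statistic on the pooled classes
chi_sq <- sum((obs - expct)^2 / expct)

# classes - 1 - (number of estimated parameters)
df <- length(obs) - 1 - 1
p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)
```

With this pooling every expected frequency is above 5 and the statistic comes out close to the 138.34 obtained in Excel. For other data the cut-off for the pooled upper class would have to be chosen by inspecting the expected frequencies first.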

 

My R script is given below:

# R SCRIPT for Fitting Poisson Distribution

No_of_Frauds <- c(1,1,1,1,1,1,1,1,1,2,1,1,1,1,1,1,2,1,2,2,2,1,1,2,1,1,1,1,4,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,5,1,2,1,1,1,1,1,1,1,3,2,1,1,1,2,1,1,2,1,1,1,1,1,2,1,3,1,2,1,2,14,2,1,1,38,3,3,2,44,1,4,1,4,1,2,2,1,3)

N       <- length(No_of_Frauds)
Average <- mean(No_of_Frauds)
Lambda  <- Average
i       <- c(0:(N-1))
pmf     <- dpois(i, Lambda, log = FALSE)

# Ho: The data follow Poisson Distribution Vs H1: Not Ho

# observed frequencies (Oi)

variable.cnts     <- table(No_of_Frauds)
variable.cnts.prs <- dpois(as.numeric(names(variable.cnts)), Lambda)
variable.cnts     <- c(variable.cnts, 0)

variable.cnts.prs <- c(variable.cnts.prs, 1-sum(variable.cnts.prs))
tst               <- chisq.test(variable.cnts, p=variable.cnts.prs)

chi_squared <- as.numeric(unclass(tst)$statistic)
p_value     <- as.numeric(unclass(tst)$p.value)
df          <- tst[2]$parameter

cv1 <- qchisq(p=.01, df=tst[2]$parameter, lower.tail = FALSE, log.p = FALSE)
cv2 <- qchisq(p=.05, df=tst[2]$parameter, lower.tail = FALSE, log.p = FALSE)
cv3 <- qchisq(p=.1, df=tst[2]$parameter, lower.tail = FALSE, log.p = FALSE)

# Expected value
# variable.cnts.prs * sum(variable.cnts)

# if tst > cv reject Ho at alpha level of significance (los)

if(chi_squared > cv1)
Conclusion1 <- 'Sample does not come from the postulated probability distribution at 1% los' else
Conclusion1 <- 'Sample comes from postulated prob. distribution at 1% los'

if(chi_squared > cv2)
Conclusion2 <- 'Sample does not come from the postulated probability distribution at 5% los' else
Conclusion2 <- 'Sample comes from postulated prob. distribution at 5% los'

if(chi_squared > cv3)
Conclusion3 <- 'Sample does not come from the postulated probability distribution at 10% los' else
Conclusion3 <- 'Sample comes from postulated prob. distribution at 10% los'