[R] Restricted VAR parameter estimation

2007-08-18 Thread Megh Dal
I have a VAR model with five macro-economic variables, y[1], y[2], y[3], y[4],
y[5]. They are related to each other in the following manner.

y[1,t] = alpha[1,0]
       + beta[1,1,1]*y[1,t-1] + ... + beta[1,1,12]*y[1,t-12]
       + beta[1,2,1]*y[2,t-1] + ... + beta[1,2,12]*y[2,t-12] + e[1,t]

y[2,t] = alpha[2,0]
       + beta[2,2,1]*y[2,t-1] + ... + beta[2,2,12]*y[2,t-12] + e[2,t]

y[3,t] = alpha[3,0]
       + beta[3,1,1]*y[1,t-1] + ... + beta[3,1,12]*y[1,t-12]
       + beta[3,2,1]*y[2,t-1] + ... + beta[3,2,12]*y[2,t-12]
       + beta[3,3,1]*y[3,t-1] + ... + beta[3,3,12]*y[3,t-12]
       + beta[3,4,1]*y[4,t-1] + ... + beta[3,4,12]*y[4,t-12] + e[3,t]

y[4,t] = alpha[4,0]
       + beta[4,3,1]*y[3,t-1] + ... + beta[4,3,12]*y[3,t-12]
       + beta[4,4,1]*y[4,t-1] + ... + beta[4,4,12]*y[4,t-12] + e[4,t]

y[5,t] = alpha[5,0]
       + beta[5,3,1]*y[3,t-1] + ... + beta[5,3,12]*y[3,t-12]
       + beta[5,5,1]*y[5,t-1] + ... + beta[5,5,12]*y[5,t-12] + e[5,t]

All variables are stationary.

Now I want to estimate the coefficients under a VAR(12) framework. Is it
mathematically correct to estimate the coefficients of each equation
separately with simple OLS? Or how can I use R (the mAr.est() function) to
estimate them?
   
  Regards,
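
A minimal sketch of the equation-by-equation approach asked about above, assuming a two-variable system with 2 lags instead of the five variables and 12 lags in the post. The simulated data and the helper lagmat() are illustrative assumptions, not part of the original question. For a stationary VAR, OLS on each equation is consistent; when the equations have different regressors, as here, and the errors are correlated across equations, a system estimator (SUR/GLS) is generally more efficient, so the OLS fits are best read as a starting point.

```r
set.seed(42)
n <- 300
p <- 2                                  # 2 lags here; the post uses 12

# Simulate a small restricted system (assumed coefficients, for illustration)
y <- matrix(0, n, 2, dimnames = list(NULL, c("y1", "y2")))
for (tt in (p + 1):n) {
  # y2 depends only on its own lags
  y[tt, "y2"] <- 0.5 * y[tt - 1, "y2"] - 0.2 * y[tt - 2, "y2"] + rnorm(1)
  # y1 depends on lags of y1 and of y2
  y[tt, "y1"] <- 0.3 * y[tt - 1, "y1"] + 0.1 * y[tt - 2, "y1"] +
                 0.4 * y[tt - 1, "y2"] + rnorm(1)
}

# embed() puts x[t] in column 1 and x[t-1], ..., x[t-p] in the next columns
lagmat <- function(x, p, name) {
  e <- embed(x, p + 1)
  colnames(e) <- c(name, paste0(name, ".l", seq_len(p)))
  as.data.frame(e)
}
d <- cbind(lagmat(y[, "y1"], p, "y1"), lagmat(y[, "y2"], p, "y2"))

# One lm() per equation, each with only the regressors that equation allows
fit1 <- lm(y1 ~ y1.l1 + y1.l2 + y2.l1 + y2.l2, data = d)
fit2 <- lm(y2 ~ y2.l1 + y2.l2, data = d)
print(coef(fit1))
print(coef(fit2))
```

The zero restrictions are imposed simply by leaving the excluded lags out of each formula.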


   

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] doubt about string comparison

2007-08-18 Thread ramakanth reddy
I have two large arrays of strings, array1 with 18 names and array2 with
24000 names, and I want to find the names common to both.

For example, my arrays look like this:

Array1   Array2

GAP4     HIST1B-histamine
MFG12    SNRPD-signal induced...
CFH1A    RNF-ribose nucleic...

Each entry in array2 carries a description after the abbreviation. How can
I remove the description before using the intersect command to match the
common names in array1 and array2? The abbreviation and the description
are separated by a hyphen (-).

I tried using MATLAB, but it did not work because of the large file size.
Can you suggest a way to overcome this problem?

Thank You
rk






[R] Regulatory Compliance and Validation Issues

2007-08-18 Thread Cody Hamilton
As I work as a biostatistician in the medical devices industry, I have been 
very happy to take part in several conversations on this list regarding the use 
of R in a regulated environment.  It was with great interest, therefore, that I 
read the new guidance document for the use of R in regulated clinical trial 
environments now available on the R website.  The purpose of the document is to 
"provide a reasonable consensus position on the part of the R Foundation for 
Statistical Computing ... relative to the use of R within ... regulated 
environments and to provide a common foundation for end users to meet their own 
internal standard operating procedures, documentation requirements and 
regulatory obligations."  I believe this work will be a gold mine for those who 
are seeking to convince their organizations that R is a viable everyday 
statistical tool for use in a regulated environment.  I would like to offer 
personal thanks to (in alphabetical order) Frank Harrell, Tony Rossini, and 
Marc Schwartz for all their efforts on this document.

After an initial discussion introducing the relevant regulatory documents, 
section 2 presents the scope of the R guidance document.  I was grateful for 
this section as it spells out for the readers (not all of whom may be 
statisticians or R users) which packages are considered by the document (those 
that bear the copyright of the R foundation).  This is important as it gives 
software quality departments a limit on which packages are under consideration 
- I think some software quality people fear that approving R for use implies 
'opening the flood gates' to any user-created package that might be available.  
I believe that additional packages could perhaps be validated separately if a 
company so chooses (there are several I would be loath to part with), but the 
packages considered under the guidance document are clearly defined as those 
bearing the R foundation copyright.

Sections 3 and 4 introduce both the R foundation and the R software for those 
who are unfamiliar with both.  Again, I am glad these materials are included as 
they may potentially increase the comfort level amongst those who are 
suspicious of open source software.  The document presents the R foundation as 
the stable organization that it is and provides a good overview of the purpose 
of the R software - this latter item will be particularly useful to me as the 
first questions I receive from software quality are 'what is it for' and 'why 
do you need it?'

Sections 5-7 contain the heart of the document (in my opinion).  They cover 
qualification/validation of systems for 21 CFR 11 compliance, software 
development life cycle, and 21 CFR 11 compliance functionality.  These sections 
deserve special attention, and I for one will need some time to fully digest 
all the information in these sections.

I have a few specific comments/questions that I would like to present to the R 
help list.

1. The document in no way absolves users from the usual IQ/OQ/PQ required for 
any software to be used in a regulated setting.  I am very glad that this point 
is clearly made in the document (and I believe it was made by presenters at the 
useR meeting as well).

2. While the document's scope is limited to base R plus recommended packages, I 
believe most companies will need access to functionalities provided by packages 
not included in the base or recommended packages.  (For example, I don't think 
I could survive without the sas.get() function from the Hmisc package.)  How 
can a company address the issues covered in the document for packages outside 
its scope?  For example, what if a package's author does not maintain 
historical archive versions of the package?  What if the author no longer 
maintains the package?  Is the solution to add more packages to the recommended 
list (I'm fairly certain that this would not be a simple process) or is there 
another solution?

3. At least at my company, each new version must undergo basically the same 
IQ/OQ/PQ as the first installation.  As new versions of R seem to come at least 
once a year, the ongoing validation effort would be painful if the most 
up-to-date version of R is to be maintained within the company.  Is there any 
danger in delaying the updates (say updating R within the company every two 
years or so)?

As always, I am speaking for myself and not necessarily for Edwards 
Lifesciences.

Regards,
   -Cody Hamilton



[R] Any parser generator / code assistance for R?

2007-08-18 Thread Ali -
Hi,

Is there any parser generator like www.antlr.org?  Moreover, how does simple 
code assistance work currently in R? By 'simple code assistance' I meant 
things like:

Object$M --> Object$Method



Re: [R] multiple colors within same line of text

2007-08-18 Thread Jim Lemon
Andrew Yee wrote:
> Hi, I'm interested in using mtext(), but with the option of having multiple
> colors in the same line of text.
> 
> For example, creating a line of text where:
> 
> Red is red and blue is blue
> 
> How do you create a text argument that lets you do this within mtext()?
> 
You can do something like this with "text" and then use xpd=TRUE to use 
it outside the plot. I think it would be more fiddly trying to use "mtext".

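# Draw the strings in txt end to end starting at (x, y), each in its own colour
concat.text<-function(x,y,txt,col) {
  thisx<-x
  for(txtstr in seq_along(txt)) {
   text(thisx,y,txt[txtstr],col=col[txtstr],adj=0)
   # advance by the width of the string just drawn (user coordinates)
   thisx<-thisx+strwidth(txt[txtstr])
  }
}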
plot(0,xlim=c(0,1),ylim=c(0,1),type="n")
ctext<-c("Roses are ","red, ","violets are ","purple")
concat.text(0,0.5,ctext,col=c("black","red","black","purple"))

Jim



Re: [R] Question about unicode characters in tcltk

2007-08-18 Thread Peter Dalgaard
R Help wrote:
> hello list,
>
> Can someone help me figure out why the following code doesn't work?
> I'm trying to put both Greek letters and subscripts into a tcltk menu.
>  The code creates all the mu's, and the 1 and 2 subscripts, but it
> won't create the 0.  Is there a certain set of characters that R won't
> recognize the unicode for?  Or am I inputting the \u2080 incorrectly?
>
> library(tcltk)
> m <-tktoplevel()
> frame1 <- tkframe(m)
> frame2 <- tkframe(m)
> frame3 <- tkframe(m)
> entry1 <- tkentry(frame1,width=5,bg='white')
> entry2 <- tkentry(frame2,width=5,bg='white')
> entry3 <- tkentry(frame3,width=5,bg='white')
>
> tkpack(tklabel(frame1,text='\u03bc\u2080'),side='left')
> tkpack(tklabel(frame2,text='\u03bc\u2081'),side='left')
> tkpack(tklabel(frame3,text='\u03bc\u2082'),side='left')
>
> tkpack(frame1,entry1,side='top')
> tkpack(frame2,entry2,side='top')
> tkpack(frame3,entry3,side='top')
>
> thanks
> -- Sam
>
>   
Which OS was this? I can reproduce the issue on SuSE, but NOT Fedora 7.



[R] rmeta package forestplot() function

2007-08-18 Thread Christie Jeon
Dear R users,

I am trying to create a forest plot with a table of text using the
forestplot() function in the rmeta package.
I have been trying to reduce the font size of the resulting table of text,
but have not been successful.  I have tried adding options like 'cex' or
'font' but none of them seem to work.  Is there anything I could do?

This is what I have so far:

forestplot(tabletext,m,l,u,zero=0,
           is.summary=c(TRUE,TRUE,rep(FALSE,14),TRUE),
           clip=c(log(0.5), log(15)), xlog=TRUE,
           graphwidth = unit(4, "inches"), xticks=c(0.5, 1, 2, 4, 8,15),
           col=meta.colors(box="royalblue",line="darkblue", summary="royalblue"))

Thanks.

Christie



Re: [R] residual plots for lmer in lme4 package

2007-08-18 Thread Gregor Gorjanc
John Maindonald  anu.edu.au> writes:
...
> The issue of checking for normality of effects in multi-level
> models has not been very much researched, as far as I can
> tell.  The function residuals()  gives residuals that adjust for
> all except the "highest" level of random effects.  Depending
> on the relative magnitudes of the variance components,
> whether or not these "residuals" are anywhere near normal
> may not be of much or any consequence.

For what it is worth, I came across this paper just recently:

http://www3.interscience.wiley.com/cgi-bin/abstract/114280441

Gregor



Re: [R] Question about unicode characters in tcltk

2007-08-18 Thread Gavin Simpson
On Sat, 2007-08-18 at 14:40 +0200, Peter Dalgaard wrote:
> R Help wrote:
> > hello list,
> >
> > Can someone help me figure out why the following code doesn't work?
> > I'm trying to put both Greek letters and subscripts into a tcltk menu.
> >  The code creates all the mu's, and the 1 and 2 subscripts, but it
> > won't create the 0.  Is there a certain set of characters that R won't
> > recognize the unicode for?  Or am I inputting the \u2080 incorrectly?
> >
> > library(tcltk)
> > m <-tktoplevel()
> > frame1 <- tkframe(m)
> > frame2 <- tkframe(m)
> > frame3 <- tkframe(m)
> > entry1 <- tkentry(frame1,width=5,bg='white')
> > entry2 <- tkentry(frame2,width=5,bg='white')
> > entry3 <- tkentry(frame3,width=5,bg='white')
> >
> > tkpack(tklabel(frame1,text='\u03bc\u2080'),side='left')
> > tkpack(tklabel(frame2,text='\u03bc\u2081'),side='left')
> > tkpack(tklabel(frame3,text='\u03bc\u2082'),side='left')
> >
> > tkpack(frame1,entry1,side='top')
> > tkpack(frame2,entry2,side='top')
> > tkpack(frame3,entry3,side='top')
> >
> > thanks
> > -- Sam
> >
> >   
> Which OS was this? I can reproduce the issue on SuSE, but NOT Fedora 7.

I can reproduce this on Fedora 7, in that the \u2080 is reproduced as is
and not as a subscript, unlike the other \u escapes, which appear as
subscripted characters:

> sessionInfo()
R version 2.5.1 Patched (2007-08-02 r42389) 
i686-pc-linux-gnu 

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] "tcltk"     "stats"     "graphics"  "grDevices" "utils"     "datasets"
[7] "methods"   "base"

If there is something specific to my Fedora installation that is
different to Peter's that I can ascertain from installed packages/fonts
etc, then let me know and I can provide the output from my laptop.

G

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%



Re: [R] doubt about string comparison

2007-08-18 Thread Charles C. Berry

rk,

See

?sub

?regexp

Then try

sub( "-", "-dash-", "a-b" )
sub( "-.*", "", "a-b" )
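
The second call above extends directly to whole vectors, since sub() is vectorised. A small sketch, assuming the names sit in two character vectors; the example values beyond those quoted in the question are made up so that the intersection is non-empty:

```r
# Hypothetical data: array1 holds bare abbreviations,
# array2 holds "ABBREVIATION-description" strings
array1 <- c("GAP4", "MFG12", "CFH1A", "SNRPD")
array2 <- c("HIST1B-histamine",
            "SNRPD-signal induced...",
            "RNF-ribose nucleic...",
            "GAP4-hypothetical description")

# Drop everything from the first hyphen onwards
abbrev2 <- sub("-.*", "", array2)

# Names present in both vectors
common <- intersect(array1, abbrev2)
print(common)
```

If an abbreviation can itself contain a hyphen, the pattern would need adjusting, because "-.*" cuts at the first hyphen it finds.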

Chuck

On Sat, 18 Aug 2007, ramakanth reddy wrote:

> I have two large arrays of strings, array1 with 18 names and array2 with
> 24000 names, and I want to find the names common to both.
>
> For example, my arrays look like this:
>
> Array1   Array2
>
> GAP4     HIST1B-histamine
> MFG12    SNRPD-signal induced...
> CFH1A    RNF-ribose nucleic...
>
> Each entry in array2 carries a description after the abbreviation. How can
> I remove the description before using the intersect command to match the
> common names in array1 and array2? The abbreviation and the description
> are separated by a hyphen (-).
>
> I tried using MATLAB, but it did not work because of the large file size.
> Can you suggest a way to overcome this problem?
>
> Thank You
> rk
>
>
>
>
>

Charles C. Berry(858) 534-2098
 Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED]  UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901



[R] SVD for sparse matrices

2007-08-18 Thread Oliver Lyttelton
Dear All,

I wish to compute the SVD of a matrix M which represents n rows of
observations of p column measurements. The p measurements are sampled
from a 2 dimensional surface, such that meaningful "neighbourhood"
relationships exist between the p measurement columns.

p is too large to compute the covariance matrix directly (10,000
columns). However, by using neighbourhood information, many of the
entries (i,j) in the covariance matrix can be "masked" to zero (on
the basis that points i and j lie too far from each other to be
considered). I will call this binary mask matrix B.

So now I have my original matrix M with n observations of p
measurements, and a sparse binary mask matrix B, with
sum(B==0)>>sum(B==1), where B[i,j]==0 indicates that the relationship
between measurements i and j is considered uninteresting.

I wish to compute the SVD of the covariance matrix t(M)%*%M, which is
too large (10,000^2 entries). However, only a small subset of these
entries are non-zero after applying the mask B.

I have tried to read the documentation accompanying the SparseM and
Matrix packages, which include sparse representations, but am stuck.

Given the original matrix M, and a function areNeighbours(i,j) which
returns TRUE if B[i,j]==1, can anyone furnish me with an example
of how to do this? I would be extremely grateful.
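
One possible shape for this, sketched with the Matrix package on a deliberately tiny problem. The sizes, the random M and the body of areNeighbours() are assumptions made for illustration (here: columns at most 2 positions apart, and vectorised over j). Only the neighbouring cross-products are computed and stored; note that a masked cross-product matrix is not guaranteed to stay positive semi-definite.

```r
library(Matrix)   # recommended package shipped with R

set.seed(1)
n <- 20
p <- 50
M <- matrix(rnorm(n * p), n, p)

# Hypothetical neighbourhood rule standing in for the real one
areNeighbours <- function(i, j) abs(i - j) <= 2

# Collect neighbouring pairs (i <= j) without ever forming a p x p mask
idx <- do.call(rbind, lapply(seq_len(p), function(i) {
  j <- i:p
  j <- j[areNeighbours(i, j)]
  cbind(i = i, j = j)
}))

# Cross-products t(M) %*% M, but only for the retained pairs.
# For large problems this step would be done in chunks of pairs.
vals <- colSums(M[, idx[, "i"], drop = FALSE] * M[, idx[, "j"], drop = FALSE])

# Sparse symmetric matrix built from its upper triangle
C <- sparseMatrix(i = idx[, "i"], j = idx[, "j"], x = vals,
                  dims = c(p, p), symmetric = TRUE)

# Dense SVD is only reasonable at this toy size
s <- svd(as.matrix(C))
print(head(s$d))
```

At p = 10,000 the last step would have to be replaced by a solver that works on the sparse matrix directly and returns only the leading singular values; which one to use is left open here.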

With thanks,

Oliver Lyttelton (PhD student)

PS (I appreciate that I could compute the SVD of t(M) if n

Re: [R] several plots on several pages - bug in par(mfg())?

2007-08-18 Thread ONKELINX, Thierry
Dear Rainer,

You could try something like this.

test <- try( plot(runif(ff)) )
if(inherits(test, "try-error")){
  # the plot failed: draw an empty frame instead, e.g.
  plot.new()
}

Cheers,

Thierry


ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature
and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics,
methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
[EMAIL PROTECTED]
www.inbo.be 

Do not put your faith in what statistics say until you have carefully
considered what they do not say.  ~William W. Watt
A statistical analysis, properly conducted, is a delicate dissection of
uncertainties, a surgery of suppositions. ~M.J.Moroney

 

> -Original message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On behalf of Rainer M. Krug
> Sent: Friday 17 August 2007 9:49
> To: Greg Snow
> CC: r-help; [EMAIL PROTECTED]
> Subject: Re: [R] several plots on several pages - bug in par(mfg())?
> 
> Greg Snow wrote:
> > Oops, I read further down in your original post and see that you 
> > already knew about par(mfg=c(2,1)).  To get it to advance to page 2 
> > for the 4th plot try calling plot.new() which should move you to the 
> > next page, then doing par(mfg=c(1,1)) should cause the next graph to
> > be at the top.
> > 
> > Hope this helps,
> > 
> 
> Thanks - I found plot.new() and it is working.
> 
> But: If the first plot command fails, par(mfg=c(2,1)) does 
> NOT move to the second one - if you try the code below, you will see.
> 
> Is this a bug or am I doing something wrong?
> 
> ## Set layout to three rows and only one column
> par( mfcol=c(3,1), oma=c(0,0,0,0), mar=c(4, 4, 2, 2) )
> 
> ## First row
> par(mfg=c(1,1))
> try( plot(runif(ff)) ) ## plot fails due to something.
> 
> ## Second row
> par(mfg=c(2,1))
> try( plot(runif(100)) ) ##actually is plotted in first row
> 
> ## Third row
> par(mfg=c(3,1))
> plot(runif(1000))   ## plotted in third row
> 
> 
> --
> NEW EMAIL ADDRESS AND ADDRESS:
> 
> [EMAIL PROTECTED]
> 
> [EMAIL PROTECTED] WILL BE DISCONTINUED END OF MARCH
> 
> Rainer M. Krug, Dipl. Phys. (Germany), MSc Conservation Biology (UCT)
> 
> Plant Conservation Unit
> Department of Botany
> University of Cape Town
> Rondebosch 7701
> South Africa
> 
> Tel:  +27 - (0)21 650 5776 (w)
> Fax:  +27 - (0)86 516 2782
> Fax:  +27 - (0)21 650 2440 (w)
> Cell: +27 - (0)83 9479 042
> 
> Skype:RMkrug
> 
> email:[EMAIL PROTECTED]
>   [EMAIL PROTECTED]
> 


Re: [R] Any parser generator / code assistance for R?

2007-08-18 Thread Martin Maechler
> "A-" == Ali - <[EMAIL PROTECTED]>
> on Sat, 18 Aug 2007 00:40:52 +0100 writes:

A-> Hi,
A-> Is there any parser generator like www.antlr.org?  Moreover, how does
A-> simple code assistance work currently in R? By 'simple code assistance'
A-> I meant things like:

A-> Object$M --> Object$Method

If you really meant a list with components
or an S4 object with slots,
such code completion works at least since R 2.5.1, because of
the recent 'rcompletion' extensions of Deepayan Sarkar,
and of course in ESS (Emacs Speaks Statistics),
and I think in several other GUI/Environments as well.

But if you are thinking OOP as in Java or C++ (and I think you
*are* thinking along that way), then rather learn
that S (and hence R) does OOP in a function-centric rather than class-centric
way; something which seems to be quite hard to grasp for many
who have been brought up in Java-like schools.
If you are still interested in R, look out for documents with
"S4" (or "formal methods and classes") and "R" in the title  ;-)
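
To make the function-centric point concrete, here is a small S4 sketch. The class "Account" and the generic deposit() are invented for illustration. The method is not a member of the object: it belongs to the generic function, which dispatches on the class of its argument, so the call reads deposit(acc, 50) rather than acc$deposit(50).

```r
library(methods)

# A class only describes data (slots); it owns no methods
setClass("Account", representation(balance = "numeric"))

# The generic function is the central object ...
setGeneric("deposit", function(obj, amount) standardGeneric("deposit"))

# ... and methods are attached to it, one per class signature
setMethod("deposit", "Account", function(obj, amount) {
  obj@balance <- obj@balance + amount
  obj                      # objects are values: return the modified copy
})

acc <- new("Account", balance = 100)
acc <- deposit(acc, 50)
print(acc@balance)
```

Because objects are copied rather than modified in place, the result has to be assigned back, which is another difference from the Java-style habit.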

Regards,
Martin



Re: [R] Suspected memory leak with R v.2.5.x and large matrices with dimnames set

2007-08-18 Thread Seth Falcon
Hi Peter,

Peter Waltman <[EMAIL PROTECTED]> writes:
>Admittedly,  this  may  not be the most sophisticated memory profiling
>performed,  but  when using unix's top command, I'm noticing a notable
>memory leak when using R with a large matrix that has dimnames
>set.

I'm not sure I understand what you are reporting.  One thing to keep
in mind is that how memory released by R is handled is OS dependent
and one will often observe that after R frees some memory, the OS does
not report that amount as now free.

Is what you are observing preventing you from getting things done, or
just a concern that there is a leak that needs fixing?  It is worth
noting that the internal handling of character vectors has changed in
R-devel, so IMO testing there would make sense before pursuing this
further; I suspect your results will be different.
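
A rough way to look at this from inside R rather than from top, sketched on an assumed 1000 x 1000 matrix (the sizes and names are illustrative). object.size() gives R's own estimate of an object's footprint and gc() reports what R currently has in use; the figure shown by top is the process size as seen by the OS and need not shrink when R frees memory.

```r
m <- matrix(0, nrow = 1000, ncol = 1000)
size_plain <- object.size(m)

# Attach alphanumeric row and column names
dimnames(m) <- list(paste0("row", seq_len(1000)),
                    paste0("col", seq_len(1000)))
size_named <- object.size(m)

print(size_plain, units = "Mb")
print(size_named, units = "Mb")

# Release the object and ask R to collect; gc() returns a usage table
rm(m)
usage <- gc()
print(usage)
```

Repeating the allocation several times in one session and watching whether the gc() figures keep growing is a simple check for a genuine leak.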

+ seth

-- 
Seth Falcon | Computational Biology | Fred Hutchinson Cancer Research Center
BioC: http://bioconductor.org/
Blog: http://userprimary.net/user/



[R] Question about lme and AR(1)

2007-08-18 Thread Francisco Redelico
Dear R users,


As far as I know, the EM algorithm can only be applied to estimate the
parameters of a regular exponential family.
A multivariate normal distribution with an AR(1) covariance matrix does not
belong to a regular exponential family; it belongs to a curved exponential
family, so the EM algorithm cannot be applied to estimate the parameters of
this kind of distribution.
I have used the lme function from the nlme package to estimate variance
components with correlation=corAR1(). This function first uses the EM
algorithm to refine the initial estimates of the random effects
variance-covariance coefficients and then passes them to a Newton-Raphson
algorithm.

Does anyone know what kind of modification of the EM algorithm the lme
function uses to solve the problem mentioned above?

Thank you in advance for your help

Francisco
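
For reference, a sketch of the kind of call being described, assuming the Ovary data set that ships with nlme and a model in the style of the Pinheiro and Bates examples. The niterEM setting in lmeControl() is documented as the number of EM iterations used to refine the initial estimates of the random effects variance-covariance coefficients; how the AR(1) parameter itself is treated during those iterations is exactly the open question here, and the sketch does not answer it.

```r
library(nlme)   # recommended package shipped with R

# Random effects plus an AR(1) within-group correlation structure
fit <- lme(follicles ~ sin(2 * pi * Time) + cos(2 * pi * Time),
           data = Ovary,
           random = pdDiag(~ sin(2 * pi * Time)),
           correlation = corAR1(),
           control = lmeControl(niterEM = 25))   # 25 is the documented default

# Estimated AR(1) parameter on its natural (-1, 1) scale
phi <- coef(fit$modelStruct$corStruct, unconstrained = FALSE)
print(phi)
```

Refitting with a different niterEM and comparing the estimates is one empirical way to see how much the EM phase matters for a given data set.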





[R] number precision

2007-08-18 Thread pieterprovoost
Hi,

I'm trying to find a way to determine how many digits a number has. I tried 
using nchar(paste(number)), but unfortunately paste will reduce 8.00 to "8".

Any thoughts?
Pieter
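
A short sketch of why this happens and two ways around it, assuming the numbers can be kept or read as character strings; the helper count_decimals() is made up for illustration. A numeric 8.00 is stored simply as 8, so the trailing zeros exist only in the text it was typed or read from.

```r
x <- 8.00
print(nchar(paste(x)))         # the stored value prints as "8"

# Formatting can put a fixed number of decimals back, but cannot recover
# how many were originally written
print(format(x, nsmall = 2))
print(sprintf("%.2f", x))

# If the original text is available, count the digits after the point
count_decimals <- function(s) {
  ifelse(grepl(".", s, fixed = TRUE), nchar(sub(".*\\.", "", s)), 0L)
}
print(count_decimals(c("8.00", "3.14159", "42")))
```

When the values come from a file, reading that column as character (for example with colClasses = "character" in read.table) keeps the written precision available.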



Re: [R] names not inherited in functions

2007-08-18 Thread david dav
Thank you to all.
And thank you for the extra tips. I had a kind of feeling my
 "names(data.frame(var))"
would seem awkward!

David



Re: [R] Suspected memory leak with R v.2.5.x and large matrices with dimnames set

2007-08-18 Thread Peter Waltman
Hi Seth -

Thanks for the follow up.  I'll definitely check out the devel version 
at some point since while I've come up with a workaround, this is 
causing problems for me as it uses up so much memory on some systems 
that R starts throwing malloc errors and has to be killed from the 
command line.  The machine I'm thinking of in particular is a MacOS 
machine with 8 gigs of memory.

Also, having the row and column names set to alphanumeric names causes 
the processing to slow down significantly - by as much as a factor of 10 
(or more).

As for your speculation that the memory released by R may not be 
recognized as being freed by the OS, as a further test, I re-ran my 
code snippet three consecutive times w/in the same R interpreter 
window.  In theory, if there were a memory leak, after the first run 
(resulting in a memory stamp of 2 gig), the subsequent runs would 
further increase R's memory stamp, i.e. up to 4 after the second, and 6 
for the 3rd.

This didn't happen, and R's stamp remained at 2 gig, so I can only 
assume that you're correct and I was wrong about a leak. 

Still, it's quite the memory hog when using dimnames, so I'll have to 
avoid those for now and will try the devel version you mentioned.

Thanks and have a good weekend,

Peter

Seth Falcon wrote:
> Hi Peter,
>
> Peter Waltman <[EMAIL PROTECTED]> writes:
>   
>>Admittedly,  this  may  not be the most sophisticated memory profiling
>>performed,  but  when using unix's top command, I'm noticing a notable
>>memory leak when using R with a large matrix that has dimnames
>>set.
>> 
>
> I'm not sure I understand what you are reporting.  One thing to keep
> in mind is that how memory released by R is handled is OS dependent
> and one will often observe that after R frees some memory, the OS does
> not report that amount as now free.
>
> Is what you are observing preventing you from getting things done, or
> just a concern that there is a leak that needs fixing?  It is worth
> noting that the internal handling of character vectors has changed in
> R-devel, so IMO testing there would make sense before pursuing this
> further; I suspect your results will be different.
>
> + seth
>
>



[R] recommended combo of apps for new user?

2007-08-18 Thread Martin Brown
Hi there,

I would like some advice, not so much about how to use R, but about software
that I need to complement R.  I've rooted around in the FAQ's and done a few
searches on this mailing list but haven't quite found the perspective I
need.

I am an experienced data analyst in my field (forest ecology and ecological
monitoring) but new to R. I am a long time user of SPSS and have gotten
pretty handy with it.  However, I am frustrated with SPSS for several
reasons:  There's the cost (I'm a freelancer; I pay for my software
myself);  the Windows dependence (I use Kubuntu as my usual OS now, and
switching back and forth is a pain); the horrible inefficiency when I do
certain types of file manipulations; and the inability to do the kind of
publication-quality graphs I want... I've usually ended up using a
commercial graphing program (another source of expense and limitation).

I'd like to switch to using R on Kubuntu, for all those reasons.  In
addition I think the mathematical formality that R encourages might be good
for me.

However, reviewing the FAQ's on the R project web site makes me realize that
I've been using SPSS as three kinds of software really:  a DBMS; a
statistical analysis package; and a graphing package.  It looks like moving
to R might involve learning three kinds of software, not just one.  I
wonder:

1) What open-source DBMS works most seamlessly with R?  I have seen MySQL
recommended but wonder if there are alternatives.  I sometimes need to
handle big data files.  In fact a lot of my work involves exploratory and
descriptive analyses of rather large and messy databases from ecological
monitoring, rather than statistical tests per se.  In SPSS the data files I
have been generating have dozens of columns and thousands of rows, often
with value and variable labels helpful for documenting my work.
2) For the purpose of creating publication-quality graphs, do R users
typically need to go outside of the R system? If so, what open-source
programs would you all recommend?
3) Any other software I need to learn that would make my work in R more
productive? (for example, a code editor).

Thank you for your time,

Martin J. Brown
Portland, Oregon



Re: [R] recommended combo of apps for new user?

2007-08-18 Thread Martin Brown
[i sent this message earlier but apparently should have sent it plain
text, as follows..]

Hi there,

I would like some advice, not so much about how to use R, but about
software that I need to complement R.  I've rooted around in the FAQ's
and done a few searches on this mailing list but haven't quite found
the perspective I need.

I am an experienced data analyst in my field (forest ecology and
ecological monitoring) but new to R. I am a long time user of SPSS and
have gotten pretty handy with it.  However, I am frustrated with SPSS
for several reasons:  There's the cost (I'm a freelancer; I pay for my
software myself);  the Windows dependence (I use Kubuntu as my usual
OS now, and switching back and forth is a pain); the horrible
inefficiency when I do certain types of file manipulations; and the
inability to do the kind of publication-quality graphs I want... I've
usually ended up using a commercial graphing program (another source
of expense and limitation).

I'd like to switch to using R on Kubuntu, for all those reasons.  In
addition I think the mathematical formality that R encourages might be
good for me.

However, reviewing the FAQ's on the R project web site makes me
realize that I've been using SPSS as three kinds of software really:
a DBMS; a statistical analysis package; and a graphing package.  It
looks like moving to R might involve learning three kinds of software,
not just one.  I wonder:

1) What open-source DBMS works most seamlessly with R?  I have seen
MySQL recommended but wonder if there are alternatives.  I sometimes
need to handle big data files.  In fact a lot of my work involves
exploratory and descriptive analyses of rather large and messy
databases from ecological monitoring, rather than statistical tests
per se.  In SPSS the data files I have been generating have dozens of
columns and thousands of rows, often with value and variable labels
helpful for documenting my work.
2) For the purpose of creating publication-quality graphs, do R users
typically need to go outside of the R system? If so, what open-source
programs would you all recommend?
3) Any other software I need to learn that would make my work in R
more productive? (for example, a code editor).

Thank you for your time,

Martin J. Brown
Portland, Oregon



Re: [R] Problem Connecting to Oracle with R from Windows XP

2007-08-18 Thread Adrian Dragulescu
You also need the ROracle package.
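
For what it's worth, the error in the quoted message arises because dbDriver("Oracle") looks for a function called Oracle, and DBI on its own does not supply one; it comes from ROracle. A minimal sketch, assuming ROracle has been installed against the Oracle client libraries. The helper name connect_oracle and the credentials in the comment are placeholders, not anything from the original post:

```r
# Sketch only: connect_oracle() is a hypothetical helper, not part of DBI or ROracle.
connect_oracle <- function(user, password, dbname) {
  if (!requireNamespace("ROracle", quietly = TRUE)) {
    stop("ROracle is not installed; DBI alone does not provide the Oracle driver")
  }
  drv <- ROracle::Oracle()   # the driver object that dbDriver("Oracle") would return
  DBI::dbConnect(drv, username = user, password = password, dbname = dbname)
}

# Placeholder usage (needs a reachable database and real credentials):
# con <- connect_oracle("some_user", "some_password", "some_tns_alias")
# DBI::dbDisconnect(con)
```

No ODBC configuration should be needed for this route; ODBC only comes into play if RODBC is used instead.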

On 8/15/07, Song, Alex <[EMAIL PROTECTED]> wrote:
>
> Hello,
>
> I installed RGui 2.5.1 and package DBI on Windows XP and tried to connect
> to Oracle database which is on a Linux server. When I tried to use
> dbDriver("Oracle"), I got an error as follows:
>
> > drv <- dbDriver("Oracle")
> Error in do.call(as.character(drvName), list(...)) :
> could not find function "Oracle"
> >
>
> Could anyone tell me how to connect to Oracle with R from Windows XP? Do I
> need to configure any environment variables? Do I need to configure ODBC? Do
> I need to install any other packages? I have Oracle client 10g installed on
> my computer and I can connect to the Oracle database using other client
> software like Toad, SQLPlus, or SQL Developer.
>
> Thank you very much.
>
> Alex
>
> Alex Song
> Data Management Specialist / Spécialiste en gestion des données
> National Forest Inventory / Inventaire forestier national
> Pacific Forestry Centre / Centre de foresterie du Pacifique
> Canadian Forest Service / Service canadien des forêts
> Natural Resources Canada / Ressources naturelles Canada
> Government of Canada / Gouvernement du Canada
> 506 West Burnside Road / 506 chemin Burnside ouest
> Victoria, BC V8Z 1M5
> Phone (250) 363-3342
> Facs: (250) 363-0775
>
> Email: [EMAIL PROTECTED]



[R] Installing Rstem on Mac Intel

2007-08-18 Thread Walter Rojas
Hi all.

How do I install Rstem on Mac OS X with an Intel processor? I need
Rstem in order to run the lsa package.

When I run the following command in the R interface:

install.packages("Rstem", repos = "http://www.omegahat.org/R", type = "source")

I get a whole bunch of errors, most of which refer to the dylib
(see below).

Is there an Rstem package available for Intel Macs?

  Thanks in advance!
Tina.

...
mytestpointer targets in passing argument 1 of 'Rf_mkChar' differ in  
signedness
ld: warning can't open dynamic library: /Developer/SDKs/ 
MacOSX10.4u.sdk/Library/Frameworks/R.framework/Versions/2.5/Resources/ 
lib/libRblas.dylib referenced from: /Library/Frameworks/ 
R.framework/../R.framework/R (checking for undefined symbols may be  
affected) (No such file or directory, errno = 2)
ld: warning can't open dynamic library: /Developer/SDKs/ 
MacOSX10.4u.sdk/Library/Frameworks/R.framework/Versions/2.5/Resources/ 
lib/libgfortran.2.dylib referenced from: /Library/Frameworks/ 
R.framework/../R.framework/R (checking for undefined symbols may be  
affected) (No such file or directory, errno = 2)
ld: warning can't open dynamic library: /Developer/SDKs/ 
MacOSX10.4u.sdk/Library/Frameworks/R.framework/Versions/2.5/Resources/ 
lib/libgcc_s.1.dylib referenced from: /Library/Frameworks/ 
R.framework/../R.framework/R (checking for undefined symbols may be  
affected) (No such file or directory, errno = 2)
ld: warning can't open dynamic library: /Developer/SDKs/ 
MacOSX10.4u.sdk/Library/Frameworks/R.framework/Versions/2.5/Resources/ 
lib/libreadline.5.2.dylib referenced from: /Library/Frameworks/ 
R.framework/../R.framework/R (checking for undefined symbols may be  
affected) (No such file or directory, errno = 2)
ld: Undefined symbols:
___gcc_qadd referenced from libR expected to be defined in /Library/ 
Frameworks/R.framework/Versions/2.5/Resources/lib/libgcc_s.1.dylib
___gcc_qdiv referenced from libR expected to be defined in /Library/ 
Frameworks/R.framework/Versions/2.5/Resources/lib/libgcc_s.1.dylib
_dgemm_ referenced from libR expected to be defined in /Library/ 
Frameworks/R.framework/Versions/2.5/Resources/lib/libRblas.dylib
_dsyrk_ referenced from libR expected to be defined in /Library/ 
Frameworks/R.framework/Versions/2.5/Resources/lib/libRblas.dylib
_zgemm_ referenced from libR expected to be defined in /Library/ 
Frameworks/R.framework/Versions/2.5/Resources/lib/libRblas.dylib
___gcc_qmul referenced from libR expected to be defined in /Library/
Frameworks/R.framework/Versions/2.5/Resources/lib/libgcc_s.1.dylib
Warning message:
installation of package 'Rstem' had non-zero exit status in:
install.packages("Rstem", repos = "http://www.omegahat.org/R",
.o norwegian_stem.o portuguese_stem.o russian_stem.o spanish_stem.o
swedish_stem.o utilities.o   -F/Library/Frameworks/R.framework/.. -
framework R
** Removing '/Library/Frameworks/R.framework/Versions/2.5/Resources/
library/Rstem'
___gcc_qsub referenced from libR expected to be defined in /Library/ 
Frameworks/R.framework/Versions/2.5/Resources/lib/libgcc_s.1.dylib
___floatdidf referenced from libR expected to be defined in /Library/ 
Frameworks/R.framework/Versions/2.5/Resources/lib/libgcc_s.1.dylib
_dcopy_ referenced from libR expected to be defined in /Library/ 
Frameworks/R.framework/Versions/2.5/Resources/lib/libRblas.dylib
_dtrsm_ referenced from libR expected to be defined in /Library/ 
Frameworks/R.framework/Versions/2.5/Resources/lib/libRblas.dylib
_daxpy_ referenced from libR expected to be defined in /Library/ 
Frameworks/R.framework/Versions/2.5/Resources/lib/libRblas.dylib
_dswap_ referenced from libR expected to be defined in /Library/ 
Frameworks/R.framework/Versions/2.5/Resources/lib/libRblas.dylib
_ddot_ referenced from libR expected to be defined in /Library/ 
Frameworks/R.framework/Versions/2.5/Resources/lib/libRblas.dylib
_dasum_ referenced from libR expected to be defined in /Library/ 
Frameworks/R.framework/Versions/2.5/Resources/lib/libRblas.dylib
_dscal_ referenced from libR expected to be defined in /Library/ 
Frameworks/R.framework/Versions/2.5/Resources/lib/libRblas.dylib
_dnrm2_ referenced from libR expected to be defined in /Library/ 
Frameworks/R.framework/Versions/2.5/Resources/lib/libRblas.dylib
_drot_ referenced from libR expected to be defined in /Library/ 
Frameworks/R.framework/Versions/2.5/Resources/lib/libRblas.dylib
_drotg_ referenced from libR expected to be defined in /Library/ 
Frameworks/R.framework/Versions/2.5/Resources/lib/libRblas.dylib
_add_history referenced from libR expected to be defined in /Library/ 
Frameworks/R.framework/Versions/2.5/Resources/lib/libreadline.5.2.dylib
_clear_history referenced from libR expected to be defined in / 
Library/Frameworks/R.framework/Versions/2.5/Resources/lib/libreadline. 
5.2.dylib
_history_truncate_file referenced from libR expected to be defined  
in /Library/Frameworks/R.framework/Versions/2.5/Resources/lib/ 
libreadline.5.2.dylib
_read_history 

Re: [R] Suspected memory leak with R v.2.5.x and large matrices with dimnames set

2007-08-18 Thread Luke Tierney
Seth's analysis is correct.  R does return what it can to the malloc
system by calling free.  When and how much memory malloc releases back
to the OS varies with OS and malloc system and also depends on the
sizes of allocations. R currently allocates its memory for small
objects in pages of about 2K.  On Mac OS X if that is increased to
about 16K then much more is returned to the OS. On Linux (Fedora 7 on
i386 at least) the amount would have to be pushed up to around 2M to
make a difference. Increasing page size reduces R's ability to release
pages, so an increase to that level would probably not be a good idea.

Whether or not malloc releases memory back to the OS shouldn't make
much difference to a single R process; it might come into play if you
are trying to run multiple memory-intensive processes on the same
machine, though even that may vary among OS/malloc systems.

The changes Seth mentions are not likely to help in this case. They
are primarily intended to improve performance when there are many
non-unique character vectors; there is additional overhead for many
unique vectors, which we will try to reduce over time.
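
To see the distinction between what R itself holds and what the OS reports, one can compare object.size() and gc() from inside R with the figure shown by top. A rough sketch with a small, made-up matrix; it is only meant to illustrate the bookkeeping and does not reproduce the reported case:

```r
# Illustrative only: a 1000 x 1000 numeric matrix, with and without dimnames.
m <- matrix(0, nrow = 1000, ncol = 1000)
size_plain <- object.size(m)

dimnames(m) <- list(paste("row", 1:1000, sep = ""),
                    paste("col", 1:1000, sep = ""))
size_named <- object.size(m)

print(size_plain, units = "Mb")
print(size_named, units = "Mb")

rm(m)
invisible(gc())  # R frees the matrix here; whether top shows a drop
                 # depends on the OS and its malloc, as described above
```

If the process size seen in top stays flat across repeated runs, as Peter found, that points to memory being reused rather than leaked.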

Best,

luke

On Sat, 18 Aug 2007, Peter Waltman wrote:

> Hi Seth -
>
> Thanks for the follow up.  I'll definitely check out the devel version
> at some point since while I've come up with a workaround, this is
> causing problems for me as it uses up so much memory on some systems
> that R starts throwing malloc errors and has to be killed from the
> command line.  The machine I'm thinking of in particular is a MacOS
> machine with 8 gigs of memory.
>
> Also, having the row and column names set to alphanumeric names causes
> the processing to slow down significantly - as much as by a power of 10
> (or more).
>
> As for your speculation that the memory released by R may not be
> recognized as being free'd by the OS, as a further test, I re-ran my
> code snippet three consecutive times w/in the same R interpreter
> window.  In theory, if there were a memory leak, after the first run
> (resulting in a memory stamp of 2 gig), the subsequent runs would
> further increase R's memory stamp, i.e. up to 4 after the second, and 6
> for the 3rd.
>
> This didn't happen, and R's stamp remained at 2 gig, so I can only
> assume that you're correct and I was wrong about a leak.
>
> Still, it's quite the memory hog when using dimnames, so I'll have to
> avoid those for now and will try the devel version you mentioned.
>
> Thanks and have a good weekend,
>
> Peter
>
> Seth Falcon wrote:
>> Hi Peter,
>>
>> Peter Waltman <[EMAIL PROTECTED]> writes:
>>
>>> Admittedly, this may not be the most sophisticated memory profiling
>>> performed, but when using unix's top command, I'm noticing a notable
>>> memory leak when using R with a large matrix that has dimnames
>>> set.
>>>
>>
>> I'm not sure I understand what you are reporting.  One thing to keep
>> in mind is that how memory released by R is handled is OS dependent
>> and one will often observe that after R frees some memory, the OS does
>> not report that amount as now free.
>>
>> Is what you are observing preventing you from getting things done, or
>> just a concern that there is a leak that needs fixing?  It is worth
>> noting that the internal handling of character vectors has changed in
>> R-devel and so IMO testing there would make sense before pursuing this
>> further; I suspect your results will be different.
>>
>> + seth
>>
>>
>

-- 
Luke Tierney
Chair, Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:  319-335-3386
Department of Statistics and        Fax:    319-335-3017
   Actuarial Science
241 Schaeffer Hall                  email:  [EMAIL PROTECTED]
Iowa City, IA 52242                 WWW:    http://www.stat.uiowa.edu



Re: [R] number precision

2007-08-18 Thread Duncan Murdoch
[EMAIL PROTECTED] wrote:
> Hi,
>
> I'm trying to find a way to determine how many digits a number has. I tried 
> using nchar(paste(number)), but unfortunately paste will reduce 8.00 to "8".
>   

I think your problem is that "the number of digits a number has" is not 
a property of the number (since the numbers 8.00 and 8 are the same 
number).  You need to keep track of the digits in some other way.  One 
possibility is to store two numbers:  the lowest possible value and the 
highest possible value.  Then 8.00 would be stored as (7.995,  8.005).  
It's then possible (but not easy) to propagate these ranges through 
transformations.
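
The range idea can be sketched in a few lines. This is only an illustration under simple assumptions: the number arrives as text in plain decimal notation (no exponents), and the helper names as_interval and add_intervals are made up for the example:

```r
# Hypothetical helper: turn the text of a number into the range of values
# that would round to it, e.g. "8.00" -> (7.995, 8.005).
as_interval <- function(text) {
  decimals <- nchar(sub("^[^.]*\\.?", "", text))  # digits after the decimal point
  half <- 0.5 * 10^(-decimals)
  value <- as.numeric(text)
  c(lower = value - half, upper = value + half)
}

# Propagating through addition: lower bounds add, upper bounds add.
add_intervals <- function(a, b) {
  c(lower = a[["lower"]] + b[["lower"]],
    upper = a[["upper"]] + b[["upper"]])
}

x <- as_interval("8.00")   # narrow range
y <- as_interval("8")      # much wider range: 7.5 to 8.5
total <- add_intervals(x, y)
print(total)
```

Addition is the easy case; multiplication, division and nonlinear functions need more care, which is the "not easy" part.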

Duncan Murdoch
> Any thoughts?
> Pieter



Re: [R] recommended combo of apps for new user?

2007-08-18 Thread Duncan Murdoch
Martin Brown wrote:
> [i sent this message earlier but apparently should have sent it plain
> text, as follows..]
>
> Hi there,
>
> I would like some advice, not so much about how to use R, but about
> software that I need to complement R.  I've rooted around in the FAQ's
> and done a few searches on this mailing list but haven't quite found
> the perspective I need.
>
> I am an experienced data analyst in my field (forest ecology and
> ecological monitoring) but new to R. I am a long time user of SPSS and
> have gotten pretty handy with it.  However, I am frustrated with SPSS
> for several reasons:  There's the cost (I'm a freelancer; I pay for my
> software myself);  the Windows dependence (I use Kubuntu as my usual
> OS now, and switching back and forth is a pain); the horrible
> inefficiency when I do certain types of file manipulations; and the
> inability to do the kind of publication-quality graphs I want... I've
> usually ended up using a commercial graphing program (another source
> of expense and limitation).
>
> I'd like to switch to using R on Kubuntu, for all those reasons.  In
> addition I think the mathematical formality that R encourages might be
> good for me.
>
> However, reviewing the FAQ's on the R project web site makes me
> realize that I've been using SPSS as three kinds of software really:
> a DBMS; a statistical analysis package; and a graphing package.  It
> looks like moving to R might involve learning three kinds of software,
> not just one.  I wonder:
>
> 1) What open-source DBMS works most seamlessly with R?  I have seen
> MySQL recommended but wonder if there are alternatives.  I sometimes
> need to handle big data files.  In fact a lot of my work involves
> exploratory and descriptive analyses of rather large and messy
> databases from ecological monitoring, rather than statistical tests
> per se.  In SPSS the data files I have been generating have dozens of
> columns and thousands of rows, often with value and variable labels
> helpful for documenting my work.
>   

I think you won't find much difference in the R interface between MySQL, 
PostgreSQL, or SQLite.  The choice should be made based on the qualities 
of the database (and I don't know enough about the differences to give a 
recommendation.)
> 2) For the purpose of creating publication-quality graphs, do R users
> typically need to go outside of the R system? If so, what open-source
> programs would you all recommend?
>   
R is great for this, but you might need to go outside for some 
specialized stuff (e.g. medical imaging).

> 3) Any other software I need to learn that would make my work in R
> more productive? (for example, a code editor).

A lot of people are happy with ESS mode in Emacs.

Duncan Murdoch



[R] Problem with lsa package (data.frame) on Windows XP

2007-08-18 Thread Walter Rojas
Dear R team,

The following piece of code (to use the lsa package) works fine on my  
mac os x, but when I run the same code on Windows XP, it doesn't work  
any more.

### code:
library("lsa")
matrix1 = textmatrix("C:\\Documents and Settings\\tine stalmans.TINE.000\\LSA\\cuentos\\",
                     stemming=TRUE, language="spanish", minWordLength=2,
                     minDocFreq=1, stopwords=NULL, vocabulary=NULL)
print(matrix1, bag_lines = 3, bag_cols = 3)
matrix1 = lw_bintf(matrix1) * gw_idf(matrix1)
space = lsa(matrix1, dims = dimcalc_share())
as.textmatrix(space)

### the following line fails on windows XP
matrix2 = textmatrix("C:\\Documents and Settings\\tine stalmans.TINE.000\\LSA\\respuestas\\",
                     stemming=TRUE, language="spanish", minWordLength=2,
                     minDocFreq=1, stopwords=NULL, vocabulary=rownames(matrix1))
matrix2 = lw_bintf(matrix2)
matrix2fld = fold_in(matrix2, space)
r <- cor(matrix2fld[,"respId1.txt"], matrix2fld[,"respAl1.txt"],
         method = "pearson")
print(r)


An error occurs when creating the second textmatrix with the  
vocabulary of the first. The error I get is:

in data.frame(docs = basename(file), terms = names(tab), Freq = tab,  :
 arguments imply differing number of rows: 1, 0

When I change the vocabulary argument to NULL, it doesn't report this  
error any more; however, then the code will fail on the fold_in  
method further down.

I found another user who reported this same problem on-line; however,  
I didn't find any answers.

Thank you very much in advance for your reply.
Tine.



Re: [R] recommended combo of apps for new user?

2007-08-18 Thread John Kane
I'm just starting to get a grasp on how R works so
don't take my words too seriously but have a look at 
http://addictedtor.free.fr/graphiques/ for some idea
of what R can do for publication quality graphics.  It
is always possible that you might need another
graphics package as well but I think it unlikely.  

About databases I don't really know; however, you
might want to have a look at Frank Harrell's Hmisc
package for things like labels. It also includes SAS
and SPSS import functions, as does the foreign package.

I'd say you definitely need a code editor. I'm on
Windows and happy with Tinn-R, but for Linux something
like http://ess.r-project.org/ seems to be
recommended.


If you have not already found it 
Bob Muenchen's R for SAS and SPSS Users
http://oit.utk.edu/scc/RforSAS&SPSSusers.pdf may be
very helpful. 

--- Martin Brown <[EMAIL PROTECTED]> wrote:

> Hi there,
> 
> I would like some advice, not so much about how to
> use R, but about software
> that I need to complement R.  I've rooted around in
> the FAQ's and done a few
> searches on this mailing list but haven't quite
> found the perspective I
> need.
> 
> I am an experienced data analyst in my field (forest
> ecology and ecological
> monitoring) but new to R. I am a long time user of
> SPSS and have gotten
> pretty handy with it.  However, I am frustrated with
> SPSS for several
> reasons:  There's the cost (I'm a freelancer; I pay
> for my software
> myself);  the Windows dependence (I use Kubuntu as
> my usual OS now, and
> switching back and forth is a pain); the horrible
> inefficiency when I do
> certain types of file manipulations; and the
> inability to do the kind of
> publication-quality graphs I want... I've usually
> ended up using a
> commercial graphing program (another source of
> expense and limitation).
> 
> I'd like to switch to using R on Kubuntu, for all
> those reasons.  In
> addition I think the mathematical formality that R
> encourages might be good
> for me.
> 
> However, reviewing the FAQ's on the R project web
> site makes me realize that
> I've been using SPSS as three kinds of software
> really:  a DBMS; a
> statistical analysis package; and a graphing
> package.  It looks like moving
> to R might involve learning three kinds of software,
> not just one.  I
> wonder:
> 
> 1) What open-source DBMS works most seamlessly with
> R?  I have seen MySQL
> recommended but wonder if there are alternatives.  I
> sometimes need to
> handle big data files.  In fact a lot of my work
> involves exploratory and
> descriptive analyses of rather large and messy
> databases from ecological
> monitoring, rather than statistical tests per se. 
> In SPSS the data files I
> have been generating have dozens of columns and
> thousands of rows, often
> with value and variable labels helpful for
> documenting my work.
> 2) For the purpose of creating publication-quality
> graphs, do R users
> typically need to go outside of the R system? If so,
> what open-source
> programs would you all recommend?
> 3) Any other software I need to learn that would
> make my work in R more
> productive? (for example, a code editor).
> 
> Thank you for your time,
> 
> Martin J. Brown
> Portland, Oregon



Re: [R] number precision

2007-08-18 Thread pieterprovoost
I had the impression that when using read.table the number of digits was 
somehow preserved, but that doesn't make much sense of course. I just 
discovered I can assign types to columns while doing read.table (colClasses), 
so my problem is solved.
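
A small sketch of that approach, with made-up input: reading the column as character keeps the digits exactly as written, and a numeric copy can be made afterwards. The column name and values are invented for illustration:

```r
# Read the column as text so that trailing zeros survive.
input <- "value\n8.00\n12.5\n3\n"
d <- read.table(text = input, header = TRUE, colClasses = "character")

d$value                                            # "8.00" "12.5" "3"
decimals <- nchar(sub("^[^.]*\\.?", "", d$value))  # digits after the point
decimals                                           # 2 1 0

# Keep a numeric copy for calculations alongside the original text.
d$numeric_value <- as.numeric(d$value)
```

With a real file, the same colClasses argument goes in the usual read.table(file, ...) call.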

Thanks
Pieter



Re: [R] recommended combo of apps for new user?

2007-08-18 Thread Gabor Grothendieck
On 8/18/07, Martin Brown <[EMAIL PROTECTED]> wrote:
> Hi there,
>
> I would like some advice, not so much about how to use R, but about software
> that I need to complement R.  I've rooted around in the FAQ's and done a few
> searches on this mailing list but haven't quite found the perspective I
> need.
>
> I am an experienced data analyst in my field (forest ecology and ecological
> monitoring) but new to R. I am a long time user of SPSS and have gotten
> pretty handy with it.  However, I am frustrated with SPSS for several
> reasons:  There's the cost (I'm a freelancer; I pay for my software
> myself);  the Windows dependence (I use Kubuntu as my usual OS now, and
> switching back and forth is a pain); the horrible inefficiency when I do
> certain types of file manipulations; and the inability to do the kind of
> publication-quality graphs I want... I've usually ended up using a
> commercial graphing program (another source of expense and limitation).
>
> I'd like to switch to using R on Kubuntu, for all those reasons.  In
> addition I think the mathematical formality that R encourages might be good
> for me.

From a strictly language perspective, mathematical formality is pretty
far from R.  It's actually quite loose.  Underneath there are some Lisp/Scheme
ideas but you are not very close to that as a user.

>
> However, reviewing the FAQ's on the R project web site makes me realize that
> I've been using SPSS as three kinds of software really:  a DBMS; a
> statistical analysis package; and a graphing package.  It looks like moving
> to R might involve learning three kinds of software, not just one.  I
> wonder:
>
> 1) What open-source DBMS works most seamlessly with R?  I have seen MySQL
> recommended but wonder if there are alternatives.  I sometimes need to
> handle big data files.  In fact a lot of my work involves exploratory and
> descriptive analyses of rather large and messy databases from ecological
> monitoring, rather than statistical tests per se.  In SPSS the data files I
> have been generating have dozens of columns and thousands of rows, often
> with value and variable labels helpful for documenting my work.

Databases. SQLite is the easiest to install since it's embedded rather
than client/server so I would use that unless your application requires
client/server or other features of MySQL.  MySQL is probably the most
popular of the free databases so that would be the next one to go with.
If you intend to create a commercial application you might want to
consider Postgres instead of MySQL as the latter charges for
commercial implementations but Postgres does not.  Some heavy
Postgres users might feel that it should be considered after SQLite
rather than MySQL and there is a certain amount of arbitrariness here.
See the R packages RSQLite, RMySQL and DBI.  The R packages sqldf and
SQLiteDF are beginning to blur the boundary between R and the database.
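
As a rough idea of what the SQLite route looks like from R, here is a minimal sketch. It assumes the DBI and RSQLite packages are installed (it is skipped otherwise) and uses an in-memory database; the table and its values are made up:

```r
# Assumes DBI and RSQLite are installed; the example is skipped otherwise.
have_sqlite <- requireNamespace("DBI", quietly = TRUE) &&
  requireNamespace("RSQLite", quietly = TRUE)

if (have_sqlite) {
  # Made-up monitoring data, for illustration only.
  plots <- data.frame(site = c("A", "A", "B", "B"),
                      height = c(10, 14, 20, 22))

  con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")  # no server to set up
  DBI::dbWriteTable(con, "plots", plots)
  site_means <- DBI::dbGetQuery(
    con,
    "SELECT site, AVG(height) AS mean_height FROM plots GROUP BY site ORDER BY site"
  )
  DBI::dbDisconnect(con)
  print(site_means)
}
```

Replacing ":memory:" with a file name keeps the database on disk between sessions.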

> 2) For the purpose of creating publication-quality graphs, do R users
> typically need to go outside of the R system? If so, what open-source
> programs would you all recommend?

Graphics.  R should be ok.  Check out:
   http://cran.r-project.org/src/contrib/Views/Graphics.html
and also google for
   R Graphics Gallery
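
For print-quality output, the usual pattern is to draw straight into a vector device such as pdf(). A minimal sketch with base graphics; the file name, figure size and the plotted curve are arbitrary choices for illustration:

```r
out_file <- file.path(tempdir(), "figure1.pdf")

pdf(out_file, width = 5, height = 4)   # size in inches
x <- seq(0, 10, length.out = 50)
plot(x, sin(x), type = "l", xlab = "x", ylab = "sin(x)", las = 1)
dev.off()                              # closes the device and writes the file

file.exists(out_file)
```

The same code works with postscript() or png() in place of pdf() if a journal asks for another format.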

> 3) Any other software I need to learn that would make my work in R more
> productive? (for example, a code editor).
>

Other.  You need to know a text editor.  I use vim but there are
many good choices here with ESS being one that is often mentioned.

http://www.sciviews.org/_rgui/projects/Editors.html
http://ess.r-project.org/

If you intend to write C routines to run with R then, of course, you
need to know C.
For certain R packages that interface with outside software (tcltk, Rgraphviz,
Ryacas, XML, etc.) you will need to know something about the interfaced-to
software if you intend to use those packages.

For package development you will need to know LaTeX and possibly subversion,
i.e. svn, the UNIX screen program, tar and various other UNIX commands.
Certain auxiliary programs that come with and are used with R are written
in perl although it's unlikely you will need to know it.



Re: [R] recommended combo of apps for new user?

2007-08-18 Thread Prof Brian Ripley
Some additional comments on the DBMS front.

(a) SPSS is not a DBMS, so it is not clear that you need this. But if you 
do and are storing valuable data in a DBMS a lot of further questions come 
into play, like how you are going to do backups.  I'd say PostgreSQL was 
really only for professional-level administrators.  My sysadmins recommend 
MySQL for most people.  We do also run PostgreSQL and they find it a lot 
trickier to maintain.

'dozens of columns and thousands of rows' is not big.  A data frame with 
50 columns and 5000 rows would only take 2MB to store, and R will easily
handle 100x with 4GB of RAM (and if you have less, get 4GB).  So storing 
data in .rda (R's save() format) is most likely viable.  R's indexing etc 
operations make it good at data manipulation, and using a DBMS will 
involve learning SQL, a non-trivial cost.
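
The size estimate is easy to check, and the save()/load() round trip is short. A sketch with random numbers standing in for real data; the object and file names are arbitrary:

```r
set.seed(1)
# 5000 rows x 50 numeric columns, 8 bytes per value.
big <- as.data.frame(matrix(rnorm(5000 * 50), nrow = 5000, ncol = 50))
size_bytes <- as.numeric(object.size(big))
size_bytes / 1e6                 # roughly 2 (megabytes)

rda_file <- file.path(tempdir(), "big.rda")
save(big, file = rda_file)       # store in R's own format
rm(big)
load(rda_file)                   # restores the object under its old name
dim(big)
```

Character or factor columns will change the arithmetic somewhat, but not the order of magnitude.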

(b) You have a choice of interfaces to a DBMS, RODBC and the DBI+ family, 
e.g. DBI+RMySQL and DBI+RSQLite.  I'm biased, but I find RODBC more 
intuitive, and many people have reported it to be faster.  If all you want 
is non-permanent storage for manipulation of large data sets, consider 
also SQLiteDF.

On Sat, 18 Aug 2007, Duncan Murdoch wrote:

> Martin Brown wrote:
>> [i sent this message earlier but apparently should have sent it plain
>> text, as follows..]
>>
>> Hi there,
>>
>> I would like some advice, not so much about how to use R, but about
>> software that I need to complement R.  I've rooted around in the FAQ's
>> and done a few searches on this mailing list but haven't quite found
>> the perspective I need.
>>
>> I am an experienced data analyst in my field (forest ecology and
>> ecological monitoring) but new to R. I am a long time user of SPSS and
>> have gotten pretty handy with it.  However, I am frustrated with SPSS
>> for several reasons:  There's the cost (I'm a freelancer; I pay for my
>> software myself);  the Windows dependence (I use Kubuntu as my usual
>> OS now, and switching back and forth is a pain); the horrible
>> inefficiency when I do certain types of file manipulations; and the
>> inability to do the kind of publication-quality graphs I want... I've
>> usually ended up using a commercial graphing program (another source
>> of expense and limitation).
>>
>> I'd like to switch to using R on Kubuntu, for all those reasons.  In
>> addition I think the mathematical formality that R encourages might be
>> good for me.
>>
>> However, reviewing the FAQ's on the R project web site makes me
>> realize that I've been using SPSS as three kinds of software really:
>> a DBMS; a statistical analysis package; and a graphing package.  It
>> looks like moving to R might involve learning three kinds of software,
>> not just one.  I wonder:
>>
>> 1) What open-source DBMS works most seamlessly with R?  I have seen
>> MySQL recommended but wonder if there are alternatives.  I sometimes
>> need to handle big data files.  In fact a lot of my work involves
>> exploratory and descriptive analyses of rather large and messy
>> databases from ecological monitoring, rather than statistical tests
>> per se.  In SPSS the data files I have been generating have dozens of
>> columns and thousands of rows, often with value and variable labels
>> helpful for documenting my work.

See above.

>
> I think you won't find much difference in the R interface between MySQL,
> PostgreSQL, or SQLite.  The choice should be made based on the qualities
> of the database (and I don't know enough about the differences to give a
> recommendation.)
>
>> 2) For the purpose of creating publication-quality graphs, do R users
>> typically need to go outside of the R system? If so, what open-source
>> programs would you all recommend?
>>
> R is great for this, but you might need to go outside for some
> specialized stuff (e.g. medical imaging).
>
>> 3) Any other software I need to learn that would make my work in R
>> more productive? (for example, a code editor).
>
> A lot of people are happy with ESS mode in Emacs.
>
> Duncan Murdoch
>
>

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



[R] Does anyone else think this might be worth a warning?!?

2007-08-18 Thread Matthew Walker
Hi,

I was *very* surprised by this little trick for new players: mean() only 
considers its first argument!

 > mean(1,1,2)
[1] 1
 > mean(2,1,1)
[1] 2


I found this behaviour very different from that of max():

 > max(1,1,2)
[1] 2
 > max(2,1,1)
[1] 2
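
A likely explanation, assuming the default method in base R, is the
difference between the two signatures. max(..., na.rm = FALSE) pools every
unnamed argument as data, whereas mean(x, ...) takes its data from x only
and passes the rest on to mean.default(x, trim = 0, na.rm = FALSE, ...).
Under that assumption mean(1,1,2) is matched as x = 1, trim = 1, na.rm = 2.
A small sketch of the matching:

```r
# max() treats all unnamed arguments as data
max(1, 1, 2)

# mean() averages x only; the extra positional arguments are
# matched to trim and na.rm of mean.default()
mean(1, 1, 2)
mean(x = 1, trim = 1, na.rm = 2)   # the same call, spelled out

# to average all three values, combine them into one vector first
mean(c(1, 1, 2))
```

The first two mean() calls should agree with each other, and only the last
one averages all three numbers.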



Perhaps this is the wrong list to ask, but does anyone else think this a 
little on the interesting side?  Is it not possible to detect a first 
argument of length one in the presence of other un-named arguments and 
at least produce a warning?
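
As a rough illustration of the check asked about here, one could wrap
mean() in user code. This is only a sketch: mean_checked() is a
hypothetical helper, not part of base R, and it assumes that a length-one
x together with unnamed extra arguments is always a mistake, which need
not be true.

```r
# mean_checked() is a hypothetical wrapper, not a base R function
mean_checked <- function(x, ...) {
  extras <- list(...)
  extra_names <- names(extras)
  # an argument counts as unnamed if it has no name or an empty name
  unnamed <- if (is.null(extra_names)) {
    rep(TRUE, length(extras))
  } else {
    extra_names == ""
  }
  if (length(x) == 1L && any(unnamed)) {
    warning("x has length one and unnamed extra arguments were supplied; ",
            "did you mean mean(c(...))?")
  }
  mean(x, ...)
}

mean_checked(c(1, 1, 2))       # no warning
mean_checked(5, na.rm = TRUE)  # named extra argument, no warning
```

Calling mean_checked(1, 1, 2) should raise the warning and still return
whatever mean(1, 1, 2) returns, so existing results are not changed.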


Cheers,


Matthew
