Re: [R] factor analysis (pca): how to get the 'communalities'?

2003-01-03 Thread ripley
On Fri, 3 Jan 2003, Wolfgang Lindner wrote:

 I try some test data for a factorAnalysis (resp. pca) in the sense of Prof.

Well, factor analysis and pca are different things, and only one
is appropriate in a given problem.

 Ripley's MASS § 11.1, p. 330 ff.,

Eh?  Would that be *Venables & Ripley's* MASS, and if so which edition (it
is not the current one).  Those editions which cover factor analysis do
explain the difference.

just to prepare myself for an analysis of my
 own empirical data using R (instead of SPSS).

 1. the data.

 ## The test data is (from the book of Backhaus et al.: Multivariate ##
 Analysemethoden. Springer 2000 [9th ed.], p. 300 ff):

 a <- c(4.5,5.167,5.059,3.8,3.444,3.5,5.25,5.857,5.083,5.273,4.5)
 b <- c(4.0,4.25,3.824,5.4,5.056,3.5,3.417,4.429,4.083,3.6,4.0)
 c <- c(4.375,3.833,4.765,3.8,3.778,3.875,4.583,4.929,4.667,3.909,4.2)
 d <- c(3.875,3.833,3.438,2.4,3.765,4.0,3.917,3.857,4.0,4.091,3.9)
 e <- c(3.25,2.167,4.235,5.0,3.944,4.625,4.333,4.071,4.0,4.091,3.7)
 f <- c(3.75,3.75,4.471,5.0,5.389,5.250,4.417,5.071,4.25,4.091,3.9)
 g <- c(4.0,3.273,3.765,5.0,5.056,5.5,4.667,2.929,3.818,4.545,3.6)
 h <- c(2.0,1.857,1.923,4.0,5.615,6.0,3.25,2.091,1.545,1.6,1.5)
 i <- c(4.625,3.75,3.529,4.0,4.222,4.75,4.5,4.571,3.75,3.909,3.5)
 j <- c(4.125,3.417,3.529,4.6,5.278,5.375,3.583,3.786,4.167,3.818,3.7)

 m <- data.frame(a,b,c,d,e,f,g,h,i,j)

 2. My try of a pca with R.

 ## My R input was:

 m
 cor(m)
 library(mva)
 m.pca <- princomp(m,cor=T)
 m.pca
 summary(m.pca)
 loadings(m.pca)
 m.pca$scores
 m.FA <- factanal(factors = 3, covmat=cov(m))
 m.FA

 3. Here are my questions.

 Q1.
 The cor(m)-Matrix is the same as reported by using SPSS (or OpenStats2).
 But in R I get other eigenvalues compared with the following SPSS output:

You don't get eigenvalues at all in R.  You do get `Proportion of
Variance' which are these numbers divided by their total.
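
For readers comparing the two programs, here is a minimal sketch of how the eigenvalues relate to what princomp() stores. It assumes the built-in USArrests data, not the poster's data frame m. The squared component standard deviations are the eigenvalues of the correlation matrix, and dividing them by their total gives the `Proportion of Variance' line.

```r
# Sketch on a built-in dataset (USArrests is an assumption, not the poster's data)
pc <- princomp(USArrests, cor = TRUE)

# Eigenvalues of the correlation matrix = squared component standard deviations
ev <- unname(pc$sdev^2)
ev_direct <- eigen(cor(USArrests))$values

# Proportion of variance = eigenvalue / total (the total is the trace, here 4)
prop_var <- ev / sum(ev)

print(round(cbind(eigenvalue = ev, proportion = prop_var), 3))
```

The variable names ev, ev_direct and prop_var are made up for this illustration.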

 Original matrix trace =  10,00
 Roots (Eigenvalues) Extracted:
1  5,052
2  1,771
3  1,427
4  0,819
5  0,430
6  0,247
7  0,159
8  0,062
9  0,029
   10  0,003

 - What is going behind the scene?

Why don't you ask the SPSS people that?  R at least gives you sensible
labels on the output.

 - Or what I am doing wrong in my use of R?
 - If I am doing the pca correct, can I use the R results as equally aceptable
   without further discussion?

No, as more acceptable: at least they have meaningful labels.

   Maybe a different 'hidden' algorithm is the reason for different results?

Ask SPSS that.  R's code is open, and nothing is hidden.  You have not
demonstrated that the results are different, anyway!

 Q2. How to get the so called 'Communality Estimates' with R?

First, use the data as in

 (m.FA <- factanal(m, factors=3))

and where did the number of factors come from?

100*(1 - m.FA$uniquenesses) gives the communalities.  They are different
from SPSS, because (1) R uses maximum likelihood FA and (2) tries a lot
harder to find a maximum and there are many local maxima in most FA
problems.

In this case you have fitted too many factors, and just one suffices.
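
A minimal sketch of the communality calculation, assuming the ability.cov example data that ships with R instead of the poster's data (which has very few observations for three factors). The one-factor choice below is only for illustration.

```r
# Sketch using the ability.cov example data shipped with R (an assumption;
# the poster's data frame m is not used here)
fa <- factanal(factors = 1, covmat = ability.cov)

# Communality = 1 - uniqueness, reported here as a percentage
communality_pct <- 100 * (1 - fa$uniquenesses)
print(round(communality_pct, 1))

# Cross-check from the loadings: row sums of squared loadings.
# This should agree with the line above up to convergence error.
from_loadings <- 100 * rowSums(unclass(loadings(fa))^2)
print(round(from_loadings, 1))
```

Because the fit is by maximum likelihood, these figures need not match the output of a principal-components style extraction in another package.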

 Here the values reported by SPSS for the above test data.frame m:
 Communality Estimates as percentages:
   1 88,619
   2 76,855
   3 89,167
   4 85,324
   5 76,043
   6 84,012
   7 80,223
   8 92,668
   9 63,297
  10 88,786

 Any help, suggestions or hints are very welcome.

1) Be a lot more accurate.
2) Read the help pages to find out what the output means.  In the case of
R the information is there, but you may well have to post on an SPSS help
list to find out why SPSS gives different output from R.
3) Don't believe SPSS knows what it is doing.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
[EMAIL PROTECTED] mailing list
http://www.stat.math.ethz.ch/mailman/listinfo/r-help



[R] R talking to Oracle, ODBC drivers available ?

2003-01-03 Thread Luke Whitaker

[sorry, but this is a re-post - I forgot to set the subject line 
 the first time around]

Hello,

I would like to access an Oracle database running on Solaris from
R on my linux desktop. I have had a look at the R Data Import/Export
manual, and downloaded RODBC and unixODBC, but I am still quite confused
about how to proceed. It appears to me that I still need to get an Oracle
ODBC driver, and the only one I can find is from Easysoft for 880 euros,
which is not available for this project.

Is there any free software which will enable R on linux to talk to Oracle
on Solaris (both reading and writing data) ? 

My R setup is as follows:

platform i386-pc-linux-gnu
arch     i386
os       linux-gnu
system   i386, linux-gnu
status
major    1
minor    5.1
year     2002
month    06
day      17
language R

Thanks,

Luke Whitaker.




[R] Interfacing R and C++ under Windows

2003-01-03 Thread Pijus Virketis
Dear all, 

My colleague, who has been helping me with wrapping some older C++ code for use in R, 
has been running into some issues, which he asked me to post here:

- ERROR is defined in RS.h which is included in Rdefines.h which conflicts with Visual 
Studio's ERROR
- TRACE is defined in Rinternals.h which conflicts with Visual Studio's TRACE
- math.h is included within extern C linkage in R.h, however Visual Studio's math.h 
includes templates which are only valid in C++

Basically, he feels that there are some clear conflicts between what R expects of C++ 
and what Visual C++ does. So, while the simpler .C technique can be used, the .Call 
approach does not work. What are we missing here? 


Thank you!

Pijus




[R] RE: stange behavior of subset [] (was: lowess + turnpoints = doubling integers?)

2003-01-03 Thread Philippe Grosjean
Tom Blackwell wrote:
...
I summarized this to myself as computed subscripts need explicit
rounding in R, but not in S.  Here's the sample code which gave
me different results with R than with Splus.  I no longer have
Splus available, so I can't check it again.

look <- (10 * seq(14)) - 76
chk.1 <- seq(1420)[ 10 * (73.1 + look) ] #  NOT what I expected
chk.2 <- seq(1420)[ round(10 * (73.1 + look), 0) ]   #  much better.
...

Hum, hum,... It reminds me of a bug that was very difficult to track down in a
function! This is a very interesting point. Actually, it works exactly the
same in Splus 6.1 for Windows and in R 1.6.1. Look also at the following:

 look <- (10 * seq(14)) - 76
 10 * (73.1 + look)
 [1]   71  171  271  371  491  586  681  791  886  981 1101 1201 1301 1401
 as.integer(10 * (73.1 + look))
 [1]   70  170  270  370  490  586  681  791  886  981 1101 1201 1301 1401

as.integer() does not round doubles, but truncates them toward zero... and
[] coerces doubles to integers using as.integer(). This is not a bug, since
it is fully documented in Splus:

In help("["):

...
The expressions may also be logical, numeric, or character. Numeric
subscripts should be integers, such as the output from : (the sequence
operator).

...
Subscripting coerces non-integer numeric subscripts to integers using
as.integer . Because as.integer creates integers by truncating the numeric
representation, this coercion can lead to unexpected results.

and in help(as.integer):
...
The numbers are truncated (moved to the closest integer the original number
that is towards zero). Attributes are deleted.

However, this could be vicious! Why not round doubles in as.integer()?
Perhaps for performance reasons?
Yet, this is not documented at all in R (neither in help("["), nor in
help(as.integer)!!!). Any other comments on this?
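
A small sketch of the truncation effect, using a made-up value instead of the lowess example: (0.1 + 0.7) * 10 is mathematically 8 but is stored as a double slightly below 8, so truncation and rounding disagree.

```r
x <- (0.1 + 0.7) * 10         # mathematically 8, stored slightly below 8

trunc_val <- as.integer(x)    # truncates toward zero
round_val <- round(x)         # rounds to the nearest integer

by_trunc <- letters[x]        # the subscript is truncated, as with as.integer()
by_round <- letters[round(x)] # explicit rounding before subscripting

print(c(trunc_val, round_val))
print(c(by_trunc, by_round))
```

Rounding a computed subscript explicitly, as in chk.2 above, avoids depending on how the double happens to be represented.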

Best,

Philippe Grosjean

...](({°...°}))...
 ) ) ) ) )
( ( ( ( (   Dr. Philippe Grosjean
 ) ) ) ) )
( ( ( ( (   LOV, UMR 7093
 ) ) ) ) )  Station Zoologique
( ( ( ( (   Observatoire Océanologique
 ) ) ) ) )  BP 28
( ( ( ( (   06234 Villefranche sur mer cedex
 ) ) ) ) )  France
( ( ( ( (
 ) ) ) ) )  tel: +33.4.93.76.38.16, fax: +33.4.93.76.38.34
( ( ( ( (
 ) ) ) ) )  e-mail: [EMAIL PROTECTED]
( ( ( ( (   SciViews project coordinator (http://www.sciviews.org)
 ) ) ) ) )
...




Re: [R] Interfacing R and C++ under Windows

2003-01-03 Thread ripley
On Fri, 3 Jan 2003, Pijus Virketis wrote:

 My colleague, who has been helping me with wrapping some older C++ code for use in 
R, has been running into some issues, which he asked me to post here:

 - ERROR is defined in RS.h which is included in Rdefines.h which conflicts with 
Visual Studio's ERROR
 - TRACE is defined in Rinternals.h which conflicts with Visual Studio's TRACE
 - math.h is included within extern C linkage in R.h, however Visual Studio's 
math.h includes templates which are only valid in C++

 Basically, he feels that there are some clear conflicts between what
R expects of C++ and what Visual C++ does. So, while the simpler .C
technique can be used, the .Call approach does not work. What are we missing
here?

A real C++ compiler (and enough netiquette to wrap your lines).

Why are you using VC++ with R: it is most definitely not the recommended C
compiler?  Using standard C++ and the recommended compiler usually works,
modulo undefining symbols defined in windows.h or its dependencies.


-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595




[R] help.start() directory?

2003-01-03 Thread Michael A. Miller
The help for help.start says that 

  All the packages in the known library trees are linked to
  directory `.R' in the per-session temporary directory. The
  links are re-made each time help.start is run, which should be
  done after packages are installed, updated or removed.

It used to be the case that the temporary directory was
the .R directory in my home directory, but since I've upgraded to
R 1.6.1, the help files are stored in /tmp/Rtmpxxx/.R, where
Rtmpxxx is the result of a call to tempdir.  A side effect of
this is that they are removed whenever I exit R and recreated
when I start R and run help.start again.  I like to be able to
have a couple book marks to my favorite parts of the help pages,
but since the tempdir is different for each instance of R, that
no longer works.  Is there a way to configure R to leave the help
pages in a static location such as ~/.R?  Even if they get
recreated each time I run help.start, I'd rather they always
showed up in the same place.

Thanks, Mike


 R.version
 _
platform i386-pc-linux-gnu
arch     i386
os       linux-gnu
system   i386, linux-gnu
status
major    1
minor    6.1
year     2002
month    11
day      01
language R




[R] number plot symbol in scatterplot?

2003-01-03 Thread Simpson, William
If I make a scatterplot and several (e.g. 5) points lie on top of each other
at a given x,y location I would like the plot symbol to be the number of
superimposed points (e.g. 5). Could someone please tell me how to do this
in R? Thanks!

Bill Simpson




Re: [R] number plot symbol in scatterplot?

2003-01-03 Thread Martin Maechler
 BillS == Simpson, William [EMAIL PROTECTED]
 on Fri, 3 Jan 2003 09:41:32 -0500 writes:

BillS If I make a scatterplot and several (e.g. 5) points
BillS lie on top of each other at a given x,y location I
BillS would like the plot symbol to be the number of
BillS superimposed points (e.g. 5). Could someone please
BillS tell me how to do this in R? Thanks!

(typical Martin's answer:  ``You don't want what you are asking for'').

No, seriously, I think using sunflowerplot() {as recommended by
Chambers et al (1983), see help} is a bit better {-> ?sunflowerplot and examples}

Other solutions to the same problem:
 b. use jittering: jitter()

 c. use a ``2d density estimate'' and plot that
  1) e.g. use image on a 2d histogram
  2) better, use the hexbin package from bioconductor;
     this uses Dan Carr's hexagons instead of squares (and some
     more ideas).
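
A minimal sketch of the first two suggestions, on made-up data with deliberately coincident points (the vectors x and y are assumptions for illustration only).

```r
set.seed(1)                          # jitter() is random; fix the seed
x <- c(1, 1, 1, 2, 2, 3)             # made-up data with overlapping points
y <- c(1, 1, 1, 2, 2, 3)

# sunflowerplot() draws one "petal" per coincident observation
sunflowerplot(x, y, main = "sunflowerplot()")

# Alternatively, add a little noise so coincident points separate
jx <- jitter(x)
jy <- jitter(y)
plot(jx, jy, main = "jitter()")
```

The hexbin route is not shown because it needs an add-on package.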

Regards,
Martin Maechler [EMAIL PROTECTED]http://stat.ethz.ch/~maechler/
Seminar fuer Statistik, ETH-Zentrum  LEO C16Leonhardstr. 27
ETH (Federal Inst. Technology)  8092 Zurich SWITZERLAND
phone: x-41-1-632-3408  fax: ...-1228   




Re: [R] number plot symbol in scatterplot?

2003-01-03 Thread Thomas W Blackwell
Bill  -

The behavior of the old-S function  printer()  was to count overstrikes
in the way you describe.  This was a non-graphics output device which
would make a crude scatterplot using ascii characters (spaces and
asterisks, for example) in response to  plot() commands.

I've looked for printer in  help(Devices)  and I don't find it.
Non-graphics scatterplots probably haven't been implemented for R
yet, and probably aren't a high priority.  I think you'll have to
count the overstrikes at each location yourself (hash the x-y
coordinates), and then use the function  text()  to plot the
appropriate symbol at each x-y location.
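
One way to sketch this, assuming made-up data: xyTable() (the helper that sunflowerplot() itself uses) returns the unique locations together with a count for each, and text() can then draw the count as the plot symbol.

```r
x <- c(1, 1, 1, 2, 2, 3)   # made-up data with overlapping points
y <- c(1, 1, 1, 2, 2, 3)

tab <- xyTable(x, y)       # unique (x, y) locations plus a count for each
counts <- as.integer(tab$number)

# Set up empty axes, then draw the count at each location
plot(tab$x, tab$y, type = "n", xlab = "x", ylab = "y")
text(tab$x, tab$y, labels = counts)
```

Note that xyTable() rounds the coordinates to a limited number of significant digits before comparing them, so nearly coincident points are counted together.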

-  tom blackwell  -  university of michigan medical  -  ann arbor  -






Re: [R] as.POSIXct problem?

2003-01-03 Thread Don MacQueen
No problem on a couple of systems here, one Solaris and one Mac OS X. 
See below.

The conversion of a character string to a POSIXct is taking place in 
two steps--character string to POSIXlt, then POSIXlt to POSIXct. 
Which step has the problem?

Compare your unclass(x) with my unclass(x). If it's different, the 
problem would appear to be in converting text to POSIXlt.

My guess would be a bug in the underlying Linux code, since, as Dr. 
Ripley said, your system thinks it's an invalid time--yet the time is 
not invalid.

Does it fail only on that particular day? If there was a EDT to EST 
change that day, does it fail on other EDT to EST change days? If 
there was an EDT to EST change that day, did it occur at the usual 
2:00 AM? What about EST to EDT changes?

If your character strings were in the ISO standard format, it would 
be simpler to use as.POSIXct() directly, as in

 as.POSIXct(c('1969-10-10','2002-12-31'))

[1] "1969-10-10 PDT" "2002-12-31 PST"

 class(as.POSIXct(c('1969-10-10','2002-12-31')))

[1] "POSIXt"  "POSIXct"

But you probably don't have that luxury. Even so, it would be 
interesting to find out if it succeeds on your system.
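
A minimal sketch comparing the direct and the two-step conversion. The tz = "UTC" argument is an assumption added here so that the result does not depend on the local time zone or its daylight-saving rules; the posts above use the local zone.

```r
# Direct conversion of ISO-format strings
iso <- c("1969-10-10", "2002-12-31")
ct_direct <- as.POSIXct(iso, tz = "UTC")

# Two-step route for non-ISO strings: character -> POSIXlt -> POSIXct
lt <- strptime(c("10/10/1969", "12/31/2002"), format = "%m/%d/%Y", tz = "UTC")
ct_two_step <- as.POSIXct(lt)

print(class(ct_direct))
print(as.numeric(ct_direct))   # seconds since the epoch; negative before 1970
```

If the two routes disagree on a given system, that narrows the problem down to one of the two steps.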

-Don

 version

 _  
platform sparc-sun-solaris2.7
arch     sparc
os       solaris2.7
system   sparc, solaris2.7
status
major    1
minor    6.1
year     2002
month    11
day      01
language R
 x <- strptime(c('10/10/1969','12/31/2002'),format='%m/%d/%Y')
 x

[1] "1969-10-10" "2002-12-31"

 as.POSIXct(x)

[1] "1969-10-10 PDT" "2002-12-31 PST"

 class(x)

[1] "POSIXt"  "POSIXlt"

 unclass(x)

$sec
[1] 0 0

$min
[1] 0 0

$hour
[1] 0 0

$mday
[1] 10 31

$mon
[1]  9 11

$year
[1]  69 102

$wday
[1] 5 2

$yday
[1] 282 364

$isdst
[1] 1 0





 OS X --

 version

 _ 
platform powerpc-apple-darwin6.2
arch     powerpc
os       darwin6.2
system   powerpc, darwin6.2
status
major    1
minor    6.1
year     2002
month    11
day      01
language R

 x <- strptime(c('10/10/1969','12/31/2002'),format='%m/%d/%Y')
 x

[1] "1969-10-10" "2002-12-31"

 as.POSIXct(x)

[1] "1969-10-10 PDT" "2002-12-31 PST"

 unclass(x)

$sec
[1] 0 0

$min
[1] 0 0

$hour
[1] 0 0

$mday
[1] 10 31

$mon
[1]  9 11

$year
[1]  69 102

$wday
[1] 5 2

$yday
[1] 282 364

$isdst
[1] 1 0


At 8:56 PM -0500 1/2/03, Frank E Harrell Jr wrote:

Under

platform i686-pc-linux-gnu
arch     i686
os       linux-gnu
system   i686, linux-gnu
status
major    1
minor    6.1
year     2002
month    11
day      01
language R

 x <- strptime(c('10/10/1969','12/31/2002'),format='%m/%d/%Y')
 x

[1] "1969-10-10" "2002-12-31"

 as.POSIXct(x)

[1] NA               "2002-12-31 EST"

Why the NA?  If this is not the preferred way to convert a character 
string to POSIXct what is?  On a more minor note why the EST if no 
time is printed?

Thanks,

Frank
--
Frank E Harrell Jr  Prof. of Biostatistics & Statistics
Div. of Biostatistics & Epidem. Dept. of Health Evaluation Sciences
U. Virginia School of Medicine  http://hesweb1.med.virginia.edu/biostat

__
[EMAIL PROTECTED] mailing list
http://www.stat.math.ethz.ch/mailman/listinfo/r-help


--
--
Don MacQueen
Environmental Protection Department
Lawrence Livermore National Laboratory
Livermore, CA, USA
--




Re: [R] R talking to Oracle, ODBC drivers available ?

2003-01-03 Thread Don MacQueen
I believe that Oracle provides ODBC drivers for the server 
side--provided you have an Oracle administrator who can figure out 
how to install them. I couldn't find much documentation.

For another commercial source see www.openlink.com (don't know if 
they have client drivers for Linux).

-Don


--
--
Don MacQueen
Environmental Protection Department
Lawrence Livermore National Laboratory
Livermore, CA, USA
--




[R] Tutorials?

2003-01-03 Thread Joshua Gramlich
Are there any tutorials or books for learning R?  Of course, I have the
manual, but that seems more of a reference than a teaching tool.

Joshua Gramlich
Chicago, IL




Re: [R] Tutorials?

2003-01-03 Thread Marc R. Feldesman
An Introduction to R, The R Core Team
An Introduction to Statistics with R, Peter Dalgaard
Modern Applied Statistics with S, W.N. Venables and B.D. Ripley, 4th Edition.

And many others with links on http://cran.r-project.org/ under 
Contributed|Documentation



Re: [R] help.start() directory?

2003-01-03 Thread ripley
On Fri, 3 Jan 2003 [EMAIL PROTECTED] wrote:

 No, but as these are links you can bookmark the places they link to.

Oh, and the reason we now do it this way is that you can have two
simultaneous R sessions with quite different library trees, possibly
different versions of R running on different machines.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595




[R] Re: Embedding windows in a text widget

2003-01-03 Thread John Zhang
Could someone tell me how to embed windows in a text box using tkcreate 
command of R tcltk package? I tried the following and was not successful;

base <- tktoplevel()
text <- tktext(base, width = 30, height = 10)
tkpack(text)
button <- tkbutton(text, text = "try")
tkcreate(text, "window", "end", window = button)

Thanks.

JZ




RE: Take care with codes()! (was [R] type of representation)

2003-01-03 Thread Warnes, Gregory R
Ahh yes, sorry about that.

Here's the corrected snippet:

# Create an Example Data Frame Containing Car x Color data
carnames <- c("bmw","renault","mercedes","seat")
carcolors <- c("red","white","silver","green")
datavals <- round(rnorm(16, mean=10, sd=4),1)
# explicit factor(): data.frame() no longer converts strings in R >= 4.0.0
data <- data.frame(Car=factor(rep(carnames,4)),
   Color=factor(rep(carcolors, c(4,4,4,4) )),
   Value=datavals )
# show the data
data

# plot the Car x Color combinations, using 'cex' to specify the dot size
plot(x=as.numeric(data$Car), # as.numeric gives the internal factor codes
 y=as.numeric(data$Color), 
 cex=data$Value/max(data$Value)*12,  # standardize size to (0,12)
 pch=19,  # filled circle
 col="skyblue", # dot color
 xlab="Car", # x axis label
 ylab="Color", # y axis label
 xaxt="n", # no x axis labels
 yaxt="n", # no y axis labels
 bty="n",  # no box around the plot
 xlim=c(0,nlevels(data$Car  )+0.5), # extra space on either end of plot
 ylim=c(0.5,nlevels(data$Color)+1.5)  # so dots don't cross into margins
 )

# add text labels (car names above the top row, color names at the left)
text(x=1:nlevels(data$Car), y=nlevels(data$Color)+1, labels=levels(data$Car))
text(x=0, y=1:nlevels(data$Color), labels=levels(data$Color) )

# add borders between cells
abline(v=(0:nlevels(data$Car)+0.5))
abline(h=(0:nlevels(data$Color)+0.5))

# annotate with actual values
text(x=as.numeric(data$Car), # as.numeric gives the internal factor codes
 y=as.numeric(data$Color), 
 labels=format(data$Value),   # label value
 col="black" # text color
 )

# put a nice title
title(main="Car by Color Popularity\n(Dot size proportional to popularity)")
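
To see what as.numeric() returns for a factor, here is a small sketch on a made-up factor whose levels are deliberately not in alphabetical order. The codes are positions in levels(f), so they do not depend on sorting.

```r
# Levels deliberately not in alphabetical order (made-up example)
f <- factor(c("bmw", "seat", "renault", "seat"),
            levels = c("seat", "renault", "bmw"))

codes_int <- as.integer(f)        # internal codes: position in levels(f)
print(codes_int)

labels_back <- levels(f)[codes_int]   # maps the codes back to the labels
print(labels_back)
```

This is why the tick labels in the plot are taken from levels() in the same order as the codes.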


-Greg

 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
 Sent: Friday, January 03, 2003 1:53 PM
 To: Warnes, Gregory R
 Cc: '[EMAIL PROTECTED]'; '[EMAIL PROTECTED]'
 Subject: RE: Take care with codes()! (was [R] type of representation)
 
 
 From the help page of codes():
 
  Normally `codes' is not the appropriate function to use with an
  unordered factor.  Use `unclass' or `as.numeric' to extract the
  codes used in the internal representation of the factor, as these
  do not assume that the codes are sorted.
 
 and this is one of the `normally' cases.  Your code will only work
 correctly if the levels are in alphabetical order (in the 
 locale in use).
 
 On Fri, 3 Jan 2003, Warnes, Gregory R wrote:
 
  How about this snippet:
 
  # Create an Example Data Frame Containing Car x Color data
  carnames - c(bmw,renault,mercedes,seat)
  carcolors - c(red,white,silver,green)
  datavals - round(rnorm(16, mean=10, sd=4),1)
  data - data.frame(Car=rep(carnames,4),
 Color=rep(carcolors, c(4,4,4,4) ),
 Value=datavals )
  # show the data
  data
 
  # plot the Car x Color combinations, using 'cex' to specify 
 the dot size
  plot(x=codes(data$Car), # codes give numeric values
   y=codes(data$Color),
   cex=data$Value/max(data$Value)*12,  # standardize size 
 to (0,12)
   pch=19,  # filled circle
   col=skyblue, # dot color
   xlab=Car, # x axis label
   ylab=Color, # y axis label
   xaxt=n, # no x axis lables
   yaxt=n, # no y axis lables
   bty=n,  # no box around the plot
   xlim=c(0,nlevels(data$Car  )+0.5), # extra space on 
 either end of plot
   ylim=c(0.5,nlevels(data$Color)+1.5)  # so dots don't 
 cross into margins
   )
 
  # add text labels
  text(x=1:nlevels(data$Car), y=nlevels(data$Car)+1, 
 labels=levels(data$Car))
  text(x=0, y=1:nlevels(data$Color), labels=levels(data$Color) )
 
  # add borders between cells
  abline(v=(0:nlevels(data$Car)+0.5))
  abline(h=(0:nlevels(data$Color)+0.5))
 
  # annotate with actual values
  text(x=codes(data$Car), # codes give numeric values
   y=codes(data$Color),
   labels=format(data$Value),   # label value
   col=black, # textt color
   )
 
  # put a nice title
  title(main=Car by Color Popularity\n(Dot size proportional 
 to popularity))
 
 
  -Greg
 
   -Original Message-
   From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
   Sent: Friday, January 03, 2003 4:46 AM
   To: [EMAIL PROTECTED]
   Cc: [EMAIL PROTECTED]
   Subject: [R] type of representation
  
  
   Hi
  
   I have some data that i want to plot but i don't find how to
   do it. I have car
   types (bmw,renault,mercedes,seat ...), colors and a number
   for each car
   type-color relation.I want to come up with a matrix
   representation of cars vs
   colors where in each intersection i could set a dot
   proportional in size to my
   third variable.
  
  
   Can anybody give me a clue of hoe to come up with such 
 representation.
  
   Thanks
  
   Ramon
  
   __
   [EMAIL PROTECTED] mailing list
   http://www.stat.math.ethz.ch/mailman/listinfo/r-help
  
 
 
  LEGAL NOTICE\ Unless expressly stated otherwise, this 
 message is ... [[dropped]]
 
  __
  [EMAIL PROTECTED] mailing list
  

Re: [R] factor analysis (pca): how to get the 'communalities'?

2003-01-03 Thread Wolfgang Lindner
Scot,

thank you very much for your wonderfully clear and short fix of my first problem: 
seeing your solution as a one-liner in the impressively insightful syntax of R is 
really an aesthetic experience for me:

|  I ran your example and found that you can get the eigenvalues SPSS by [..]
|m.pca$sdev^2
|  So squaring the standard deviations (sdev) of the components gives you the
|  eigenvalues SPSS reports.

I am a little sorry not to have seen it myself ;-) - but I think that's 
life when becoming a friend of R and making the first steps with pca, fa, ca & co. 
R is indeed a first choice tool for doing understandable statistics, and Prof 
Ripley's pointer to R's open code points definitely in the same direction for 
me. Now the two worlds become reconciled and the fog gets thinner for me.
Thank you both.

Wolfgang
--
Wolfgang Lindner   [EMAIL PROTECTED]
   Gerhard-Mercator-Universitaet Duisburg  Tel: +49 0203 379-1326




Re: [R] Tutorials?

2003-01-03 Thread Wolfgang Lindner
Joshua,

it is always risky to give recommendations, but I have also learned much from

[1] Myatt: Open Source Solutions - R. 
(a free brief introduction to using the R environment in Rich Text (.RTF) and 
Acrobat (PDF) formats and including sample data written by Mark Myatt 
[EMAIL PROTECTED], available at
http://www.myatt.demon.co.uk/Rex1031.zip )

[2] P. Dalgaard: Introductory Statistics with R. Springer 2002. 
ISBN 0-387-95475-9

[3] J. Fox: An R and S-plus Companion to Applied Regression. Sage 2002.
ISBN 0-7619-2280-6

Take a look and happy R'ing

Wolfgang

--
Wolfgang Lindner   [EMAIL PROTECTED]
   Gerhard-Mercator-Universitaet Duisburg  Tel: +49 0203 379-1326




Re: [R] factor analysis (pca): how to get the 'communalities'?

2003-01-03 Thread Brett Magill
If interested, on my web site I have code to do factor analysis by PC.  It does
exactly as below, but with a nice wrapper for print methods, rotations, sorting, and
other conveniences.

  home.earthlink.net/~bmagill/MyMisc.html 

The relevant code snippets are prinfact, plot.pfa, and print.pfa, along
with the other required functions as indicated on the web site.






Re: [R] as.POSIXct problem?

2003-01-03 Thread Peter Dalgaard BSA
[EMAIL PROTECTED] writes:

 Can you supply us with details?  For the ISO C99 standard actually says
 
 The mktime function returns the specified calendar time encoded as a value
 of type time_t. If the calendar time cannot be represented, the function
 returns the value (time_t)-1.
 
 and that is the behaviour that R expects.  Note that POSIX specifies what
 time_t is, but ISO C does not, so I am at a loss as to how this can be
 `more compliant with the ISO C standard'.

Just do a Google groups search for mktime glibc and the whole mess
turns up, including some pretty irate postgresql developers...

The spec in question appears to be

http://www.opengroup.org/onlinepubs/007904975/basedefs/xbd_chap04.html#tag_04_14 

although that doesn't actually mention mktime(), it just talks about
the definition of Seconds Since the Epoch. The actual definition is
in

http://www.opengroup.org/onlinepubs/007904975/functions/mktime.html

One particular piece of silliness with mktime() is that it uses a
return value of -1 to signal error, leaving you with a problem for
1969-12-31 23:59:59 UTC if you allow extension to times before the
Epoch.
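
The ambiguity can be illustrated from R, as a sketch only: a count of -1 seconds since the Epoch is itself a perfectly valid time, which is exactly the value mktime() also uses to report an error. The variable names are made up, and tz = "UTC" is set explicitly so the result does not depend on the local zone.

```r
# -1 seconds since 1970-01-01 00:00:00 UTC
t_minus1 <- as.POSIXct(-1, origin = "1970-01-01", tz = "UTC")

stamp <- format(t_minus1, "%Y-%m-%d %H:%M:%S")
print(stamp)
```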

-- 
   O__   Peter Dalgaard Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics 2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark  Ph: (+45) 35327918
~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907




Re: [R] as.POSIXct problem?

2003-01-03 Thread Suchandra Thapa
On Fri, 2003-01-03 at 13:52, [EMAIL PROTECTED] wrote:
Can you supply us with details?  For the ISO C99 standard actually says

The mktime function returns the specified calendar time encoded as a value
of type time_t. If the calendar time cannot be represented, the function
returns the value (time_t)-1.

and that is the behaviour that R expects.  Note that POSIX specifies what
time_t is, but ISO C does not, so I am at a loss as to how this can be
`more compliant with the ISO C standard'.

There was a discussion of the problem and possible workarounds on the
postgresql-hackers list.  If you do a search for glibc and mktime on
the  postgresql developer's website or use the following link
http://archives.postgresql.org/search.php?ps=10q=glibc+mktimeps=10wm=wrdo=0ul=%2Fpgsql-hackers%2Fm=allwf=11cat=
then you should be able to read the discussion.  The short of it seems
to have been to document the problem and try to fix it in the next
release.

Also there is a related bug report in redhat's bug database
(bugzilla.redhat.com) as bug 65227.  

The relevant section in the IEEE standard is
http://www.opengroup.org/onlinepubs/007904975/basedefs/xbd_chap04.html#tag_04_14
Basically, it defines the seconds since the epoch as being undefined
before 1970.  Since mktime returns the calendar time, the glibc
maintainers seem to have decided to change the return value for dates
earlier than 1970.

-- 
--

Suchandra S. Thapa 
[EMAIL PROTECTED]

--
