Re: [R] small sample size confidence interval by bootstrap

2006-04-01 Thread Prof Brian Ripley
On Fri, 31 Mar 2006, Urania Sun wrote:

 I only have 4 samples. I wish to get a confidence interval around the mean.
 Is it reasonable? If not, is there a way to compute a confidence interval
 for such small sample size's mean?

(BTW, the CI is for the population mean, not the sample mean.  I'll also 
assume that you are prepared to assume that you have a single random 
sample of size 4 from a location family.)

For a confidence interval, you need to make some assumptions about the 
distribution.  If you assume normality, you can use t.test, but the 
estimate of the standard deviation (on just 3 df) will be very variable 
and this will be reflected in the length of the CI.
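
Purely for illustration (the four values below are made up), the normal-theory
interval is one line:

  x <- c(2.1, 3.4, 2.8, 5.0)   # hypothetical sample of size 4
  t.test(x)$conf.int           # 95% CI for the population mean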

Your subject line mentions the bootstrap.  You could use one of several 
different types of bootstrap CI but they also make assumptions, weaker 
assumptions that lead to even more variability.  For a sample of size 4 
there are (at most) 36 distinct means of bootstrap resamples, so none of 
the methods I know of will work adequately (and most not at all).
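
Again only as a sketch (same made-up values as above, and assuming the 'boot'
package is installed), a percentile bootstrap interval would be:

  library(boot)
  b <- boot(c(2.1, 3.4, 2.8, 5.0), function(d, i) mean(d[i]), R = 2000)
  boot.ci(b, type = "perc")    # expect this to be very unstable for n = 4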

As an example to ponder, the Cauchy distribution does not even have a 
mean, but from small samples you will have no idea that it is very 
long-tailed.  And getting a CI for a location parameter is often better 
done from a robust estimator of location than from the sample mean.
Alternatively, your true distribution might be a discrete distribution on 
5 points, and you have no idea at all about the 5th value.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] [S] S-PLUS 8 beta program [repost]

2006-04-01 Thread A.J. Rossini
On 3/31/06, eugene dalt [EMAIL PROTECTED] wrote:
 In addition, I think people should always explore ways
 to repackage exclusive S-Plus facilities and libraries
 into R, by rewriting them or using GPL loopholes.

Licenses are important.  They state precisely what the author intends.
Whether that is really what the author might want is a different story.

The GPL has no loopholes.  It does precisely what it is supposed to
do, which is to provide free-as-in-speech software, which maintains
that freedom.  Read it; love it; use it :-).  I do.

Take BioConductor, which isn't GPL (it would've been by my choice,
which is what I argued for when we released it the first time).  The
intent was for others to use it, and not be forced to share the
changes.  I went along with that, fully aware of what I was going to
be contributing to.

 It seems to me that Insightful is very good at
 protecting whatever they create and the same time
 feels very comfortable taking R stuff to keep their
 clients happy. In essence they are selling free stuff.

Packaging is worth something.  It's why folks pay for Linux distributions.

I'm very happy that Insightful packages up free stuff.

I wish they'd do it better and add incredible amounts of value to
S-PLUS, so that it could truly compete with R.

There are some really nice facilities that R doesn't have, and
probably will never have. But it's still missing key features.

And I'm happy that they are finally listening to comments (I can't
claim to be the only one that suggested this over 7 years ago -- I
also can't claim to be a user (again), until my new job; but it's nice
that it took them less than a decade).


best,
-tony

[EMAIL PROTECTED]
Muttenz, Switzerland.
Commit early, commit often, and commit in a repository from which we can easily
roll back your mistakes (AJR, 4Jan05).

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] [S] S-PLUS 8 beta program [repost]

2006-04-01 Thread A.J. Rossini
On 3/31/06, A.J. Rossini [EMAIL PROTECTED] wrote:


 There are some really nice facilities that R doesn't have, and
 probably will never have. But it's still missing key features.

It's late, and somewhere in my 18+ hour day today, I spent a miserable
evening in Heathrow.  Yech.  I meant:

There are some really nice facilities in S-PLUS that R doesn't have,
and probably never will have.  But S-PLUS is still missing key
features.

best,
-tony

[EMAIL PROTECTED]
Muttenz, Switzerland.
Commit early, commit often, and commit in a repository from which we can easily
roll back your mistakes (AJR, 4Jan05).

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Disable LazyLoading mechanism completely

2006-04-01 Thread Paul Roebuck
On Sat, 1 Apr 2006, Prof Brian Ripley wrote:

 On Fri, 31 Mar 2006, Dirk Eddelbuettel wrote:

  On 31 March 2006 at 16:15, Paul Roebuck wrote:
  | Is there a global option somewhere that can completely
  | disable the LazyLoad option? I want all my packages
  | in source format for searching purposes, not crippled by
  | the conversion to database format.
 
  AFAICT setting LazyLoad=No in DESCRIPTION might do the trick.
 
  For reasons I fail to understand I have to set it to that value for Debian
  builds of the Rcmdr package, and the resulting package does indeed have R
  code as plain sources. I'd agree with you that being able to read, and hence
  search, the code is a good thing but I also trust the R masters on the
  advantages of lazy loading.  There may be an option to disable it globally 
  ...

 Quite a large number of packages (including almost all using S4 methods)
 *do not work* without lazyloading or saved images.  So this option would
 just break other people's work.

Can you elaborate a bit on this statement? S4 predates
LazyLoad and used to work just fine before that mechanism
was introduced. Indeed, several of my own packages have S4
methods and work with neither LazyLoading nor saved images.
Should that be read as meaning that package authors have since made
one-way changes to support LazyLoad? Is there some way to tell
whether a package can do without it?

--
SIGSIG -- signature too long (core dumped)

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] reference paper about SOM

2006-04-01 Thread Colm Connolly
One of the best sources of info on the SOM is Kohonen's book. Try to
get it if you can.

@Book{kohonen95:_self,
  author =   {Teuvo Kohonen},
  title ={Self-organizing maps},
  publisher ={Springer},
  year = 1995,
  address =  {Berlin}
}

On 01/04/06, Linda Lei [EMAIL PROTECTED] wrote:
 Hi All,

 I'm looking for some reference paper about SOM (self organizing map)
 algorithm. I tried the paper which is mentioned in
 the help page of function som (package:som):
 http://www.cis.hut.fi/research/papers/som_tr96.ps.Z

 But I can't open it for some reason. Could you please help me with it?

 Thanks a lot!

 [[alternative HTML version deleted]]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Disable LazyLoading mechanism completely

2006-04-01 Thread Duncan Murdoch
On 3/31/2006 6:03 PM, Paul Roebuck wrote:
 On Fri, 31 Mar 2006, Duncan Murdoch wrote:
 
 On 3/31/2006 5:15 PM, Paul Roebuck wrote:
 Is there a global option somewhere that can completely
 disable the LazyLoad option? I want all my packages
 in source format for searching purposes, not crippled by
 the conversion to database format.
 Why not just keep the source?  Maybe I misunderstood where
 you're thinking of searching, but I think even without
 LazyLoad, you'll lose comments in the functions.
 
 My thought is something along the lines of
 
 options(lazy.load = FALSE)
 
 which would override the setting in the DESCRIPTION file
 as though every package had set it as:
 
 LazyLoad: no
 
 I want the ability to patch/search the code by default.
 LazyLoad removes this capability. I know there will be a
 minor speed hit and accept that in exchange for the above
 capability.

Lazy loading just replaces each object with a promise to load it, right? 
  So if you need to do a search/patch, couldn't you force those promises 
during the search?
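
As a rough sketch of that idea (using the 'boot' package only as an example of
a lazy-loaded package), forcing the promises and dumping the function sources
gives you something you can grep or patch:

  env  <- asNamespace("boot")
  objs <- mget(ls(env), envir = env)        # forces the lazy-load promises
  funs <- objs[sapply(objs, is.function)]
  writeLines(unlist(lapply(funs, deparse)), "boot-sources.R")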

Duncan Murdoch

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Problem with help() respectively ?

2006-04-01 Thread Uwe Ligges
Werner Wernersen wrote:

 Hi,
 
  I have a problem with the R help system. I have
 been searching through the help archives but I can't
 find anything about it and I don't know how to specify
 my search better.
 
 When I type for instance ?hist, just the next
 command line prompt shows but nothing actually
 happens. There is no help window coming up.
 
 I already tried reinstalling R (2.2.1, windows XP) but
 that didn't change anything. I am pretty sure the help
 has been working before on this machine.
 
 Has anybody an idea how to recover the help system?



Which help format do you use?

What about
   help(help, chmhelp=FALSE, htmlhelp=FALSE, offline=FALSE)
Does this work for you?

Uwe Ligges



 Thanks for your help,
   Werner
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] rowVars

2006-04-01 Thread Uwe Ligges
r user wrote:

 I am using the R 2.2.1 in a Windows XP environment.
 
 I have a dataframe with 12 columns and 1,000 rows.
 (Some of the rows have 1 or fewer values.)

fewer than 1? So you mean 0?
Well, the commonly used estimator for the variance is defined for n >= 2 
elements only!


 I am trying to use rowVars to calculate the variance
 of each row.

I don't know a function rowVars in R-2.2.1 - and cannot find it ...

 I am getting the following message:
 “Error in na.remove.default(x) : length of 'dimnames'
 [1] not equal to array extent”

Hard to tell without knowing what you actually did.

Uwe Ligges


 
 Is there a good work-around?
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] (no answer)

2006-04-01 Thread Frank E Harrell Jr
I have never taken a statistics class nor read a statistics text, but
I am in dire need of help with a trivial data analysis problem for
which I need to write a report in two hours.  I have spent 10,000
hours of study in my field of expertise (high frequency noise-making
plant biology) but I've always thought that statistics is something
that can be mastered on short notice.

Briefly, I have an experiment in which a response variable is
repeatedly measured at 1-day intervals, except that after a plant
becomes sick, it is measured every three days.  We forgot to randomize
on one of the important variables (soil pH) and we forgot to measure
the soil pH.  Plants that begin to respond to treatment are harvested
and eaten (deep fried if they don't look so good), but we want to make
an inference about long-term responses.  In addition, we forgot to
measure the response on some of the days before the plant was
terminated.  Some baseline variables were not measured for some
plants, when some of the other variables looked OK.  The response
variable is only known to exceed a certain value in some cases, and in
others is only known to be less than a certain value.  The response
variable also has a great number of ties at zero, and has extreme high
outliers.  The variability of responses seems to depend on whether
there were missing variables for the plant.  And halfway through the
experiment we changed instrumentation and personnel.  All of these
problems seem trivial when compared to what I have to deal with every
day in measuring plant sounds, so I hope that someone can help me as
soon as possible.  I would appreciate receiving a few paragraphs of
description of the analysis that I can include in my report, and I
would like to receive R code to analyze the data no matter which
variables I collect.  I do value your time, so you will get my
everlasting thanks.


Note that I will be out of the office from 1:15pm to 1:25pm today.
This information should be valuable to many.

I. Ben Fuld
Technical University of Plant Kinetics
Slapout, Alabama

LEGAL NOTICE\ Unless expressly stated otherwise, this messag...{{dropped}}

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] (no answer)

2006-04-01 Thread Ted Harding
On 01-Apr-06 Frank E Harrell Jr wrote:
 I have never taken a statistics class nor read a statistics text, but
 I am in dire need of help with a trivial data analysis problem for
 which I need to write a report in two hours.  I have spent 10,000
 hours of study in my field of expertise (high frequency noise-making
 plant biology) but I've always thought that statistics is something
 that can be mastered on short notice.

Dear Ibn Fuld (I apologise for rewriting your name correctly,
and I do appreciate the problems of people who do not natively
speak English, but I thought that doind so would be useful for
the members of the list who have the coverse problem),

You have an evidently complex problem there, but apparently a
very short time in which to solve it. Happily, I see a very
simple solution.

Just talk to your plants, ask them how they are, record their
acoustic responses, and use your existing expertise to analyse
and interpret the latter. I judge that you need to learn nothing
new to do this, your institution should possess the required
technology, and I suspect that little if any new R code would be
required.

Good luck,
ZB [EMAIL PROTECTED]


 Briefly, I have an experiment in which a response variable is
 repeatedly measured at 1-day intervals, except that after a plant
 becomes sick, it is measured every three days.  We forgot to randomize
 on one of the important variables (soil pH) and we forgot to measure
 the soil pH.  Plants that begin to respond to treatment are harvested
 and eaten (deep fried if they don't look so good), but we want to make
 an inference about long-term responses.  In addition, we forgot to
 measure the response on some of the days before the plant was
 terminated.  Some baseline variables were not measured for some
 plants, when some of the other variables looked OK.  The response
 variable is only known to exceed a certain value in some cases, and in
 others is only known to be less than a certain value.  The response
 variable also has a great number of ties at zero, and has extreme high
 outliers.  The variability of responses seems to depend on whether
 there was missing variables for the plant.  And halfway through the
 experiment we changed instrumentation and personnel.  All of these
 problems seem trivial when compared to what I have to deal with every
 day in measuring plant sounds, so I hope that someone can help me as
 soon as possible.  I would appreciate receiving a few paragraphs of
 description of the analysis that I can include in my report, and I
 would like to receive R code to analyze the data no matter which
 variables I collect.  I do value your time, so you will get my
 everlasting thanks.
 
 
 Note that I will be out of the office from 1:15pm to 1:25pm today.
 This information should be valuable to many.
 
 I. Ben Fuld
 Technical University of Plant Kinetics
 Slapout, Alabama
 
 LEGAL NOTICE\ Unless expressly stated otherwise, this
 messag...{{dropped}}
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide!
 http://www.R-project.org/posting-guide.html


E-Mail: (Ted Harding) [EMAIL PROTECTED]
Fax-to-email: +44 (0)870 094 0861
Date: 01-Apr-06   Time: 15:44:50
-- XFMail --

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] (no answer)

2006-04-01 Thread Gabor Grothendieck
Oh, and I forgot to add.  Please generate some test data for me since
I can't possibly take time out to provide such in order to clarify the
question.  By the way, I did try out R a bit but it did not work and it's
too much effort to provide the R code I have or to reduce it to a small
self contained reproducible example to illustrate the salient point.

I am thanking you in advance since I will probably be too busy
to acknowledge the help or to summarize the answers for the
benefit of others and the list archives.  It's not that I don't want to,
but I probably won't follow up on your answers anyway if they involve reading
and thinking about help pages, the manual, the FAQ, the posting guide,
statistics, mathematics, programming or other material.

By the way, please email me directly since I normally don't read
the list.

:)

On 4/1/06, Frank E Harrell Jr [EMAIL PROTECTED] wrote:
 I have never taken a statistics class nor read a statistics text, but
 I am in dire need of help with a trivial data analysis problem for
 which I need to write a report in two hours.  I have spent 10,000
 hours of study in my field of expertise (high frequency noise-making
 plant biology) but I've always thought that statistics is something
 that can be mastered on short notice.

 Briefly, I have an experiment in which a response variable is
 repeatedly measured at 1-day intervals, except that after a plant
 becomes sick, it is measured every three days.  We forgot to randomize
 on one of the important variables (soil pH) and we forgot to measure
 the soil pH.  Plants that begin to respond to treatment are harvested
 and eaten (deep fried if they don't look so good), but we want to make
 an inference about long-term responses.  In addition, we forgot to
 measure the response on some of the days before the plant was
 terminated.  Some baseline variables were not measured for some
 plants, when some of the other variables looked OK.  The response
 variable is only known to exceed a certain value in some cases, and in
 others is only known to be less than a certain value.  The response
 variable also has a great number of ties at zero, and has extreme high
 outliers.  The variability of responses seems to depend on whether
 there was missing variables for the plant.  And halfway through the
 experiment we changed instrumentation and personnel.  All of these
 problems seem trivial when compared to what I have to deal with every
 day in measuring plant sounds, so I hope that someone can help me as
 soon as possible.  I would appreciate receiving a few paragraphs of
 description of the analysis that I can include in my report, and I
 would like to receive R code to analyze the data no matter which
 variables I collect.  I do value your time, so you will get my
 everlasting thanks.


 Note that I will be out of the office from 1:15pm to 1:25pm today.
 This information should be valuable to many.

 I. Ben Fuld
 Technical University of Plant Kinetics
 Slapout, Alabama

 LEGAL NOTICE\ Unless expressly stated otherwise, this messag...{{dropped}}

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] -newbie | RODBC import query

2006-04-01 Thread Evan Cooch
Greetings -

After 20+ years of using SAS, for a variety of reasons, I'm using [R] 
for a bunch of things - while I'm getting a pretty good handle on 
[R] for script programming and statistical analysis, I'm struggling 
with 'pulling data into [R]'. For reasons beyond my control, a number 
of the files I get sent to 'work with' are in Dbase format (*.dbf). 
For another host of reasons, I need to be able to read directly into 
[R] from these files (not using intermediate .CSV or delimited ASCII files).

OK, so after a bit of reading, seems I need to use RODBC (I'm using 
[R] 2.2.1 for Windows, at the moment). But, I can't seem to figure 
out the basics. Suppose the file I need to 'work with' is 
test.dbf  So, I try the following:

  library(RODBC);
  import_dat <- odbcConnectDbase("c:\documents and settings\egc\desktop\test.dbf")

OK, so far so good - well, at least no outright errors get chunked 
out to the console. Now what? Here's where I get stuck. There is a 
table in the test.dbf file called TEST. But, the following

tester <- sqlFetch(import_dat, "TEST")

blows up - I get the following error message in the console:

Error in odbcTableExists(import_dat, sqtable) :
 'TEST': table not found on channel

OK - so it doesn't seem to find the table TEST in test.dbf. I tried 
lower-case for TEST (i.e., test), but that doesn't seem to solve the 
problem. OK, so let's pretend I don't know what the table in test.dbf 
is called, and use sqlTables instead:

table_list <- sqlTables(import_dat)

When I then enter table_list in the console, I get

[1] TABLE_CAT   TABLE_SCHEM TABLE_NAME  TABLE_TYPE  REMARKS
<0 rows> (or 0-length row.names)

Meaning, what? It almost seems that it's telling me there is nothing 
in test.dbf. Well, there definitely is (I can open it up in Excel - 
shudder), but perhaps it is unable to recognize what's there.


Suggestions? Apologies if this is easy, or (worse) an FAQ.

Thanks!

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] -newbie | RODBC import query

2006-04-01 Thread Gabor Grothendieck
On 4/1/06, Evan Cooch [EMAIL PROTECTED] wrote:
 Greetings -

 After 20+ years of using SAS, for a variety of reasons, I'm using [R]
 for a bunch of things - while I'm getting a pretty good a handling
 [R] for script programming, and statistical analysis, I'm struggling
 with 'pulling data into [R]'. For reasons beyond my control, a number
 of the files I get sent to 'work with' are in Dbase format (*.dbf).
 For another host of reasons, I need to be able to read directly into
 [R] from these files (no using intermediate .CVS or delimited ASCII files).

 OK, so after a bit of reading, seems I need to use RODBC (I'm using
 [R] 2.2.1 for Windows, at the moment). But, I can't seem to figure
 out the basics. Suppose the file I need to 'work with' is
 test.dbf  So, I try the following:

  library(RODBC);
  import_dat <- odbcConnectDbase("c:\documents and
  settings\egc\desktop\test.dbf")

\ is the escape character, so if you want a \ you must use \\ like this:

odbcConnectDbase("c:\\documents and settings\\egc\\desktop\\test.dbf")

or do this:

odbcConnectDbase(file.choose())
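
A third spelling that is often convenient (R accepts forward slashes in Windows
paths, and they usually pass through to the driver unchanged, though I have not
tested that with this particular driver):

odbcConnectDbase("c:/documents and settings/egc/desktop/test.dbf")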





 OK, so far so good - well, at least no outright errors gets chunked
 out to the console. Now what? Here's where I get stuck. There is a
 table in the test.dbf file called TEST. But, the following

 tester - sqlFetch(import_dat,TEST)

 blows up - I get the following error message in the console:

 Error in odbcTableExists(import_dat, sqtable) :
 'TEST': table not found on channel

 OK - so it doesn't seem to find the table TEST in test.dbf. I tried
 lower-case for TEST (i.e., test), but that doesn't seem to solve the
 problem. OK, so lets pretend I don't know what the table in test.dbf
 is called, and use sqlTables instead:

 table_list - sqlTables(import_dat)

 When I then enter table_list in the console, I get

 [1] TABLE_CAT   TABLE_SCHEM TABLE_NAME  TABLE_TYPE  REMARKS
 0 rows (or 0-length row.names)

 Meaning, what? It almost seems that its telling me there is nothing
 in test.dbf. Well, there definitely is (I can open it up in Excel -
 shudder), but, perhaps it is unable to recognize whats there.


 Suggestions? Apologies if this is easy, or (worse) and FAQ.

 Thanks!

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] barchart in black white with 10 categories?

2006-04-01 Thread Fredrik Karlsson
Dear list,

I am trying to plot a barchart (from lattice) in B & W, with 10
categories per bar.
It seems that the colours are recycled after reaching category number
7, which creates a problem interpreting the chart. I therefore have
two questions:

1) How do I get more shades?

2) Does anyone have a theme to share with me with distinctive shades
in Black & White?


Thankful for all the help I can get.

/Fredrik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] -newbie | RODBC import query

2006-04-01 Thread Evan Cooch
At 12:14 PM 4/1/2006, Gabor Grothendieck wrote:

  OK, so after a bit of reading, seems I need to use RODBC (I'm using
  [R] 2.2.1 for Windows, at the moment). But, I can't seem to figure
  out the basics. Suppose the file I need to 'work with' is
  test.dbf  So, I try the following:
 
   library(RODBC);
   import_dat - odbcConnectDbase(c:\documents and
  settings\egc\desktop\test.dbf)

\ is the escape character so if you want a \ you must use \\ like this:

odbcConnectDbase("c:\\documents and settings\\egc\\desktop\\test.dbf")

or do this:

odbcConnectDbase(file.choose())

Well, OK, but that doesn't make any apparent difference in terms 
of solving the problem I'm having. The odbcConnectDbase command is 
working fine (i.e., it's finding the test.dbf file), but after that, nada.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] S-PLUS 8 beta program [repost]

2006-04-01 Thread Spencer Graves
  This thread may already contain too many contributions, but I feel 
compelled to add something:

STEALING?

  Insightful was accused of stealing GNU software.  On that issue, I 
will ask the accuser(s) and others to please read the GNU license (e.g., 
at http://www.gnu.org/copyleft/gpl.html), making special note of the 
following portion of paragraph 2 of TERMS AND CONDITIONS FOR COPYING, 
DISTRIBUTION AND MODIFICATION:

  If identifiable sections of that work [software that uses GNU code] 
are not derived from the Program [GNU code or derivative], and can be 
reasonably considered independent and separate works in themselves, then 
this License, and its terms, do not apply to those sections when you 
distribute them as separate works.

  I work for a company that develops and licenses specialized software 
for a narrow market niche.  We do not currently connect our software to 
R, but we plan to do so in the future, using a completely separate 
add-on module that we can distribute free with the source code, 
consistent with the GNU license.

  If Insightful is guilty of stealing when making GNU code available to 
their customers, then I'm guilty of stealing each time I apply something 
I've learned from a published research report or book.

  I have not yet contributed any packages to CRAN, though I have 
contributed suggestions that have improved some of R's capabilities. 
Moreover, I expect to contribute packages in the future.  I would be 
happy to have Insightful make my code more widely available (especially 
if I don't have to do any more work to make it available to the wider 
audience).


FOR PROFIT vs. NOT FOR PROFIT

  Some years ago, I worked for a not-for-profit research firm.  I was 
paid to work there, and my boss carefully explained to me, "We're not 
for profit, but we're not for loss, either."

  I now work for a company that develops and distributes data analysis 
software for a narrow, specialized market.  Some of our former 
competitors went out of business, because they did not charge as much as 
we do and could not afford to develop and maintain their code enough to 
compete with us.  Our customers pay our license fees, because they 
believe the use of our software (1) saves them substantially more money 
than we charge and (2) allows them to provide their customers with 
better products at lower cost.

  I also like the R model, with (a) many of the R core team being 
university professors who support R as part of the research obligations 
of their job and (b) substantive contributions by many others around the 
world who contribute small portions of their spare time to support this 
r-help listserve and contribute code to CRAN.

  I don't agree with the ideologues of either the Right or the Left: 
For profit, not for profit and volunteer efforts all make positive 
contributions to building a better world.

  Best Wishes,
  spencer graves

eugene dalt wrote:

 Point well taken. You should, however, expect R users
  to bring up concerns. This isn't a win-win situation
  as you make it sound... and you want to keep s-news in the dark
  too.
 
  Frankly, you didn't address the real issue. How would
  Insightful react, for example, if they found R users
  repackaging your products named Infact or
  Insightful Miner? Or, as you said, R users (instead of
  Splus users) need access to the cutting-edge, quality
  statistical software available in Splus (instead of
  the other way round).
 
 Insightful wants to take any cutting-edge, quality
 statistical software from R, but they patent any
 cutting-edge, quality statistical software they
 create. Hence my call to people to use any loopholes
 to make these available or rewrite them. That would be
 win-win.
 
 ps. This is my last email on this issue.
 
 
   
 
 
 --- David Smith [EMAIL PROTECTED] wrote:
 
 
eugene dalt [EMAIL PROTECTED] writes:

It seems to me that Insightful is very good at
protecting whatever they create and the same

time

feels very comfortable taking R stuff to keep they
clients happy. In essence they are selling free

stuff.

I must defend Insightful on this point. As an
employee of Insightful, that's
to be expected, but I've also been an author of free
software since the
earliest days when the term free software was in
use ... but more on that,
later.

In no sense are we taking R stuff with these
improvements in S-PLUS 8.
Packages written for R will remain free, as they
must.  This isn't just
because the license says so, but because we believe
that an open-source
community around packages that will run with both
S-PLUS and R is a good
thing.  It's good for S-PLUS users, certainly: we've
heard loud and clear
from our users that they need a better way to
package user-contributed
libraries for S-PLUS, and they need access to the
cutting-edge, quality
statistical software available in the open-source
world.  But it's also good
for the statistical 

Re: [R] Data assimilation / inverse modeling in R

2006-04-01 Thread Spencer Graves
  1.  The Details for ?help.search indicate that it searches ONLY 
packages currently installed locally.

  2.  See Also for ?help.search includes 'RSiteSearch' to access 
an on-line search of R resources.

   3.  'help.search("Kalman filter")' for me just now identified only 
KalmanLike(stats):  Kalman Filtering.  However, 'RSiteSearch("Kalman 
filter")' produced 80 documents, and 'RSiteSearch("Kalman filter", 
"functions")' produced 18.

   4.  RSiteSearch("assimilation") produced only 3 hits, being your post 
and two others connected to it apparently because two other posters 
just replied to your post, changing subject and content, not realizing 
that the RSiteSearch software somehow keeps those connections.

   5.  RSiteSearch("inverse model") produced 723 hits, so that's not much 
help either.

   6.  Have you read Venables and Ripley (2002) Modern Applied Statistics 
with S (Springer), especially the chapter on time series?  I highly 
recommend this book and this chapter in particular.  Also, have you 
worked through the vignettes associated with the zoo package? If not, 
you might find that quite useful. [Are you aware that 
edit(vignette(...)) will provide a script file with the R code discussed 
in the vignette, so the vignette itself can be viewed in Adobe Acrobat while 
you are working through the examples line by line, modifying them, etc.? I've 
found this to be very useful. If you use XEmacs, edit(vignette(...)) 
may not work. Instead, try Stangle(vignette(...)$file). This will copy 
the R code to a file in the working directory, which you can then open. 
A short sketch of this vignette workflow appears below, after point 7.] 
Also, the dse bundle does some Kalman filtering and has vignettes.

  7.  I think I have some familiarity with a fairly broad range of 
statistical methods, but I can't guess what you might mean by data 
assimilation / inverse modeling, apart from your use of the term 
Kalman filter.  PLEASE do read the posting guide! 
www.R-project.org/posting-guide.html, then prepare another post 
preferably including a small, self contained example to illustrate 
something you tried, explaining why that did not quite do what you want.
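
	  As promised under point 6, here is a small sketch of the vignette
workflow (the 'zoo' package and its 'zoo' vignette are only illustrative
choices):

  library(zoo)
  v <- vignette("zoo", package = "zoo")
  edit(v)           # opens the vignette's R code in an editor
  Stangle(v$file)   # writes the R code to a .R file in the working directory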

  hope this helps,
  spencer graves

Lapointe, Pierre wrote:

 Hello, 
 
 I'm trying to find out if something has been written in R regarding data
 assimilation and inverse modeling.
 
  These searches do not return anything that looks like Kalman filter
  variations (EK, SEEK, ROEK, etc.)

  help.search("assimilation")
  help.search("inverse model")
 
 Regards,
 
 
 
 **
 AVIS DE NON-RESPONSABILITE: Ce document transmis par courrie...{{dropped}}
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Random Coefficients using coxme

2006-04-01 Thread Spencer Graves
  Have you tried contacting the authors and / or maintainer for the 
'kinship' package?  Name and email address should appear with 
help(package=kinship);  this does NOT work for me, because I don't 
have kinship installed.  From www.r-project.org - CRAN - (select a 
mirror) - packages, I learned that kinship was developed by Beth 
Atkinson for pedigree functions [and] Terry Therneau for all other 
functions and maintained by Jing hua Zhao.  This provided email 
addresses for Atkinson and Therneau but hot for Zhao.

  If you need more help on this, have you considered 
www.bioconductor.org?  This looks like it might be closer to their 
interests.

  hope this helps.
  spencer graves

Powers, Mark wrote:

 Hello, I was hoping someone could answer a question for me that may
 either be statistical or script related.  I don't come from a statistics
 background, so I am not positive if I am using the correct nomenclature
 or even the correct procedure.  Is it possible to model random
 coefficients in a mixed effects cox-regression using coxme from the
 Kinship package?  For example, using lmer from the lme4 package, I can
  model V1 and V2 as a fixed & random coefficient:
 
 
 Mod1=lmer(y ~ V1 + V2 + V3 + V4 + (1 + V1 + V2|GROUP))
 
 
 Can I do something like this, though correctly without getting the error
 (Error in max(kindex) : object kindex not found)?
 
 Mod2=coxme(Surv(YDAYS, Y) ~ V1 + V2 + V3 + V4, data=ds1, random=~1 + V1
  + V2|GROUP)
 
  The GROUP variable is a census tract, while V1 & V2 are individual-level
 characteristics of people. Thanks in advance.
 
 
 Mark Powers
 [EMAIL PROTECTED]
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Problem with help() respectively ?

2006-04-01 Thread Werner Wernersen
  Hi,
  
   I have a problem with the R help system. I have
  been searching through the help archives but I
 can't
  find anything about it and I don't know how to
 specify
  my search better.
  
  When I type for instance ?hist, just the next
  command line prompt shows but nothing actually
  happens. There is no help window coming up.
  
  I already tried reinstalling R (2.2.1, windows XP)
 but
  that didn't change anything. I am pretty sure the
 help
  has been working before on this machine.
  
  Has anybody an idea how to recover the help
 system?
 
 
 
 Which help format do you use?
 
 What about
help(help, chmhelp=FALSE, htmlhelp=FALSE,
 offline=FALSE)
 Does this work for you?
 
 Uwe Ligges
 
Yes! Thanks, that works. And now it works just fine
again... I have no idea what changed...

Thanks for your help,
  Werner

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] small sample size confidence interval by bootstrap

2006-04-01 Thread J Dougherty
On Friday 31 March 2006 18:21, Urania Sun wrote:
 Hi, All:

 I only have 4 samples. I wish to get a confidence interval around the mean.
 Is it reasonable? If not, is there a way to compute a confidence interval
 for such small sample size's mean?

 Many thanks,

With a sample that small, it is far safer to simply consider them as four 
examples and leave it at that.  In a population where there is little 
variation (say an archaeological projectile point type with a neck width that 
varies between 3 and 5 mm), the examples are likely to be close to typical, 
and the difference isn't really likely to be important anyway.  However, in 
a population with considerable variation (for example height in humans) you 
can see that trying to make any generalizations from 4 examples is going to 
be more likely to be misleading than anything else.  

If your sample of four is your entire population, you have all the information 
possible through simple measurements.  But, if the population were 100, the 
number of possible samples of size 4 is 3,921,225, which, to put it 
scientifically, is a whole bunch.  It is better not to generalize from small 
samples.
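
For the record, R will count them directly:

  choose(100, 4)   # 3921225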

JD

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Sourcing data files into different array fields

2006-04-01 Thread Werner Wernersen
Hi,

I make a number of experiments with a piece of software which
spits out one .R file of data objects to be sourced
for each experiment. Now, the files repeat the data
object names but ultimately I also want to perform
comparison statistics between several objects.

Now my question is, is there a slick way to wrap
those data objects while sourcing each experiment file
so that for instance I get an array m[] for each
experiment and can access the objects by using
m[1].object1 and so on?

I am thankful for any suggestion!
  Werner

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] loglikelihood and lmer

2006-04-01 Thread Douglas Bates
On 3/31/06, Marco Geraci [EMAIL PROTECTED] wrote:
 Dear R users,
 I am estimating Poisson mixed models using glmmPQL
  (MASS) and lmer (lme4). We know that glmmPQL does not
 provide the correct loglikelihood for such models (it
 gives the loglike of a 'pseudo' or working linear
 mixed model). I would like to know how the loglike is
 calculated by lmer.

Good point.  The person who wrote the documentation (i.e. I) should
have mentioned that.

With lmer, one can fit a generalized linear mixed model using PQL or
by optimizing the Laplacian approximation to the deviance directly. 
Even when you use PQL, which is the default method, the Laplace
approximation is evaluated at the converged parameter estimates.  This
is the value of the loglikelihood that is reported.

I am reconsidering having PQL as the default method for fitting
generalized linear mixed models with lmer. I would appreciate it if
you and others who do fit such models could tell me if it is
noticeably slower or less reliable to do direct optimization of the
Laplace approximation.  That is, if you use the combination of
optional arguments method = "Laplace" and control = list(usePQL =
FALSE), does the fit take substantially longer?

On your example I get

> system.time(fit1.lmer <- lmer(z ~ X-1 + (1|id), family=poisson))
[1] 0.48 0.01 0.54 0.00 0.00
> system.time(fit2.lmer <- lmer(z ~ X-1 + (1|id), family=poisson, method =
+   "Laplace", control = list(usePQL = FALSE)))
[1] 0.61 0.00 0.62 0.00 0.00
> fit1.lmer
Generalized linear mixed model fit using PQL
Formula: z ~ X - 1 + (1 | id)
 Family: poisson(log link)
      AIC      BIC    logLik deviance
 922.2406 934.8844 -458.1203 916.2406
Random effects:
 Groups Name        Variance Std.Dev.
 id (Intercept) 0.82446  0.908
number of obs: 500, groups: id, 100

Estimated scale (compare to 1)  1.021129

Fixed effects:
   Estimate Std. Error z value  Pr(>|z|)
X1 1.003639   0.098373  10.202  < 2.2e-16 ***
X2 2.075037   0.052099  39.829  < 2.2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
   X1
X2 -0.337
> fit2.lmer
Generalized linear mixed model fit using Laplace
Formula: z ~ X - 1 + (1 | id)
 Family: poisson(log link)
 AIC  BIC   logLik deviance
 921.958 934.6019 -457.979  915.958
Random effects:
 Groups Name        Variance Std.Dev.
 id (Intercept) 0.8895   0.94313
number of obs: 500, groups: id, 100

Estimated scale (compare to 1)  0.04472136

Fixed effects:
   Estimate Std. Error z value  Pr(>|z|)
X1 0.985721   0.101645   9.698  < 2.2e-16 ***
X2 2.075060   0.052114  39.818  < 2.2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
   X1
X2 -0.326

The only unexpected part of that output is the estimated scale which
is wrong (well, it is never calculated in this case and consequently
should not be displayed).

 A minor question is: why do glmmPQL and lmer give
 different degrees-of-freedom for the same estimated
 model? Does glmmPQL consider the scale parameter 'phi'
 as a degree of freedom?

I believe that is the reason.  The class of an object fit by glmmPQL is
> class(fm1)
[1] "glmmPQL" "lme"

and the only method I could find for the glmmPQL class is
> methods(class = "glmmPQL")
[1] predict.glmmPQL*

   Non-visible functions are asterisked

Generic functions other than predict will choose the method for the
lme class of linear mixed effects models (or, if there isn't an lme
method, the default method).  The lme methods defined in the nlme
package are appropriate for linear mixed effects models (which is what
Jose and I wrote them for) and typically are not appropriate for a
generalized linear mixed model.

 A toy example

 set.seed(100)
 m <- 5
 n <- 100
 N <- n*m
 X <- cbind(1,runif(N))
 Z <- kronecker(diag(n),rep(1,m))
 z <- rpois(N, exp(X%*%matrix(c(1,2)) +
 Z%*%matrix(rnorm(n))))
 id <- rep(1:n,each=m)
 fit.glmm <- glmmPQL(z ~ X-1, random = ~1|id,
 family=poisson,verbose=F)
 fit.lmer <- lmer(z ~ X-1 + (1|id),
 family=poisson,verbose=F)

 > logLik(fit.glmm)
 'log Lik.' -386.4373 (df=4)
 > logLik(fit.lmer)
 'log Lik.' -458.1203 (df=3)

 Thanks,
 Marco



 > sessionInfo()
 R version 2.2.1, 2005-12-20, i386-pc-mingw32

 attached base packages:
 [1] "methods"   "stats"     "graphics"  "grDevices" "utils"
 [6] "datasets"  "base"

 other attached packages:
  mvtnorm   SemiPar   cluster      lme4   lattice    Matrix      nlme      MASS
    0.7-2     1.0-1    1.10.4   0.995-2   0.12-11   0.995-5  3.1-68.1    7.2-24
 

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Using vectorization instead of for loop for performing a calculation efficiently

2006-04-01 Thread Peter Wilkinson
 I am trying to write an efficient function that will do the following:

Given an nxm matrix, 10 rows (observations) by 10 columns (samples) 
for each row, test if all values in the row are greater than a value k
If all values are greater than k, then set all values to NA (or something),
Return an nxm matrix with the modified rows.

If I do this with a matrix of 20,000 rows, I will be waiting until Christmas
for it to finish:

For rows in Matrix:
if rows > filter
set all elements in rows to NA   (or something)
else
do nothing
Return the matrix with the modified rows


I don’t know how to code this properly. The following:

If (sum(ifelse(nvector > filter,1,0) == 0 ) )

Tells me if any row has at least 1 value above the filter. How do I get rid
of the 'outer' loop?

Peter

-

Peter Wilkinson
Senior Bioinformatician / Programmer-Analyst
National Immune Monitoring Laboratory
tel: (514)-343-7876

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Using vectorization instead of for loop for performing a calculation efficiently

2006-04-01 Thread Gabor Grothendieck
Try this:

> set.seed(1)
> mat <- matrix(rnorm(2 * 10), 2)
> system.time(mat2 <- replace(mat, rowSums(mat > 0) == 10, NA))
[1] 0.04 0.01 0.05   NA   NA
> R.version.string # Windows XP
[1] "R version 2.2.1, 2005-12-20"
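
For a general threshold k (k <- 5 below is just a placeholder value), the same
idea works without hard-coding the number of columns:

 k <- 5
 mat2 <- replace(mat, rowSums(mat > k) == ncol(mat), NA)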


On 4/1/06, Peter Wilkinson [EMAIL PROTECTED] wrote:
  I am trying to write an efficient function that will do the following:

 Given an nxm matrix, 10 rows (observations) by 10 columns (samples)
 for each row, test of all values in the row are greater than a value k
 If all values are greater than k, then set all values to NA (or something),
 Return an nxm matrix with the modified rows.

 If I do this with a matrix of 20,000 rows, I will be waiting until Christmas
 for it to finish:

 For rows in Matrix:
if rows  filter
set all elements in rows to NA   (or something)
else
do nothing
 Return the matrix with the modified rows


 I don't know how to code this properly. The following:

 If (sum(ifelse(nvectorfilter,1,0) == 0 ) )

 Tells me if any row has at least 1 value above the filter. How do I get rid
 of the 'outer' loop?

 Peter

 -

 Peter Wilkinson
 Senior Bioinformatician / Programmer-Analyst
 National Immune Monitoring Laboratory
 tel: (514)-343-7876

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Sourcing data files into different array fields

2006-04-01 Thread Gabor Grothendieck
Use the local= argument of source.

Here we create a list, L, whose ith component is an environment,
viz. the environment of the instance of f that did the sourcing,
containing the sourced objects.  First we create some test
files, /a1 and /a2, and then we create L.  The components
of L can be referred to as L[[1]], L[[2]] or using filenames
L[["/a1"]], L[["/a2"]], and the components can be extracted
via L[["/a1"]]$a, etc.

# create test files
files <- paste("/a", 1:2, sep = "")
for(i in seq(files)) cat("a <-", i, "\n", file = files[i])

# read them back
f <- function(.x) { source(.x, local = TRUE); environment() }
L <- sapply(files, f, simplify = FALSE)
ls(L[[1]]) # "a"
L[[1]]$a # 1
L[["/a2"]][["a"]] # 2
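
A small follow-up, since the goal was comparisons across experiments: the same
object can be pulled out of every environment in one go (here the object is
just 'a' from the test files above):

a.values <- sapply(L, function(e) e$a)   # one value per sourced file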

On 4/1/06, Werner Wernersen [EMAIL PROTECTED] wrote:
 Hi,

 I make a number of experiments with a software which
 spits out one .R file of data objects to be sourced
 for each experiment. Now, the files repeat the data
 object names but ultimately I also want to perform
 comparision statistics between several objects.

 Now my question is, is there a slick way to wrap
 those data objects while sourcing each experiment file
 so that for instance I get an array m[] for each
 experiment and can access the objects by using
 m[1].object1 and so on?

 I am thankful for any suggestion!
  Werner

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] VARIANCE across each ROW

2006-04-01 Thread mark salsburg
I have a very large matrix. I would like to display the variance across each
row.

In other words, I want to output a vector containing the values of variance
across row.

When I use the function var(), it seems to give me the variability of each
column.

Any ideas??

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Unbalanced Manova

2006-04-01 Thread Spencer Graves
  What have you tried?  I just did help.search("manova"), which led me 
to manova and summary.manova, which contained a balanced example, which 
I unbalanced as follows:

tear <- c(6.5, 6.2, 5.8, 6.5, 6.5, 6.9, 7.2, 6.9, 6.1, 6.3,
          6.7, 6.6, 7.2, 7.1, 6.8, 7.1, 7.0, 7.2, 7.5, 7.6)
gloss <- c(9.5, 9.9, 9.6, 9.6, 9.2, 9.1, 10.0, 9.9, 9.5, 9.4,
           9.1, 9.3, 8.3, 8.4, 8.5, 9.2, 8.8, 9.7, 10.1, 9.2)
opacity <- c(4.4, 6.4, 3.0, 4.1, 0.8, 5.7, 2.0, 3.9, 1.9, 5.7,
             2.8, 4.1, 3.8, 1.6, 3.4, 8.4, 5.2, 6.9, 2.7, 1.9)
rate <- factor(gl(2,10), labels=c("Low", "High"))
additive <- factor(gl(2, 5, len=20), labels=c("Low", "High"))
DF <- data.frame(tear, gloss, opacity, rate, additive)

fit <- manova(cbind(tear, gloss, opacity) ~ rate * additive, DF)
f18 <- manova(cbind(tear, gloss, opacity) ~ rate * additive, DF[1:18,])
summary(fit, test="Wilks") # ANOVA table of Wilks' lambda

> summary(fit, test="Wilks") # ANOVA table of Wilks' lambda
              Df  Wilks approx F num Df den Df   Pr(>F)
rate   1 0.3819   7.5543  3 14 0.003034 **
additive   1 0.5230   4.2556  3 14 0.024745 *
rate:additive  1 0.7771   1.3385  3 14 0.301782
Residuals 16
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> summary(f18, test="Wilks") # ANOVA table of Wilks' lambda
              Df  Wilks approx F num Df den Df   Pr(>F)
rate   1 0.3743   6.6880  3 12 0.006637 **
additive   1 0.5825   2.8671  3 12 0.080857 .
rate:additive  1 0.7012   1.7047  3 12 0.218966
Residuals 14
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
 
  If you would like further help from this listserve, PLEASE do read 
the posting guide! www.R-project.org/posting-guide.html and provide a 
self-contained example of what you tried, explaining why it does not 
meet your needs.

  hope this helps,
  spencer graves

Naiara S. Pinto wrote:

 Dear all,
 
 I need to do a Manova but I have an unbalanced design. I have
 morphological measurements similar to the iris dataset, but I don't have
 the same number of measurements for all species. Does anyone know a
 procedure to do Manova with this kind of input in R?
 
 Thank you very much,
 
 Naiara.
 
 
 Naiara S. Pinto
 Ecology, Evolution and Behavior
 1 University Station A6700
 Austin, TX, 78712
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] VARIANCE across each ROW

2006-04-01 Thread Kenneth Cabrera
?apply
apply(data,1,var)
On Sat, 01 Apr 2006 21:27:56 -0500, mark salsburg  
[EMAIL PROTECTED] wrote:

 I have a very large matrix. I would like to display the variance across  
 each
 row.

 In other words, I want to output a vector containing the values of  
 variance
 across row.

 When I use the function var(), it seems to give me the variability of  
 each
 column.

 Any ideas??

   [[alternative HTML version deleted]]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide!  
 http://www.R-project.org/posting-guide.html



-- 
Using Opera's revolutionary e-mail client: http://www.opera.com/mail/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Rattle: A simple R/Gnome GUI for Data Mining

2006-04-01 Thread Graham Williams
An oft heard issue with R is the learning curve. Yet R is a very
powerful language for data mining, if only one persists with it.

Attempting to provide a fast track for anyone to learning R through a
GUI, without limiting the user to just the GUI, I've pulled together
some basic R functionality for typical data mining into a GUI
written in R using RGtk2 (need the RGtk2 package from
http://www.ggobi.org/rgtk2/). This differs from other R GUIs in that
it is specifically for data mining type tasks and maps to how we
typically proceed through a project.

The GUI is simple and basic, covering the common tasks of loading a
CSV file, selecting Variables, sampling the data, summarising the
data, clustering, model building (decision tree, regression, random
forest, boosting, svm), and evaluation (confusion table, ROC, Risk
Chart). Everything is logged to a Log window and direct cut-n-paste
from there to the R Console should work. This provides tuition or a
reminder of how to do things.

I've not made it into a formal R package yet but as time permits will
do so. The package has been available and in use in a number of data
mining projects for a couple of months now, and is under continuing
evolution. It works well for what it was designed for (basically
binary classification) but should handle anything thrown at it (at
least gracefully informing you if it can not).

It is freely available (GPL) from 

   http://www.togaware.com/datamining/rattle.html

Comments, suggestions, bugs, and code are always welcome.

Regards,
Graham Williams

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] matching a given sd range

2006-04-01 Thread Ray Brownrigg
 Date: Fri, 31 Mar 2006 14:44:05 -0800 (PST)
 From: Fred J. [EMAIL PROTECTED]
  
  given a numeric array (a sequence of reals), I am interested in
  finding the subsets of sequences (each with start and end index) which match 
 a given sd range.
  
  I read the docs on match and which and the See Also but could not come up 
 with a way. I could loop with a stepping window over the sequence but that 
 would be limited to a fixed size window; I guess I could as well loop through all 
 window sizes from 1:length(data), but that is not elegant.  Any hints towards 
 a solution?
  
Presumably your array is a vector, in which case you can use
vectorisation to eliminate one of the loops.  The following function
produces a matrix in upper triangular form for which res[i, j] is the
variance of the sequence dat[i:j] (where dat is your array).  It is
(approximately) two orders of magnitude faster than the double loop,
but doesn't produce exactly the same numbers because of numerical
rounding, since it uses the formula:  variance = (sum(xi^2) -
n*xbar^2)/(n - 1)

Now all you have to do is test this matrix against the range of
variances you want to isolate, and use the matching indices to provide
you with the endpoints of your subsets.

Hope this helps,
Ray Brownrigg

allvar <- function(dat) {
  len <- length(dat)
  res <- numeric((len - 1)*len/2)
  mym <- dat[1]
  myv <- 0
  for (j in 2:len) {
    oldm <- mym
    mym <- (((j - 1):1)*mym + dat[j])/(j:2)
    myv <- (((j - 2):0)*myv + ((j - 1):1)*oldm^2 + dat[j]^2 - (j:2)*mym^2)/
      ((j - 1):1)
    res[((j - 2)*(j - 1)/2 + 1):((j - 1)*j/2)] <- myv
    mym <- c(mym, dat[j])
    myv <- c(myv, 0)
  }
  return(res)
}
# Now test it
> n <- 500; dat <- runif(n)
> unix.time({res2 <- matrix(0, n, n); res2[upper.tri(res2)] <- allvar(dat)})
[1] 0.64 0.00 0.64 0.00 0.00
> unix.time({res1 <- matrix(0, n, n); for (j in 2:n) for (i in 1:(j - 1))
+   res1[i, j] <- var(dat[i:j])})
[1] 72.64 13.74 86.38  0.00  0.00
> max(abs(res1 - res2))
[1] 1.249001e-15
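
To finish the thought, a sketch of the matching step (the bounds 0.05 and 0.10
are arbitrary placeholders for whatever variance range you are after):

 lo <- 0.05; hi <- 0.10
 idx <- which(res2 >= lo & res2 <= hi & upper.tri(res2), arr.ind = TRUE)
 colnames(idx) <- c("start", "end")   # each row gives one matching subsequence
 head(idx)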
 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Unbalanced Manova

2006-04-01 Thread Peter Dalgaard
Naiara S. Pinto [EMAIL PROTECTED] writes:

 Dear all,
 
 I need to do a Manova but I have an unbalanced design. I have
 morphological measurements similar to the iris dataset, but I don't have
 the same number of measurements for all species. Does anyone know a
 procedure to do Manova with this kind of input in R?
 
 Thank you very much,

If you have complete *responses*, the anova and summary methods for
mlm objects (from lm(Y ~ ...) where Y is a matrix) should do it.
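
For instance (a sketch only, reusing the unbalanced DF from Spencer Graves'
manova example earlier in this digest):

fit.mlm <- lm(cbind(tear, gloss, opacity) ~ rate * additive, data = DF[1:18, ])
anova(fit.mlm, test = "Wilks")   # sequential multivariate tests
summary(fit.mlm)                 # per-response univariate summaries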

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html