Re: [R] generalized hypergeometric function
On Sun, 2006-04-16 at 17:54 +, Ben Bolker wrote:

> Marc Schwartz <MSchwartz at mn.rr.com> writes:
>> On Sat, 2006-04-15 at 20:59 -0300, Bernardo Rangel tura wrote:
>>> Hi R-masters, I need to compute the generalized hypergeometric function. I looked on the R-project site and in the R-help list and found nothing about the generalized hypergeometric function. Is it possible to calculate the generalized hypergeometric function? Does somebody have a script for this?
>
> Note that he is looking for the h'geom FUNCTION, not DISTRIBUTION (e.g. http://mathworld.wolfram.com/GeneralizedHypergeometricFunction.html); Robin Hankin wrote some code (hypergeo in the Davies package on CRAN) to compute a particular Gaussian h'geom function, and was asking at one point on the mailing list whether anyone was interested in other code; I don't know whether it will be generalized enough for you.
>
> Ben Bolker

Thanks Ben. I stand corrected on that point. Didn't click in my initial reading.

Regards,

Marc

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] creating empty cells with table()
On Wed, 2006-04-19 at 09:21 -0400, Owen, Jason wrote:

> Hello, Suppose I simulate 20 observations from the Poisson distribution with lambda = 4. I can summarize these values using table() and then feed that to barplot() for a graph. Problem: if there are empty categories (e.g. 0) or empty categories within the range of the data (e.g. observations for 6, 7, 9), barplot() does not include the empty cells in the x-axis of the plot. Is there any way to specify table() to have specific categories (in the above example, probably 0:12) so that zeroes are included? Thanks, Jason

One thought comes to mind, which is based upon table()'s internal behavior, where it interprets the vectors passed as factors for the purpose of the [cross-]tabulation. Thus:

x <- rpois(20, 4)

x
 [1] 4 4 3 8 2 4 5 2 3 2 4 5 5 5 6 4 5 8 2 5

table(x)
x
2 3 4 5 6 8
4 2 5 6 1 2

# Add the desired factor levels
table(factor(x, levels = 0:12))

 0  1  2  3  4  5  6  7  8  9 10 11 12
 0  0  4  2  5  6  1  0  2  0  0  0  0

For the barplot:

barplot(table(factor(x, levels = 0:12)))

HTH,

Marc Schwartz
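As an aside (not part of the original exchange), the same zero-filled counts can also be produced without a factor via tabulate(); a minimal sketch, assuming the data are non-negative integers no larger than 12:

```
x <- rpois(20, 4)
counts <- tabulate(x + 1L, nbins = 13)  # tabulate() is 1-based, so shift by 1
names(counts) <- 0:12                   # counts for the values 0 through 12
barplot(counts)
```

The factor() approach above is generally preferable, since it also works for non-integer category labels.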
Re: [R] prop.table on three-way table?
The underlying count table is:

                   dim3_no1 dim3_no2 dim3_no3 dim3_no4 dim3_no5
dim1_no1 dim2_no1         0        0        0        0        0
         dim2_no3         0        1        1        0        0
         dim2_no4         0        0        0        0        0
         dim2_no5         0        0        0        0        0
dim1_no3 dim2_no1         0        0        0        0        0
         dim2_no3         0        0        0        0        0
         dim2_no4         0        0        2        0        1
         dim2_no5         0        0        1        0        0
dim1_no4 dim2_no1         0        1        0        0        0
         dim2_no3         0        0        0        0        0
         dim2_no4         1        0        0        1        0
         dim2_no5         0        0        0        0        0
dim1_no5 dim2_no1         0        0        0        0        0
         dim2_no3         1        0        0        0        0
         dim2_no4         0        0        0        0        0
         dim2_no5         0        0        0        0        0

and...the output of using ctab() is:

ctab(x, type = "row", percentages = FALSE)

                   dim3_no1 dim3_no2 dim3_no3 dim3_no4 dim3_no5
dim1_no1 dim2_no1       NaN      NaN      NaN      NaN      NaN
         dim2_no3      0.00     0.50     0.50     0.00     0.00
         dim2_no4       NaN      NaN      NaN      NaN      NaN
         dim2_no5       NaN      NaN      NaN      NaN      NaN
dim1_no3 dim2_no1       NaN      NaN      NaN      NaN      NaN
         dim2_no3       NaN      NaN      NaN      NaN      NaN
         dim2_no4      0.00     0.00     0.67     0.00     0.33
         dim2_no5      0.00     0.00     1.00     0.00     0.00
dim1_no4 dim2_no1      0.00     1.00     0.00     0.00     0.00
         dim2_no3       NaN      NaN      NaN      NaN      NaN
         dim2_no4      0.50     0.00     0.00     0.50     0.00
         dim2_no5       NaN      NaN      NaN      NaN      NaN
dim1_no5 dim2_no1       NaN      NaN      NaN      NaN      NaN
         dim2_no3      1.00     0.00     0.00     0.00     0.00
         dim2_no4       NaN      NaN      NaN      NaN      NaN
         dim2_no5       NaN      NaN      NaN      NaN      NaN

Note that for rows where the total is 0, you end up with NaN (Not a Number), as opposed to 0. Does that get you what you want?

HTH,

Marc Schwartz
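For reference, base R can produce the same row proportions without ctab(), using prop.table() over the first two margins; a minimal sketch with made-up data (the variable names and values are illustrative, not from the thread):

```
# a small 3-way table; the ("b", "q") row is deliberately empty
x <- table(
  dim1 = c("a", "a", "b"),
  dim2 = c("p", "q", "p"),
  dim3 = c("u", "u", "v")
)

prop.table(x, margin = c(1, 2))
# each (dim1, dim2) row sums to 1; the empty ("b", "q") row comes
# out as NaN, matching the ctab() behavior above
```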
Re: [R] x-axis tick mark labels running vertically
On Fri, 2004-05-07 at 07:02, Mohamed Abdolell wrote:

> I'm plotting obesity rates (y-axis) vs Public Health Unit (x-axis) for the province of Ontario and would like to have the Public Health Unit names appear vertically rather than the default, horizontally. I'm actually using the 'barplot2' function in the {gregmisc} library ... I haven't been able to find a solution in either the barplot2 options or the general plotting options. Any pointers would be appreciated. - Mohamed

You need to adjust par("las") to alter the orientation of the axis labels:

barplot2(1:4, names.arg = c("one", "two", "three", "four"), las = 2)

You can also use the axis() function separately, using the bar midpoints that barplot2() returns:

mp <- barplot2(1:4)
axis(1, at = mp, labels = c("one", "two", "three", "four"), las = 2)

Setting par(las = 2) rotates the axis labels so that they are perpendicular to the axis. See ?par for more information.

HTH,

Marc Schwartz
Re: [R] log Y scales for parplot
On Fri, 2004-05-14 at 12:11, Monica Palaseanu-Lovejoy wrote:

> Hi, I am doing a barplot, and the first bar is very big (high values) and the rest of the bars are quite small (small values). So - is there any way to make the Y scale logarithmic, so that I have a wider distance from 0 to 50, for example, than from 50 to 100, and so on? Thanks in advance for any help, Monica

Monica,

See the barplot2() function in the 'gregmisc' package on CRAN, which supports the use of log axis scaling. For example:

barplot2(c(5000, 50, 75, 100), log = "y")

Note that a log axis cannot actually start at 0, since log(0) is undefined.

HTH,

Marc Schwartz
Re: [R] UseR 2004 Proceeding
On Tue, 2004-05-25 at 09:11, Agustin Calatroni wrote:

> At the past conference (i.e. DSC 2003) the proceedings were available for download. This year the UseR website only has the abstracts of the papers. Does anybody know if the full text will be available for download? Thanks, Agustin Calatroni ;-)

According to an announcement made on Saturday, there will not be a proceedings volume. Some authors may post their complete presentation on their web sites (where available) or perhaps may be willing to e-mail their presentation in PDF format. You may be best served to contact the author(s) directly, if there is a particular paper you are interested in.

HTH,

Marc Schwartz
Re: [R] Opening help pages in new tab of existing Mozilla Firebird
On Wed, 2004-05-26 at 10:32, Kevin Wright wrote:

> Subject pretty much says it all. I currently have options()$browser set to open help pages in Mozilla Firebird, but it starts a new window each time and I would like a new 'tab' in an existing window. Sorry if this is obvious, but I can't find anything. Kevin Wright

You do not indicate which OS you are running, but under Linux, you can use a script such as the following. It will check the current process list to see if an instance of Firefox is already present. If so, it will open a new tab. Otherwise, it opens a new window.

#!/bin/sh
# if 'firefox-bin' is in the current ps listing, open a new tab
if ps -e | grep firefox-bin > /dev/null 2>&1; then
    firefox -remote "openURL(${1}, new-tab)"
    exit 0
else
    # open a new instance
    firefox "$1"
    exit 0
fi

Copy the above into a script file and set it to be executable (chmod +x ScriptFileName). Then set options()$browser to use the script file.

Note also that the recent version of the Mozilla standalone browser is called Firefox, in recognition of the existence of the Firebird (formerly Interbase) SQL database project.

HTH,

Marc Schwartz
Re: [R] gauss.hermite?
On Thu, 2004-05-27 at 23:35, Spencer Graves wrote:

> The search at www.r-project.org mentioned a function gauss.hermite{rmutil}. However, install.packages("rmutil") produced, 'No package rmutil on CRAN.' How can I find the current status of gauss.hermite and rmutil? Thanks, Spencer Graves

Spencer,

A Google search indicates that rmutil is listed on http://cran.r-project.org/other-software.html. There is a link at the bottom of the page to Jim Lindsey's site at:

http://popgen0146uns50.unimaas.nl/~jlindsey/rcode.html

There are links for rmutil.tgz and rmutil.zip below the middle of the page.

HTH,

Marc Schwartz
Re: [R] How to Describe R to Finance People
On Fri, 2004-06-04 at 09:19, Paul Gilbert wrote:

> Ko-Kang Kevin Wang wrote:
>> It is not only used by statisticians or scientists, but also econometricians and people in finance due to its cost (FREE) and its powerfulness.
>
> I think (FREE) will distract your intended audience from the real point. In a corporate environment, lots of people argue that free software actually costs more than commercial software because of internal support cost, etc, etc. These arguments will all hinge critically on the corporate IT support abilities. For R, I have never seen a convincing argument that it costs more, but the real point is that this is irrelevant. If it costs less, that is nice. If it costs more, then that is what you pay to use something that is better. If they need to, I think people in finance are generally willing to pay, so I think it is a mistake to put much emphasis on the cost. Put the emphasis on how good it is.

I agree that quality and value are important, but I think that the issue of cost should not be discounted out of hand. Value (for both company and client) is directly tied to cost.

Cost may be less of a concern for very large corporations to some extent, though certainly non-trivial as we continue to see companies finding ways to reduce their cost of operations as an important part of the strategy to improve profitability. Typically, this is done via reductions in personnel costs (i.e. layoffs, reductions in benefits, salary/wage cuts, etc.), but IT costs are surely a target, as is noted daily/weekly/monthly in various IT and business trade rags. IT costs are not just those associated with the initial purchase, but with ongoing operating costs as well.

I can speak from personal experience, as the President and Owner of a health care consulting business who has funded this operation with my own funds, that cost is a significant issue.
I do not have shareholders or private investors providing operating capital with the promise of future returns on their investments. Every dollar I spend has to be recovered via client billings or it comes out of my own pocket. This is not just important for me, but for my clients as well. The more I spend on the cost of doing business, the more that I would have to pass on to my clients to recoup those same costs. My ability to offer clients reasonable project fees is directly correlated to my underlying cost structure. There is a market driven threshold beyond which I could not pass those costs on to clients and still have clients willing to pay for services.

A product like SAS for example, which I had previously used for a number of years working for a larger medical software company, is no longer affordable to me as a small business owner. The last time that I checked, the annual licensing for a single user commercial license for Base, Stat and Graph was in the neighborhood of $5,000 U.S. **Per Year**. That is for _one person_. Calculate those costs for a larger staff... Even a product such as that other R-like commercial offering, while less costly than SAS, still adds to overhead. I would rather allocate the funds for that product, along with my time, to supporting the R Foundation and this community, to repay the value and benefit that I receive from it (which is nothing short of phenomenal).

The bottom line is that cost is a non-trivial issue. If a company is willing to pay more for a functionally equivalent product, because the training and support is (or is perceived to be) superior, so be it. That may enable managers and other decision makers to sleep better at night. I would however challenge the level of support provided by any commercial company to that which is provided by this community, given the depth and breadth of expertise present and the expedience with which communications take place here.

I use R. My company benefits from it.
My clients benefit from it. ...and I sleep just fine (when I do sleep)... :-)

Regards,

Marc Schwartz
Re: [R] Plot documentation; Axis documentation
On Fri, 2004-06-04 at 12:01, Glynn, Earl wrote:

> Why when I do a help(plot) do I not see anything about parameters such as xlim or ylim? As someone new to R, finding that xlim and ylim even existed wasn't all that easy. Even help.search("xlim") shows nothing once I know xlim exists. I'd like to change the default axes, but help(axis) isn't that informative about changing the frequency of ticks on the axes. Do people really refer to the x-axis as 1 and the y-axis as 2 as shown in help(axis)?
>
>   plot(1:4, rnorm(4), axes = FALSE)
>   axis(1, 1:4, LETTERS[1:4])
>   axis(2)
>
> I hadn't a clue what the 1 and 2 meant here without reading additional documentation. And where is the LETTERS constant defined and what else is defined there? Are there no common R constants defined somewhere so the axes can be defined symbolically? Perhaps AXIS_X = 1, AXIS_Y = 2 would be better than just 1 and 2:
>
>   plot(1:4, rnorm(4), axes = FALSE)
>   axis(AXIS_X, 1:4, LETTERS[1:4])
>   axis(AXIS_Y)
>
> This would at least provide a clue about what is going on here. Why is R such a graphics-rich language while the documentation is so lacking in graphics examples? Why can't the documentation include graphics too, so one can study code and graphics at the same time? How do I know the graphics I'm seeing is what it's supposed to look like? I'd rather do more in R than MatLab, but I find the R documentation somewhat lacking. I prefer not to read the R source code to find the answers. Thanks for any insight about this. efg

Reading the posting guide, for which there is a link at the bottom of each list e-mail, would be a good place to start. The section on Further Resources provides important links. Specifically on graphics:

1. Start by reading chapter 12 in "An Introduction to R", which covers graphics basics.

2. V&R's MASS also has an excellent chapter (4) on graphics.

3. There is also an article in the R News "R Help Desk" column (http://cran.r-project.org/doc/Rnews/Rnews_2003-2.pdf) that would likely be helpful as well.
Reviewing these resources should go a long way toward answering your questions. I think that you will find the documentation for R to be substantial, if you take the time to properly research it. The posting guide will help get you started in that endeavor. In most cases this obviates any need to review source code, though a critical advantage of R is the ability to do just that when you need to.

HTH,

Marc Schwartz
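On the symbolic-names question, nothing prevents you from defining such constants yourself; a minimal sketch (AXIS_X and AXIS_Y are the poster's proposed names, not part of R):

```
AXIS_X <- 1  # side 1 = bottom axis
AXIS_Y <- 2  # side 2 = left axis

plot(1:4, rnorm(4), axes = FALSE)
axis(AXIS_X, at = 1:4, labels = LETTERS[1:4])
axis(AXIS_Y)
```

As for where LETTERS is defined: it is one of R's built-in constants (see ?Constants), along with letters, month.name, month.abb and pi.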
Re: [R] How to Describe R to Finance People
On Fri, 2004-06-04 at 12:47, Tamas Papp wrote:

> On Fri, Jun 04, 2004 at 11:06:59AM -0500, Marc Schwartz wrote:
>> I agree that quality and value are important, but I think that the issue of cost should not be discounted out of hand. Value (for both company and client) is directly tied to cost.
> [...]
>> The bottom line is that cost is a non-trivial issue. If a company is willing to pay more for a functionally equivalent product, because the training and support is (or is perceived to be) superior, so be it. That may enable managers and other decision makers to sleep better at night.
>
> I agree with your points. However, as far as I remember, the original poster wants to give a 2-3 minute summary about the benefits of R. I would not open such a complicated issue (the total cost of ownership) in such a small timeframe, but focus on the more technical benefits instead.

Actually, Kevin indicated 1 to 2 minutes... ;-)

My response was clearly more detailed than what Kevin could incorporate into the presentation structure. It was more a response to Paul's comments on cost not being an issue. My experience indicates otherwise.

The TCO issue is clearly a subject of much debate, especially when fueled by widely disseminated studies funded (overtly or otherwise) by a certain large software company based in the northwestern U.S., which tends to introduce a certain 'a priori' bias.

I agree that you cannot adequately address the issue in such a short talk. However, I think that it can be raised briefly, with supporting comments, such as those made by Frank Harrell regarding reproducible analyses, which also support cost reduction via improvements in quality and productivity. Something that point-and-click based tools cannot offer effectively.

Thanks for raising the clarification, Tamas.

Marc
Re: [R] How to Describe R to Finance People
On Fri, 2004-06-04 at 14:26, Paul Gilbert wrote:

> Marc Schwartz wrote:
> snip
>> I agree that quality and value are important, but I think that the issue of cost should not be discounted out of hand. Value (for both company and client) is directly tied to cost.
> [snip ...]
>
> Marc
>
> I agree with this, and most of what you say. Cost is important in both large and small companies, and also in government. The point is really that total cost of ownership is a very complicated thing, and you should not get into it without the specifics of a particular company and situation in mind. For example, if the end user takes responsibility for all the support, the cost implications will be very different from the situation where the IT department needs to guarantee availability. Even in your own company you may have a very different attitude with respect to your research software and your accounts receivable software. People in finance making real-time market decisions will have a very different cost structure from academics in finance.

Paul,

Sorry for the delay in my reply. A sudden request came from a client this afternoon and I just finished the analysis.

I am in agreement with you that situational differences will bias the focus on costs and perhaps even the ability to define them well. Within the timeframe that Kevin and/or his associate has for this presentation, this topic cannot be adequately covered. That does not mean that you cannot raise it for further consideration by the audience within the context of pointing out R's strengths. I suspect that if you point out these issues, somebody with the right insight will have a light bulb go on relative to the possibility of cost savings versus their current set of tools. They will of course need to pursue that line of thinking outside of the scope of this presentation, which is fine.

I do use a commercial product for accounting (which is the only reason that I still dual-boot Windows and Fedora Core 2.)
My alternative is to send all of my paperwork to my accountant to do all of my ledger entries, which at $300 U.S. per hour is not inconsequential. So, yes, I make a cost-based decision to purchase a commercial OTS product that enables me to do the grunt work and send an electronic file to my accountant for review. My cost per unit of time is cheaper than my accountant's. If I could do the same thing with an open source product, hallelujah. It would save me even more. Unfortunately, as is oft discussed, this is one area in which the Linux world is still lacking. I suspect that will change in time, however.

On the other hand, I use a payroll services company to handle that part of the business. Their costs to run the payroll, pay taxes, worker's comp and all the rest of the associated procedures are cheaper than what I could do on my own. It is not that I could not do it technically, but that use of my time would in the long run cost me more money than what I pay for the service. Each component of the process does need to be evaluated within the context of the alternatives and appropriate risk/benefit considerations.

The point was really that many people are very sensitive to arguments about cost, and often have positions that they feel obliged to promote. So, as soon as you mention cost you are likely to get into a very long discussion that will not be fruitful unless you are prepared to talk about very particular situations. For this particular audience I think it would probably be more to the point to describe how good and reliable R is. A particular company may decide R is just too expensive. For example, some companies in finance are worried about real-time decision making. They may have to hire 10 more IT staff to guarantee 24/7 availability with no more than 5 minutes outage per week whereas, with commercial software, they may be able to buy guarantees.
(This is not a statement about commercial software being more reliable; it is a statement about being able to buy insurance.) But, as you say, for most of us R is a real bargain.

There is no doubt that folks are willing to spend more in some cases for a piece of paper that guarantees availability and perhaps some form of compensation for downtime, which in these situations would likely result in lost revenue. As long as clients are willing to pay for those features and/or companies determine that business imperatives warrant such expenditures...

I think that your closing point relative to R's reliability is important. This goes to the old saying, "Facts are negotiable, perception is reality." Why would R be more or less available or reliable than another analytic tool? More often than not, I suspect that it is not the application but the underlying infrastructure that results in reduced availability. The perception issue may be the biggest hurdle that the open source world needs to (and will) overcome in the commercial marketplace. Anyway, I think
Re: [R] error during make of R-patched on Fedora core 2
On Mon, 2004-06-07 at 15:08, Gavin Simpson wrote:

Dear list, I've just upgraded to Fedora Core 2 and seeing as there wasn't an rpm for this OS on CRAN yet, I thought it was about time I had a go at compiling R myself. Having run into the X11 problem, I switched to trying to install R-patched. I followed the instructions in the R Installation and Administration manual to download the sources of the Recommended packages and place the files in R_HOME/src/library/Recommended. ./configure worked fine so I progressed to make, which has hit upon this error when the process arrived at the Recommended packages:

make[2]: Leaving directory `/home/gavin/tmp/R-patched/src/library'
make[2]: Entering directory `/home/gavin/tmp/R-patched/src/library'
building/updating vignettes for package 'grid' ...
make[2]: Leaving directory `/home/gavin/tmp/R-patched/src/library'
make[2]: Entering directory `/home/gavin/tmp/R-patched/src/library'
make[2]: Leaving directory `/home/gavin/tmp/R-patched/src/library'
make[1]: Leaving directory `/home/gavin/tmp/R-patched/src/library'
make[1]: Entering directory `/home/gavin/tmp/R-patched/src/library/Recommended'
make[2]: Entering directory `/home/gavin/tmp/R-patched/src/library/Recommended'
make[2]: Leaving directory `/home/gavin/tmp/R-patched/src/library/Recommended'
make[2]: Entering directory `/home/gavin/tmp/R-patched/src/library/Recommended'
make[2]: *** No rule to make target `survival.tgz', needed by `stamp-recommended'.  Stop.
make[2]: Leaving directory `/home/gavin/tmp/R-patched/src/library/Recommended'
make[1]: *** [recommended-packages] Error 2
make[1]: Leaving directory `/home/gavin/tmp/R-patched/src/library/Recommended'
make: *** [stamp-recommended] Error 2

Being a relative newbie to Linux I have no idea how to continue to solve this issue :-( The only difference I can see between the /src/library/Recommended directories of R-1.9.0 and R-patched is that in R-1.9.0 it contains links to each of the tar.gz (excluding the version info) as well as the tar.gz themselves for each of the packages. Is this in some way related to my problem? If anyone can help me solve this issue I'd be most grateful. Thanks in advance, Gavin

You might want to try the following commands using rsync as an alternative to downloading the tarball:

rsync -rCv rsync.r-project.org::r-patched R-patched
./R-patched/tools/rsync-recommended
cd R-patched
./configure
make

I actually have the above in a script file that I can just run quickly, when I want to update the code. I am now running FC2, so if you have any problems, drop me a line.

Best regards,

Marc Schwartz
Re: [R] error during make of R-patched on Fedora core 2
On Mon, 2004-06-07 at 15:51, Gavin Simpson wrote: snip

> Thanks Roger and Marc, for suggesting I use ./tools/rsync-recommended from within the R-patched directory. This seems to have done the trick, as make completed without errors this time round. The Recommended directory also contained the links to the actual tar.gz files after doing the rsync command, so I guess this was the problem (or at least related to it.) I'm off home now with the laptop to see if I can finish make check-all and make install of R.
>
> I have re-read the section describing the installation process for R-patched or R-devel in the R Installation and Administration manual (from R 1.9.0) just in case I missed something. Section 1.2 of this manual indicates that one can proceed *either* by downloading R-patched and then the Recommended packages from CRAN and placing the tar.gz files in R_HOME/src/library/Recommended, or by using rsync to download R-patched, and then to get the Recommended packages. The two are quite separately documented in the manual, and do seem to be in disagreement with the R-sources page on the CRAN website, which doesn't mention the manual download method (for Recommended) at all. Is there something wrong with the current Recommended files on CRAN, or is the section in the R Installation Admin manual out-of-date or in error, or am I missing something vital here? This isn't a complaint: I'm just pointing this out in case this is something that needs updating in the documentation. All the best, Gavin

Perhaps I am being dense, but in reviewing the two documents (R Admin and the CRAN sources page), I think that the only thing lacking is a description on the CRAN page of the manual download option for the Rec packages. You would need to go here now for 1.9.1 Alpha/Beta, which is where the current r-patched is:

http://www.cran.mirrors.pair.com/src/contrib/1.9.1/Recommended/

The standard links on CRAN are for the current 'released' version, which is still 1.9.0 for the moment.
Procedurally, I think that the rsync approach is substantially easier (one step instead of multiple downloads) and certainly less error prone. Also, the ./tools/rsync-recommended script is set up to pick up the proper package versions, which helps to avoid conflicts.

HTH,

Marc
Re: [R] error during make of R-patched on Fedora core 2
On Tue, 2004-06-08 at 06:23, Prof Brian Ripley wrote: On Tue, 8 Jun 2004, Gavin Simpson wrote: Marc Schwartz wrote: On Mon, 2004-06-07 at 15:51, Gavin Simpson wrote: snip snip

Perhaps I am being dense, but in reviewing the two documents (R Admin and the CRAN sources page), I think that the only thing lacking is a description on the CRAN page of the manual download option for the Rec packages. snip

Yes, but having downloaded the contents of that directory (as VERSION indicated that R-patched was 1.9.1 alpha), the links to the source files for the Recommended packages are not present (obviously). And make doesn't seem to work without these links. The rsync approach places the package sources *and* the links in the correct directory.

Yep. I was being dense. Missed the symlink part of the process. My error. I also missed the Venus transit this morning due to clouds... :-(

So the instructions in the Admin manual are lacking a statement that you need to create links to each of the package sources in the form name-of-package.tgz, which links to name-of-package_version.tar.gz. As it stands, the instructions in the Installation Admin manual are not sufficient to get the manual download method to work.

You need to run tools/link-recommended. I've added that to R-admin.

Should Fritz also add that to the CRAN 'R Sources' page so that both locations are in sync procedurally?

Procedurally, I think that the rsync approach is substantially easier (one step instead of multiple downloads) and certainly less error prone. Also the ./tools/rsync-recommended script is set up to pick up the proper package versions, which also helps to avoid conflicts.

I agree - being a bit of a Linux newbie, I hadn't used rsync before. Seeing how easy it was to use this method of getting the required sources, I will be using this method in future.

rsync is great, *provided* you have permission to use the ports it uses.
Users with http proxies often do not, hence the description of the manual method. During alpha/beta periods, we do make a complete tarball available, and I wonder if we should not be doing so with R-patched/R-devel at all times.

Good point on rsync. Perhaps another option to consider/suggest (though it might complicate things) is to use wget. Since wget supports proxy servers, etc. and can use http, it might be an alternative for folks. The wget command syntax (assuming that your working dir is the main R source dir) would be:

wget -r -l1 --no-parent -A "*.gz" -nd -P src/library/Recommended http://www.cran.mirrors.pair.com/src/contrib/1.9.1/Recommended

The above should be on one line, but of course will wrap here.

The above will copy the tar files (-A "*.gz") from the server (-r -l1 --no-parent) to the appropriate 'Recommended' directory (-P), without recreating the source server's tree (-nd). One could refer the reader to 'man wget' or http://www.gnu.org/software/wget/wget.html for further information on how to use wget behind proxies and related issues.

You would then of course run the ./tools/link-recommended script to create the symlinks, followed by ./configure and make.

HTH,

Marc
Re: [R] error during make of R-patched on Fedora core 2
On Tue, 2004-06-08 at 12:40, Peter Dalgaard wrote:

> Marc Schwartz [EMAIL PROTECTED] writes:
>
>> wget -r -l1 --no-parent -A "*.gz" -nd -P src/library/Recommended http://www.cran.mirrors.pair.com/src/contrib/1.9.1/Recommended
>>
>> The above should be on one line, but of course will wrap here.
>
> Kids these days... Make that
>
> wget -r -l1 --no-parent -A "*.gz" -nd -P src/library/Recommended \
>     http://www.cran.mirrors.pair.com/src/contrib/1.9.1/Recommended

LOL

Thanks Dad ;-)

Marc
Re: [R] fighting with ps.options and xlim/ylim
On Tue, 2004-06-08 at 20:18, ivo welch wrote:

> thank you, marc. I will play around with these parameters tomorrow at my real computer. yes, the idea is to just create an .eps and .pdf file, which is then \includegraphics[width=0.25\textwidth]{} in pdflatex. I need to tweak the ps.options() pointsize parameter because otherwise I end up with 5pt fonts---which is not readable. And once I do this, I need different R parameter defaults on the axes. With the advice I have gotten, I think I am all set now. However, I am a little bit surprised that no one has written a package around this task---there must be many people that have to produce quarter-page (or half-page) graphics, and probably everyone is tweaking plot parameters a bit differently. It would be nice to build some of this intelligence into the plot parameters themselves. of course, R is a free volunteer effort, and I am grateful for all the stuff that has been done already. /iaw

You might want to try to set the 'height' and 'width' arguments for postscript() to something larger than the defaults. For example, use 6 x 6 (if square) and then use your code above to scale the plot down to size. That might help with your font size and spacing problem, rather than adjusting the point size.

I don't have a 'rule of thumb', but experience suggests that downsizing a plot that is too big is better than upsizing one that is too small, especially for a partial page.

I have done some other things using the 'seminar' LaTeX package for landscape orientation slides and there I generally use the exact size for the EPS files. But that is generally the only time that I do that.

YMMV,

Marc
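Marc's suggestion can be sketched as follows; the file name and plot are illustrative only:

```
# write a generous 6in x 6in EPS file and let LaTeX scale it down,
# rather than shrinking the pointsize in R
postscript("fig.eps", width = 6, height = 6,
           horizontal = FALSE, onefile = FALSE, paper = "special")
plot(1:10, rnorm(10))
dev.off()
```

On the LaTeX side, the figure would then be included with something like \includegraphics[width=0.25\textwidth]{fig}, letting LaTeX do the shrinking.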
Re: [R] fighting with ps.options and xlim/ylim
On Tue, 2004-06-08 at 21:02, Duncan Murdoch wrote: On Tue, 08 Jun 2004 21:18:34 -0400, ivo welch [EMAIL PROTECTED] wrote: And once I do this, I need different R parameter defaults on the axes. With the advice I have gotten, I think I am all set now. However, I am a little bit surprised that no one has written a package around this task---there must be many people that have to produce quarter-page (or half-page) graphics, and probably everyone is tweaking plot parameters a bit differently. My general strategy for this is to change the width and height used in the pdf() or postscript() device call, then just trust the defaults chosen by R. For inclusion in a paper, I generally specify sizes about twice as big as I really want, and get text size similar to the printed text. So in your case, assuming a page is around 6 inches wide, I'd use something like pdf(width=3, height=3, ...) and then get LaTeX to shrink it to half the size. Duncan Murdoch I just got Duncan's msg, so I think that we are thinking along the same lines here. I agree with Duncan's suggestion relative to trying a 2x scaling factor and would see how that goes with your particular plot. Then adjust if need be as you develop some intuition. Marc __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
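The 2x-size approach described above can be sketched concretely; the file name and the toy plot below are invented for illustration:

```r
## Sketch of the 2x-size approach: generate the figure at twice the
## intended final size and let LaTeX shrink it. "fig1.pdf" and the
## plotted data are made up for illustration.
pdf("fig1.pdf", width = 3, height = 3)
plot(1:10, (1:10)^2, xlab = "x", ylab = "y")
dev.off()
## Then in the LaTeX source, shrink to a quarter of the page width:
##   \includegraphics[width=0.25\textwidth]{fig1}
```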
Re: [R] Re: fighting with ps.options and xlim/ylim
On Wed, 2004-06-09 at 09:30, Uwe Ligges wrote: ivo welch wrote: Thanks again for all the messages. Is the 4% in par('usr') hardcoded? if so, may I suggest making this a user-changeable parameter for x and y axis? See ?par and its arguments xaxp, yaxp which can be set to "i". Quick correction. That should be xaxs and yaxs. See my initial reply. xaxp and yaxp are for the positions of the tick marks. I looked at psfrag, and it seems like a great package. alas, I have switched to pdflatex, and pdffrag does not exist. :-( One option to point out, is that if the functionality in psfrag is important to you, you can use 'ps2pdf' to convert a ps file to a pdf file. ps2pdf filters the ps file through ghostscript to create the pdf file. It means a three step process (latex, dvips and ps2pdf), but it can provide additional functionality that pdflatex does not support, such as the use of \special as in the package 'pstricks'. pdf does not have any programming language functionality as does postscript, so there are some tradeoffs and likely why there is no pdffrag. Food for thought. I also discovered that there is a pdf device now. neat. Since R-1.3.0, as the News file tells us. Uwe Ligges HTH, Marc __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] displaying a table in full vs. typing variable name
On Thu, 2004-06-10 at 11:26, Uwe Ligges wrote: Rishi Ganti wrote: I have a data frame called totaldata that is 10,000 rows by about 9 columns. If "about 9" equals 2, the behaviour reported below is expected. That is, of course, for sufficiently large values of "about"... ;-) Marc __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] import SYSTAT .syd file?
On Wed, 2004-06-16 at 10:32, Anne York wrote: On Tue, 15 Jun 2004 Jonathan Baron [EMAIL PROTECTED] wrote: Does anyone know how to read a SYSTAT .syd file on Linux? (Splus 6 does it, but it is easier to find a Windows box with Systat than to download their demo. I'm wondering if there is a better way than either of these options.) Jon The commercial package dbmscopy has a Linux version. I have used dbmscopy for several years and have been happy with it as it converts data files among many spreadsheets and statistics programs. http://www.conceptual.com/dbmscopt.htm However, somewhat recently they were purchased by SAS, so I'm not sure of current state of the program. There are probably other commercial packages as well. Anne Hi Jon and Anne! One other commercial product to check out is Stat/Transfer. More information on supported formats is at: http://www.stattransfer.com/html/formats.html They do support Windows, MacOS and Unix/Linux. Demo downloads are available from: http://www.stattransfer.com/html/download.html Unix/Linux pricing is available at: http://www.stattransfer.com/html/prices_-_unix.html. HTH, Marc Schwartz __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] R help in Firefox on Windows XP
On Thu, 2004-06-17 at 12:06, Erich Neuwirth wrote: I had to reinstall my machine, so I installed Firefox 0.9 as browser I am using WinXP and R 1.9.1 beta. Now search in R html help does not work. I checked that the Java VM is working correctly, Sun's test site says my installation is OK. Firefox also tells me that "Applet Searchengine loaded" "Applet Searchengine started" it just does not find anything. Does anybody know how to solve this? Erich Erich, Do you also have JavaScript enabled in the Firefox Tools - Options settings? Both Java and JavaScript need to be enabled for the help.start() search engine to function properly. I reviewed the release notes at http://www.mozilla.org/products/firefox/releases/ and did not see anything relating to Java there, as had been the case with prior releases. The message that you are getting on the status line suggests that the R search applet is being found and properly enabled, which is typically the primary source of problems. Check the above and let us know. Marc Schwartz __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Grouped AND stacked bar charts possible in R?
On Tue, 2004-06-22 at 10:54, Patrick Lenon wrote: Good day all, My statisticians want an R procedure that will produce grouped stacked barplots. Barplot will stack or group, but not both. The ftable function can produce a table of the exact form they want, but the barplot doesn't show all the divisions we want. For an example, here's the sample from the help file for ftable:

data(Titanic)
ftable(Titanic, row.vars = 1:3)
ftable(Titanic, row.vars = 1:2, col.vars = "Survived")
ftable(Titanic, row.vars = 2:1, col.vars = "Survived")

Now take it a step further to try to add another dimension:

b <- ftable(Titanic, row.vars = 1:3)

                   Survived  No Yes
Class Sex    Age
1st   Male   Child            0   5
             Adult          118  57
      Female Child            0   1
             Adult            4 140
2nd   Male   Child            0  11
             Adult          154  14
      Female Child            0  13
             Adult           13  80
3rd   Male   Child           35  13
             Adult          387  75
      Female Child           17  14
             Adult           89  76
Crew  Male   Child            0   0
             Adult          670 192
      Female Child            0   0
             Adult            3  20

barplot(b)
barplot(b, beside = TRUE)

Neither resulting barplot is satisfactory. The first stacks all the subdivisions of Survived = Yes and Survived = No together. The second is closer because it creates two groups, but it lists combinations side-by-side that we'd like stacked. In the above example No and Yes would be stacked on bars labeled Male or Female in groups by Class. I've taken a look through the R-Help archives and looked through the contributed packages, but haven't found anything yet. If you have any thoughts how we might produce groups of stacked bars from an ftable, we would appreciate it. I think that you are trying to plot too much information in a single graphic. The result of a multi-dimensional barplot is likely to be very difficult to interpret visually. You would likely be better served to determine, within the multiple dimensions, what your conditioning and grouping dimensions need to be and then consider a lattice based plot. 
I would urge you to consider using either barchart() or perhaps dotplot() in lattice, which are designed to handle multivariable charts of this nature. Use:

library(lattice)

Then for general information ?Lattice and then ?barchart for more function specific information and examples of graphics with each function. For the Titanic data that you have above, you could do something like:

# Convert the multi-dimensional table to a
# data frame. Assumes you have already done
# data(Titanic)
MyData <- as.data.frame(Titanic)

# Take a look at the structure
MyData
   Class    Sex   Age Survived Freq
1    1st   Male Child       No    0
2    2nd   Male Child       No    0
3    3rd   Male Child       No   35
4   Crew   Male Child       No    0
5    1st Female Child       No    0
6    2nd Female Child       No    0
7    3rd Female Child       No   17
8   Crew Female Child       No    0
9    1st   Male Adult       No  118
10   2nd   Male Adult       No  154
11   3rd   Male Adult       No  387
12  Crew   Male Adult       No  670
13   1st Female Adult       No    4
14   2nd Female Adult       No   13
15   3rd Female Adult       No   89
16  Crew Female Adult       No    3
17   1st   Male Child      Yes    5
18   2nd   Male Child      Yes   11
19   3rd   Male Child      Yes   13
20  Crew   Male Child      Yes    0
21   1st Female Child      Yes    1
22   2nd Female Child      Yes   13
23   3rd Female Child      Yes   14
24  Crew Female Child      Yes    0
25   1st   Male Adult      Yes   57
26   2nd   Male Adult      Yes   14
27   3rd   Male Adult      Yes   75
28  Crew   Male Adult      Yes  192
29   1st Female Adult      Yes  140
30   2nd Female Adult      Yes   80
31   3rd Female Adult      Yes   76
32  Crew Female Adult      Yes   20

# Now do a plot. Use 'library(lattice)' here first
# if you had not already done so above for help.
barchart(Freq ~ Survived | Age * Sex, groups = Class, data = MyData,
         auto.key = list(points = FALSE, rectangles = TRUE, space = "right",
                         title = "Class", border = TRUE),
         xlab = "Survived", ylim = c(0, 800))

The above barchart will create a four panel plot, where the four main panels will contain the combinations of Sex and Age. Within each panel will be two groups of bars, one each for the Survived Yes/No status. Within each group will be one bar for each Class. 
That is one quick way of grouping things, but you can alter that and other plot attributes easily. HTH, Marc Schwartz __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
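The dotplot() alternative mentioned in the reply above uses the same formula interface; a minimal sketch (the key settings here are illustrative only):

```r
## The dotplot() variant, with the same formula interface as the barchart
## shown above; auto.key settings are illustrative, not prescriptive.
library(lattice)
MyData <- as.data.frame(Titanic)
p <- dotplot(Freq ~ Survived | Age * Sex, groups = Class, data = MyData,
             auto.key = list(space = "right", title = "Class"))
print(p)  # lattice objects must be printed explicitly inside scripts
```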
Re: [R] Covered Labels
On Wed, 2004-06-23 at 09:06, Martina Renninger wrote: Dear All! How can I cope with overlapping or covered labels (covered by labels from other data points) in plots? Presuming that you are using text() to identify points in a plot, you can use the 'cex' argument (which defaults to 1) to reduce the size of the font. So in this case, try values < 1, for example:

text(x, y, labels = "YourText", cex = 0.8)

Possibly depending upon how many points you have, you can also adjust the position of the label with respect to the data points by using 'adj', 'pos' and 'offset'. See ?text for more information. HTH, Marc Schwartz __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
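The 'cex', 'pos' and 'offset' arguments mentioned above can be combined; a small sketch (the data and labels are simulated, not from the original poster, and the output file name is invented so the example runs non-interactively):

```r
## Sketch of cex + pos + offset together, on simulated points.
pdf("labels.pdf")
set.seed(42)
x <- runif(10); y <- runif(10)
plot(x, y)
## shrink the labels and place them above each point (pos = 3),
## nudged away from the symbol by 0.3 character widths
text(x, y, labels = paste("pt", 1:10, sep = ""),
     cex = 0.7, pos = 3, offset = 0.3)
dev.off()
```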
Re: [R] direction of axes of plot
On Sun, 2004-06-27 at 18:24, XIAO LIU wrote: R users: I want X-Y plotting with axes in reverse direction such as (0, -1, -2, -3, ...). How can I do it? Thanks in advance Xiao If I am understanding what you want, the following should give you an example:

# Create x and y with negative values
x <- -1:-10
y <- -1:-10

# Show regular plot
plot(x, y)

# Now plot using -x and -y
# Do not plot the axes or annotation
plot(-x, -y, axes = FALSE, ann = FALSE)

# Now label both x and y axes with negative
# labels. Use pretty() to get standard tick mark locations
# and use rev() to create tick mark labels in reverse order
axis(1, at = pretty(-x), labels = rev(pretty(x)))
axis(2, at = pretty(-y), labels = rev(pretty(y)))

HTH, Marc Schwartz __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
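An alternative worth noting: if the goal is simply an axis that runs from high to low, reversing the axis limits is enough. A sketch with toy data, written to a file so it runs non-interactively (the file name is invented):

```r
## Reversed axis via flipped limits, on toy data.
pdf("revaxis.pdf")
x <- 0:10
y <- x^2
plot(x, y, xlim = rev(range(x)))  # x axis now runs 10 ... 0
dev.off()
```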
Re: [R] priceIts problem
On Thu, 2004-07-01 at 19:02, Erin Hodgess wrote: Dear R People: In library(its), there is a command priceIts. There is a problem with this command. It is returning an error message:

ibm1 <- priceIts(instrument = "ibm", start = "1998-01-01", quote = "Open")
Error in download.file(url, destfile, method = method, quiet = quiet) :
        cannot open URL `http://chart.yahoo.com/table.csv?s=ibm&a=0&b=01&c=1998&d=5&e=30&f=2004&g=d&q=q&y=0&z=ibm&x=.csv'
In addition: Warning message:
cannot open: HTTP status was `404 Not Found'

This has been working fine until tonight. Has anyone else seen this, please? thanks in advance! It would appear that the URL at Yahoo has changed. If you try your URL in a browser, you get the same 404 msg. Going to the page for securing an IBM quote: http://finance.yahoo.com/q/hp?s=IBM&a=00&b=1&c=1998&d=05&e=30&f=2004&g=d The URL towards the bottom of the page for the CSV download is: http://ichart.yahoo.com/table.csv?s=IBM&a=00&b=1&c=1998&d=05&e=30&f=2004&g=d&ignore=.csv Note the 'ichart' as opposed to 'chart' in your error msg above. A quick review of the R source in the 'its' package suggests that the base URL for Yahoo is hard coded in the priceIts() function and the 'provider' argument is not yet used. I have copied Heywood Giles on this reply as an FYI and for confirmation. A short term workaround would be to edit the function's code using fix(priceIts) and change the base URL in the function body as indicated above. That seems to work for me with a quick check. HTH, Marc Schwartz __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] priceIts problem
On Thu, 2004-07-01 at 19:26, Marc Schwartz wrote: On Thu, 2004-07-01 at 19:02, Erin Hodgess wrote: Dear R People: In library(its), there is a command priceIts. There is a problem with this command. It is returning an error message:

ibm1 <- priceIts(instrument = "ibm", start = "1998-01-01", quote = "Open")
Error in download.file(url, destfile, method = method, quiet = quiet) :
        cannot open URL `http://chart.yahoo.com/table.csv?s=ibm&a=0&b=01&c=1998&d=5&e=30&f=2004&g=d&q=q&y=0&z=ibm&x=.csv'
In addition: Warning message:
cannot open: HTTP status was `404 Not Found'

This has been working fine until tonight. Has anyone else seen this, please? thanks in advance! <snip> I have copied Heywood Giles on this reply as an FYI and for confirmation. Apologies. That should be Giles Heywood. Marc __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Vertical text in plot
On Fri, 2004-07-02 at 12:45, Wolski wrote: Hallo! Would like to add vertical text labels to a histogram. Was trying with las but without success. I am using the standard histogram. This is what I was trying.

hist(resS2$sam, breaks = seq(0, 1, 0.01), col = 3, border = 0, freq = F, add = T, xlim = c(0, 1))
text(quantile(resS2$dif, 0.005), 5, "0.5% FP rate", pos = 2, cex = 0.6, las = 2)

Thanks in advance. Eryk Hi Eryk! Try using 'srt' instead of 'las', which is for the axis labels. For example:

text(quantile(resS2$dif, 0.005), 5, "0.5% FP rate", pos = 2, cex = 0.6, srt = 90)

See ?par for more information. HTH, Marc Schwartz __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] plotting many line segments in different colors
On Fri, 2004-07-02 at 16:33, rif wrote: I want to plot a large number of line segments, with a color associated with each line segment (I'm actually plotting a function of the edges of a 2d graph, and I want to use color to indicate the level of the function.) I originally thought I could use lines, but lines puts all its lines in one color (from help(lines), col: color to use. This can be vector of length greater than one, but only the first value will be used.). Is there a function that does what I want? Right now I'm using the obvious solution of calling lines in a loop with a single segment, but this is really quite slow for my purposes, as I have several thousand lines total to plot. Take a look at ?matplot or ?matlines depending upon which one might make sense for your particular application. Both functions are on the same help page. HTH, Marc Schwartz __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
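A further option the thread does not mention: segments() accepts vectors for its endpoints and for 'col', so thousands of coloured segments can be drawn in a single call, avoiding the slow loop. A sketch with simulated edges and function values (the file name is invented so the example runs non-interactively):

```r
## Thousands of per-colour line segments in one vectorized segments() call;
## the edge endpoints and the function on the edges are simulated.
pdf("edges.pdf")
set.seed(1)
n <- 5000
x0 <- runif(n); y0 <- runif(n)
x1 <- runif(n); y1 <- runif(n)
f <- sqrt((x1 - x0)^2 + (y1 - y0)^2)            # function value per edge
cols <- heat.colors(100)[cut(f, 100, labels = FALSE)]  # map level to colour
plot(0:1, 0:1, type = "n", xlab = "", ylab = "")
segments(x0, y0, x1, y1, col = cols)            # one call, per-segment colours
dev.off()
```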
Re: [R] counting the occurrences of vectors
On Sat, 2004-07-03 at 09:31, Ravi Varadhan wrote: Hi: I have two matrices, A and B, where A is n x k, and B is m x k, where n > m > k. Is there a computationally fast way to count the number of times each row (a k-vector) of B occurs in A? Thanks for any suggestions. Best, Ravi. How about something like this:

row.match <- function(m1, m2) {
  if (ncol(m1) != ncol(m2))
    stop("Matrices must have the same number of columns")

  m1.l <- apply(m1, 1, list)
  m2.l <- apply(m2, 1, list)

  # return boolean for m1.l in m2.l
  m1.l %in% m2.l
}

Example of use:

m <- matrix(1:20, ncol = 4, byrow = TRUE)
n <- matrix(1:40, ncol = 4, byrow = TRUE)

m
     [,1] [,2] [,3] [,4]
[1,]    1    2    3    4
[2,]    5    6    7    8
[3,]    9   10   11   12
[4,]   13   14   15   16
[5,]   17   18   19   20

n
      [,1] [,2] [,3] [,4]
 [1,]    1    2    3    4
 [2,]    5    6    7    8
 [3,]    9   10   11   12
 [4,]   13   14   15   16
 [5,]   17   18   19   20
 [6,]   21   22   23   24
 [7,]   25   26   27   28
 [8,]   29   30   31   32
 [9,]   33   34   35   36
[10,]   37   38   39   40

row.match(n, m)
 [1]  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE

If you want to know which rows from n are matches:

n[row.match(n, m), ]
     [,1] [,2] [,3] [,4]
[1,]    1    2    3    4
[2,]    5    6    7    8
[3,]    9   10   11   12
[4,]   13   14   15   16
[5,]   17   18   19   20

and if you just want the indices from n:

which(row.match(n, m))
[1] 1 2 3 4 5

For timing, if I create some large matrices:

m <- matrix(1:20000, ncol = 4, byrow = TRUE)
nrow(m)
[1] 5000

n <- matrix(1:40000, ncol = 4, byrow = TRUE)
nrow(n)
[1] 10000

system.time(row.match(n, m))
[1] 0.39 0.01 0.41 0.00 0.00

length(row.match(n, m))
[1] 10000

Does that get you what you want? HTH, Marc Schwartz __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Outliers
On Sun, 2004-07-04 at 19:41, Richard A. O'Keefe wrote: Last week there was a thread on outlier detection. I came across an article which has a very interesting paragraph. The article is "Missing Values, Outliers, Robust Statistics, Non-parametric Methods" by Shaun Burke, RHM Technology Ltd, High Wycombe, Buckinghamshire, UK. It was the fourth article in a series which appeared in Scientific Data Management in 1998 and 1998. The very interesting paragraph is this: NB: It should be noted that following a judgement in a US court, the Food and Drug Administration (FDA) in a guide - "Guide to inspection of pharmaceutical quality control laboratories" - has specifically prohibited the use of outlier tests. Elsewhere, the article recommends the use of outlier tests as a way of locating possible transcription errors, but NOT as a way of discarding data. The FDA Guide referred to in that article is here: http://www.fda.gov/ora/inspect_ref/igs/pharm.html If you search that page using the keyword 'outlier' you will note several references. The part of the document relevant to the above citation is: In a recent court decision the judge used the term out-of-specification (OOS) laboratory result rather than the term product failure which is more common to FDA investigators and analysts. He ruled that an OOS result identified as a laboratory error by a failure investigation or an outlier test. The court provided explicit limitations on the use of outlier tests and these are discussed in a later segment of this document., or overcome by retesting. The court ruled on the use of retesting which is covered in a later segment of this document. is not a product failure. Some of the above and elsewhere in the document, relative to grammar and punctuation, suggests that the HTML page was converted from another format, perhaps Word or PDF. Some things do not quite make sense, but you can get the basic idea. Note also the use of the word 'limitation' above rather than 'prohibited'. 
HTH, Marc Schwartz __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] density(x)
On Mon, 2004-07-05 at 08:34, Christoph Hanck wrote: Dear experts, when trying to estimate a kernel density function with density(x) I get the following error message with imported data from either EXCEL or text files: Error in density(spr) : argument must be numeric. Other procedures such as truehist work. If I generate data within R density works fine. Does anybody have an idea? More than likely, your vector 'spr' was imported as a factor. This would possibly suggest that at least one value in 'spr' is not numeric. If the entire vector was numeric, this would not be a problem. It is also possible that you may have not specified the proper delimiting character during the import, which would compromise the parsed structure of the incoming data. Use:

str(spr)

and you will probably get "Factor ..." First, check to be sure that you have used the proper delimiting character during your import. See ?read.table for the family of related functions and the default argument values for 'sep', which is the delimiting character. You should also check your source data file, since it may be problematic. HTH, Marc Schwartz __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
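The factor-import failure mode described above, and its repair, can be sketched in a few lines; the stray value "oops" is invented for illustration:

```r
## One non-numeric entry turns the whole imported column into a factor.
spr <- factor(c("1.2", "3.4", "oops", "5.6"))
## as.numeric(spr) would return the internal level codes, not the values;
## convert via as.character() instead -- non-numbers become NA (with a warning)
vals <- suppressWarnings(as.numeric(as.character(spr)))
d <- density(vals[!is.na(vals)])  # density() now accepts the numeric vector
```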
Re: [R] Function for skewness
On Mon, 2004-07-05 at 09:49, Ernesto Jardim wrote: Hi, Is there a function to estimate the skewness of a distribution ? Thanks EJ See skewness() in CRAN package 'e1071'. HTH, Marc Schwartz __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
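If installing a package is not convenient, the plain moment-based estimator is short to write by hand; note that e1071's skewness() offers several 'type' options whose small-sample corrections differ slightly from this bare version:

```r
## Hand-rolled moment-based skewness (no bias correction); a sketch, not a
## drop-in replacement for e1071::skewness().
skew <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

skew(c(rep(0, 9), 10))  # right-skewed sample, so the result is positive
```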
Re: [R] density(x)
On Mon, 2004-07-05 at 09:41, Christoph Hanck wrote: Hello and thanks for your reply Hopefully, my answer arrives at the correct place like that (if not, I am sorry for bothering you, but please let me know...) To sum up my procedure (sp is exactly the same thing as spr, I had just tinkered with the names while trying sth. to solve this problem)

sp <- read.table("c:/ratsdata/sp3.txt", col.names = "sp")
xd <- density(sp)
Error in density(sp) : argument must be numeric

The suggested remedies yield the following

str(sp)
`data.frame': 195 obs. of 1 variable:
 $ sp: int 11 10 10 12 25 22 12 23 13 15 ...

xd <- density(as.numeric(sp))
Error in as.double.default(sp) : (list) object cannot be coerced to double

Hence, it does not seem to be a factor. Declaring it as numeric gives another error message, on which I haven't yet found any help in Google/the archive. In this case, you are trying to pass a data frame as an argument to density() rather than a single column vector. The same problem is the reason for the error in xd <- density(as.numeric(sp)). You are trying to coerce a data frame to a double. Example:

# create a data frame called 'sp', that has a column called 'sp'
sp <- data.frame(sp = 1:195)

str(sp)
`data.frame': 195 obs. of 1 variable:
 $ sp: int 1 2 3 4 5 6 7 8 9 10 ...

# Now try to use density()
density(sp)
Error in density(sp) : argument must be numeric

# Now call density() properly with the column 'sp' as an argument
# using the data.frame$column notation:
density(sp$sp)

Call:
        density(x = sp$sp)

Data: sp$sp (195 obs.);  Bandwidth 'bw' = 17.69

       x                 y
 Min.   :-52.08   Min.   :7.688e-06
 1st Qu.: 22.96   1st Qu.:1.009e-03
 Median : 98.00   Median :4.600e-03
 Mean   : 98.00   Mean   :3.328e-03
 3rd Qu.:173.04   3rd Qu.:5.131e-03
 Max.   :248.08   Max.   :5.133e-03

Two other options in this case: 1. Use attach() to place the data frame 'sp' in the current search path. Now you do not need to explicitly use the data.frame$column notation. detach() is then used to clean up.

attach(sp)
density(sp)
detach(sp)

2. 
Use with(), which is the preferred notation when dealing with data frames:

with(sp, density(sp))

To avoid your own confusion in the future, it would be better not to name the data frame with the same name as a vector. It also helps when others may need to review your code. See ?with and ?attach for more information. Reading through "An Introduction to R", which is part of the default documentation set, would be helpful to you in better understanding data types and dealing with data frame structures. I see that Prof. Ripley has also replied regarding the nature of truehist(), so that helps to clear up that mystery :-) HTH, Marc Schwartz __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] counting the occurrences of vectors
31 c(3, 2, 3, 3, 1) c(1, 2, 2, 1, 2) c(1, 3, 2, 2, 2) c(1, 1, 1, 2, 3) 0102 I'd be curious to get any feedback on this and if someone has any thoughts on any gotchas with this approach. Thanks and I hope that this is of some help. Marc Schwartz __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] counting the occurrences of vectors
On Mon, 2004-07-05 at 23:22, Gabor Grothendieck wrote: Marc Schwartz MSchwartz at MedAnalytics.com writes: the likely overhead involved in paste()ing together the rows to create objects I thought I would check this and it seems that in my original f1 function it's not really the paste itself that's the bottleneck but applying the paste. If we use do.call rather than apply, as shown in f1a below, then we see that f1a runs faster than row.match.count (which in turn was faster than f1):

f1a <- function(a, b, sep = ":") {
  f <- function(...) paste(..., sep = sep)
  a2 <- do.call(f, as.data.frame(a))
  b2 <- do.call(f, as.data.frame(b))
  c(table(c(b2, unique(a2)))[a2] - 1)
}

set.seed(1)
# note that we have increased the size of the matrices from last post
# to better show the speed difference
a <- matrix(sample(3, 1, rep = T), nc = 5)
b <- matrix(sample(3, 1000, rep = T), nc = 5)

# row.match.count taken from Marc's post in this thread
# have put a c(...) around row.match.count to make it comparable to f1a
gc(); system.time(ans <- c(row.match.count(b, a)))
          used (Mb) gc trigger (Mb)
Ncells  436079 11.7     741108 19.8
Vcells  130663  1.0     786432  6.0
[1] 0.11 0.00 0.11 NA NA

gc(); system.time(ansf1a <- f1a(b, a))
          used (Mb) gc trigger (Mb)
Ncells  436080 11.7     741108 19.8
Vcells  130669  1.0     786432  6.0
[1] 0.04 0.00 0.04 NA NA

all.equal(ansf1a, ans)
[1] TRUE

Gabor, Well done! I liked your approach in the prior message of getting away from using regex. I had one of those "I could'a had a V-8" moments, when I realized that of course the resultant table names were syntactically correct R statements and therefore one could get away from worrying about the data type issues and use eval(parse(...)). The above approach is better yet, more flexible, of course more elegant and notably faster. Advantage Gabor... ;-) Best regards, Marc __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Improving effeciency - better table()?
On Tue, 2004-07-06 at 07:56, Simon Cullen wrote: Hi, I've been running some simulations for a while and the performance of R has been great. However, I've recently changed the code to perform a sort of chi-square goodness-of-fit test. To get the observed values for each cell I've been using table() - specifically I've been using cut2 from Hmisc to divide up the range into a specified number of cells and then using table to count how many observations appear in each cell.

obs <- table(cut2(z.trun, cuts = breaks))

Having done this I've found that the code takes much longer to run - up to 10x as long. Is there a more efficient way of doing this? Anyone have any thoughts? It would appear that you might be attempting to do a Hosmer-Lemeshow type of GOF test. If indeed that is the case, before making the above more efficient, you should spend some time reviewing the following posts by Frank Harrell on this subject: http://maths.newcastle.edu.au/~rking/R/help/02b/4210.html http://maths.newcastle.edu.au/~rking/R/help/02b/3111.html HTH, Marc Schwartz __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
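On the efficiency question itself, a base-R sketch that avoids Hmisc entirely: quantile() supplies the cutpoints, cut() + table() does the counting, and findInterval() + tabulate() does the same count without building a factor at all. The data here are simulated and the decile breaks are only an example:

```r
## Base-R alternatives to cut2() + table() for binned counts; simulated data.
set.seed(123)
z <- rnorm(1e5)
breaks <- quantile(z, probs = seq(0, 1, by = 0.1))

## cut() + table(); include.lowest keeps the minimum in the first bin
obs <- table(cut(z, breaks = breaks, include.lowest = TRUE))

## findInterval() + tabulate() skips the factor machinery altogether
obs2 <- tabulate(findInterval(z, breaks, rightmost.closed = TRUE), nbins = 10)
```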
Re: [R] Converting S-Plus Libraries to R
On Tue, 2004-07-06 at 08:54, [EMAIL PROTECTED] wrote: Dear all! I'd like to do multiple imputation of missing values with s-plus libraries that are provided by Shafer (http://www.stat.psu.edu/~jls/misoftwa.html). I wonder, whether these libraries are compatible or somehow convertible to R (because I don't have S-plus), so that I can use this functions using the R Program. I would be happy if you could tell me, -if it is possible to use S-plus libraries with R -if yes, how I can use the S-Plus libraries in R Thank you very much, Will I believe that you will find that Prof. Ripley has already done the work for you in the 'mix' package on CRAN: http://cran.us.r-project.org/src/contrib/Descriptions/mix.html HTH, Marc Schwartz __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Creating Binary Outcomes from a continuous variable
On Wed, 2004-07-07 at 07:57, Doran, Harold wrote: Dear List: I have searched the archives and my R books and cannot find a method to transform a continuous variable into a binary variable. For example, I have test score data along a continuous scale. I want to create a new variable in my dataset that is 1=above a cutpoint (or passed the test) and 0=otherwise. My instinct tells me that this will require a combination of the transform command along with a conditional selection. Any help is much appreciated. Example:

a <- rnorm(20)
b <- ifelse(a < 0, 0, 1)

a
 [1] -1.0735800 -0.6788456  1.9979801 -0.4026760  0.1781791 -1.1540434
 [7] -1.0842728  1.6042602 -0.7950492 -0.1194323  0.4450296  1.9269333
[13] -0.4456181 -0.8374677 -1.1898772  1.7353067  1.8619422 -0.1679996
[19] -0.2656138 -1.5529884

b
 [1] 0 0 1 0 1 0 0 1 0 0 1 1 0 0 0 1 1 0 0 0

HTH, Marc Schwartz __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
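Since logical values coerce to 0/1 in R, a bare comparison also does the recode without ifelse(); the scores and cutpoint below are invented for illustration:

```r
## Recode via comparison + as.integer(); toy scores and cutpoint.
score <- c(48, 61, 55, 72, 39)
cutpoint <- 55
passed <- as.integer(score >= cutpoint)
passed  # 0 1 1 1 0
```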
Re: [R] fast NA elimination ?
On Wed, 2004-07-07 at 09:35, ivo welch wrote: dear R wizards: an operation I execute often is the deletion of all observations (in a matrix or data set) that have at least one NA. (I now need this operation for kde2d, because its internal quantile call complains; could this be considered a buglet?) usually, my data sets are small enough for speed not to matter, and there I do not care whether my method is pretty inefficient (ok, I admit it: I use the sum() function and test whether the result is NA)---but now I have some bigger data sets. Is there a recommended method of doing NA elimination most efficiently? sincerely, /iaw --- ivo welch professor of finance and economics brown / nber / yale Take a look at ?complete.cases HTH, Marc Schwartz __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
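A short sketch of complete.cases() in use, on a toy data frame:

```r
## Drop every row containing at least one NA.
dat <- data.frame(x = c(1, NA, 3), y = c(4, 5, NA))
keep <- complete.cases(dat)  # TRUE only where no column is NA
clean <- dat[keep, ]
nrow(clean)  # 1 -- only the first row survives
```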
Re: [R] Importing an Excel file
On Wed, 2004-07-07 at 13:21, Park, Kyong H Mr. RDECOM wrote: Hello, R users, I am a very beginner of R and tried read.csv to import an excel file after saving an excel file as csv. But it added alternating rows of fictitious NA values after row number 16. When I applied read.delim, there were trailing several commas at the end of each row after row number 16 instead of NA values. Appreciate your help. Kyong Yep. This is one of the behaviors that I had seen with Excel when I was running Windows XP. Seemingly empty cells outside the data range would get exported in the CSV file causing a data integrity problem. It is one of the reasons that I installed OpenOffice under Windows and used Calc to open the Excel files and then do the CSV exports before I switched to Linux :-) Depending upon the version of Excel you are using, you might try to highlight and copy only the rectangular range of cells in the sheet that actually have data to a new sheet and then export the new sheet to a CSV file. Do not just click on the upper left hand corner of the sheet to highlight the entire sheet to copy it. Only highlight the range of cells you actually need for copying. Another option is to use the read.xls() function in the 'gregmisc' package on CRAN or install OpenOffice. HTH, Marc Schwartz __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Importing an Excel file
On Wed, 2004-07-07 at 13:44, Marc Schwartz wrote: On Wed, 2004-07-07 at 13:21, Park, Kyong H Mr. RDECOM wrote: Hello, R users, I am a complete beginner with R and tried read.csv to import an Excel file after saving it as CSV. But it added alternating rows of fictitious NA values after row number 16. When I applied read.delim, there were several trailing commas at the end of each row after row number 16 instead of NA values. Appreciate your help. Kyong

One other thing: The default delimiting characters in read.csv() and read.delim() are NOT the same. The former uses a comma and the latter a TAB character. If you did not change the defaults in Excel when you created your CSV file, that would account for the different behaviors upon import. Be sure that the delimiting character in the R function you use properly corresponds to the actual delimiting character in your CSV file. Marc __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
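The two defaults can be seen side by side without touching a file, using textConnection() on invented data:

```r
# read.csv() assumes comma-separated fields; read.delim() assumes tabs.
# Both are thin wrappers around read.table() with different 'sep' defaults.
csv_text <- "a,b\n1,2\n3,4"
tab_text <- "a\tb\n1\t2\n3\t4"

dat_csv <- read.csv(textConnection(csv_text))
dat_tab <- read.delim(textConnection(tab_text))

dat_csv   # both yield the same 2 x 2 data frame
dat_tab
```

Feeding the comma-delimited text to read.delim() (or vice versa) reproduces the single-mangled-column symptom the poster saw.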
Re: [R] text editor for R
On Wed, 2004-07-07 at 17:47, Yi-Xiong Sean Zhou wrote: Hi, What is the best text editor for programming in R? I am using JEdit as the text editor, however, it does not have anything specific for R. It will be nice to have a developing environment where the keywords are highlighted, plus some other debugging functions. Yi-Xiong More information is available at: http://www.sciviews.org/_rgui/ Your e-mail headers suggest that you are using Windows. Thus, perhaps the two best choices (subject to challenge by others) would be: 1. R-WinEdt (Under IDE/Script Editors) 2. ESS for Windows The above two tools provide for a wide variety of functionality beyond syntax highlighting. There is a syntax highlighting file listed at the above site for jEdit. HTH, Marc Schwartz __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Simple 'frequency' function?
On Fri, 2004-07-09 at 10:43, Dan Bolser wrote: On Fri, 9 Jul 2004, Uwe Ligges wrote: Dan Bolser wrote: Hi, I have designed the following function to extract count frequencies from an array of integers. For example...

# Typical array
x <- cbind(1,1,1,1,1,2,2,2,2,3,3,3,3,4,5,6,7,22)

# Define the frequency function
frequency <- function(x){
  max <- max(x)
  j <- c()
  for(i in 1:max){
    j[i] <- length(x[x==i])
  }
  return(j)
}

fre <- frequency(x)
plot(fre)

How can I ... 1) Make this a general function so my array could be of the form

# eats!
x <- cbind("egg","egg","egg","egg","ham","ham","ham","ham","chicken")
fre <- frequency(x)
plot(fre)

2) Make frequency return an object which I can call plot on (allowing the prob=TRUE option).

See ?table:

table(x)
plot(table(x))
plot(table(x) / sum(table(x)))

Sorry, why does plot(table(x), log='y') fail? I am looking at count/frequency distributions which are linear on log/log scales.

Presumably you are getting the following:

x <- cbind("egg","egg","egg","egg","ham","ham","ham","ham","chicken")
plot(table(x), log='y')
Error in plot.window(xlim, ylim, log, asp, ...) :
        Infinite axis extents [GEPretty(0,inf,5)]
In addition: Warning message:
Nonfinite axis limits [GScale(-inf,0.60206,2, .); log=1]

The problem here is that the range for the default y axis is being set to limits that cannot be used on a log scale. If you review the code for plot.table(), which is the method that will be used here, you see the function definition as follows:

graphics:::plot.table
function (x, type = "h", ylim = c(0, max(x)), lwd = 2, xlab = NULL,
    ylab = NULL, frame.plot = is.num, ...)

Note that the default ylim is set to have a min value of 0, which of course you cannot have on a log scale. Thus, instead, use the following:

plot(table(x), log = "y", ylim = range(table(x)))

or otherwise explicitly define the y axis range, such that the min value is greater than 0. Note also that the default plot type here is 'h', which will result in a histogram type of plot using vertical lines. 
If you want a scatterplot type of graphic, use:

plot(table(x), log = "y", ylim = range(table(x)), type = "p")

HTH, Marc Schwartz __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] where does R search when source()?
On Sat, 2004-07-10 at 20:13, Spencer Graves wrote: In case no one who knows has time to reply to this, I will report on my empirical investigation of this question using R 1.9.1 under Windows 2000. First, I saved a simple script file tst-source.R in the working directory, e.g., d:/sg/proj1. When I said, source('tst-source.R'), it sourced the file appropriately. Then I moved this file to the immediate parent, e.g., d:/sg and tried the same source command. It replied, Error ... unable to open connection ... . Then I got a command prompt, said, path, and moved the file into one of the directories in the search path. When I repeated the source command, it was still unable to open connection ... . Conclusion: From this and other experiences, I have found three ways to specify file names: (1) If the complete path and file name are supplied for an existing file, 'source' will find it. (2) If a file is in the working directory, specifying that name will get it. (3) If a file is in a subdirectory of the working directory, e.g., d:/sg/proj1/sub1/tst-source.R, then specifying source('sub1/tst-source.R') will get it. hope this helps. spencer graves

Shin, Daehyok wrote: Exactly where does R search for foo.R if I type source("foo.R")? Only the current working directory (same as getwd()), or all directories specified by e.g. $PATH? Thanks. Daehyok Shin

The relevant code snippet from source() is:

Ne <- length(exprs <- parse(n = -1, file = file))

Note that the argument 'file' from the initial call to source() is used 'as is' in the 'file = file' argument to parse(). There is no searching of the $PATH. Thus, the file will be used based upon either the filename itself or a proper absolute or relative path as Spencer notes above. If only the filename is used, it needs to be in the current working directory or you get the error that Spencer experienced. 
HTH, Marc Schwartz __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] help with paste
On Mon, 2004-07-12 at 01:16, Andrew Criswell wrote: Hello All: Suppose the following little data frame:

x <- data.frame(dog = c(3,4,6,2,8), cat = c(8,2,3,6,1))
x$cat
[1] 8 2 3 6 1

How can I get the paste() function to do the same thing? The command below is obviously wrong: paste(x, cat, sep = "$")

You need to quote the x and the cat as explicit names, otherwise the objects 'x' and 'cat' are passed as arguments, 'x' in this case being your data frame and 'cat' being the function cat(). Try this:

eval(parse(text = paste("x", "cat", sep = "$")))
[1] 8 2 3 6 1

HTH, Marc Schwartz __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
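For what it is worth, the same extraction can be done without parsing text as code at all; a small sketch on the same toy data frame:

```r
x <- data.frame(dog = c(3,4,6,2,8), cat = c(8,2,3,6,1))

# eval(parse()) on a constructed string works ...
eval(parse(text = paste("x", "cat", sep = "$")))

# ... but extraction by name avoids building code as a string,
# which is generally safer and easier to debug
x[["cat"]]
x[, "cat"]
```

Both extraction forms return the same numeric vector as x$cat.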
Re: [R] proportions confidence intervals
FWIW, if the exact intervals are what is desired here, as another poster has already suggested, binom.test() will get you there:

binom.test(1, 10)$conf.int
[1] 0.002528579 0.445016117
attr(,"conf.level")
[1] 0.95

binom.test(10, 100)$conf.int
[1] 0.04900469 0.17622260
attr(,"conf.level")
[1] 0.95

HTH, Marc Schwartz

On Mon, 2004-07-12 at 13:19, Chuck Cleland wrote: Darren also might consider binconf() in library(Hmisc).

library(Hmisc)
binconf(1, 10, method="all")
           PointEst        Lower     Upper
Exact           0.1  0.002528579 0.4450161
Wilson          0.1  0.005129329 0.4041500
Asymptotic      0.1 -0.085938510 0.2859385

binconf(10, 100, method="all")
           PointEst      Lower     Upper
Exact           0.1 0.04900469 0.1762226
Wilson          0.1 0.05522914 0.1743657
Asymptotic      0.1 0.04120108 0.1587989

Spencer Graves wrote: Please see: Brown, Cai and DasGupta (2001) Statistical Science, 16: 101-133 and (2002) Annals of Statistics, 30: 160-201. They show that the actual coverage probability of the standard approximate confidence intervals for a binomial proportion is quite poor, while the standard asymptotic theory applied to logits produces rather better answers. I would expect confint.glm in library(MASS) to give decent results, possibly the best available without a very careful study of this particular question. Consider the following:

library(MASS)   # needed for confint.glm
library(boot)   # needed for inv.logit
DF10 <- data.frame(y=.1, size=10)
DF100 <- data.frame(y=.1, size=100)
fit10 <- glm(y~1, family=binomial, data=DF10, weights=size)
fit100 <- glm(y~1, family=binomial, data=DF100, weights=size)
inv.logit(coef(fit10))
(CI10 <- confint(fit10))
(CI100 <- confint(fit100))
inv.logit(CI10)
inv.logit(CI100)

In R 1.9.1, Windows 2000, I got the following:

inv.logit(coef(fit10))
(Intercept)
        0.1
(CI10 <- confint(fit10))
Waiting for profiling to be done...
     2.5 %     97.5 %
-5.1122123 -0.5258854
(CI100 <- confint(fit100))
Waiting for profiling to be done... 
    2.5 %    97.5 %
-2.915193 -1.594401
inv.logit(CI10)
      2.5 %      97.5 %
0.005986688 0.371477058
inv.logit(CI100)
    2.5 %    97.5 %
0.0514076 0.1687655
(naiveCI10 <- .1 + c(-2, 2)*sqrt(.1*.9/10))
[1] -0.08973666  0.28973666
(naiveCI100 <- .1 + c(-2, 2)*sqrt(.1*.9/100))
[1] 0.04 0.16

__ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
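As a further aside to the interval comparison above: base R's prop.test() also computes a Wilson-type (score) interval, which can be set against the 'Wilson' rows from binconf(); setting correct = FALSE turns off the continuity correction:

```r
# Wilson (score) intervals from base R for the same 1/10 and 10/100 cases
prop.test(1, 10, correct = FALSE)$conf.int
prop.test(10, 100, correct = FALSE)$conf.int
```

With correct = FALSE these should closely match the binconf(..., method = "wilson") limits shown above.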
Re: [R] paired t-test with bootstrap
On Tue, 2004-07-13 at 07:28, Petr Pikal wrote: Hi On 13 Jul 2004 at 12:28, luciana wrote: Dear Sirs, I am a R beginning user: by mean of R I would like to apply the bootstrap to my data in order to test cost differences between independent or paired samples of people affected by a certain disease. My problem is that even if I am reading the book by Efron (introduction to the bootstrap), looking at the examples in internet and available in R, learning a lot of theoretical things on bootstrap, I can't apply bootstrap with R to my data because of many doubts and difficulties. This is the reason why I have decided to ask the expert for help. I have a sample of diabetic people, matched (by age and sex) with a control sample. The variable I would like to compare is their drug and hospital monthly cost. The variable cost has a very far from gaussian distribution, but I need any way to compare the mean between the two group. So, in the specific case of a paired sample t-test, I aim at testing if the difference of cost is close to 0. What is the better way to follow for that? Another question is that sometimes I have missing data in my dataset (for example I have the cost for a patients but not for a control). If I introduce NA or a dot, R doesn't estimate the statistic I need (for instance the mean). To overcome this problem I have replaced the missing data with the mean computed with the remaining part of data. Anyway, I think R can actually compute the mean even with the presence of missing data. Is it right? What can I do? your.statistic(your.data, na.rm=T) e.g. mean(your.data, na.rm=T) or look at ?na.action e.g mean(na.omit(your.data)) Cheers Petr Pikal A couple of other thoughts here with respect to the use of a paired t-test for the comparison. As Luciana notes above, cost data is typically highly skewed, raising doubt as to the use of a simple parametric test to compare the two groups. 
One of the many reasons such data is skewed is that there are notable differences in the populations that are not accounted for when using simple characteristics for matching as is done here. What makes a patient an outlier with respect to cost and how does the distribution of these patients differ between the two groups and the individual pairs? For example, are all the patients in both groups insulin dependent or are some controlled with oral agents or diet alone? If all are using insulin, are some using self-administered injections while others are using implanted infusion pumps? What is the interval from disease onset? Have any had Pancreas/Islet Cell transplants? Do the matched patients have similar diabetic related sequelae such as diabetic retinopathy, neuropathy, vasculopathy, renal dysfunction and others? If not, the costs to treat these other issues, such as dialysis and wound care alone, can dramatically alter the cost profile for patients even when matched by age and gender. If you are not considering these issues (ie. such as inclusion/exclusion criteria), you risk significant challenges in your conclusions with respect to the comparison of costs for these two groups. I would raise similar concerns when using a sample mean as the imputed value for missing data. If you have not done so already, a Medline search of the literature would be in order to better understand what others have done in this area for diabetic treatment costs and the pros and cons of their respective approaches. I suspect that others here will have additional recommendations. HTH, Marc Schwartz __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
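On the bootstrap mechanics themselves (as distinct from the design concerns above), a minimal sketch with the 'boot' package; the cost values here are simulated log-normal draws standing in for real patient/control pairs, so the numbers are purely illustrative:

```r
library(boot)

set.seed(1)
cost_patient <- rlnorm(50, meanlog = 7.0)   # hypothetical skewed costs
cost_control <- rlnorm(50, meanlog = 6.8)
d <- cost_patient - cost_control            # paired differences

# statistic: mean of the resampled paired differences
mean_diff <- function(x, i) mean(x[i])
b <- boot(d, mean_diff, R = 2000)

# Percentile and BCa confidence intervals for the mean difference;
# if 0 lies outside the interval, a zero mean difference is implausible
boot.ci(b, type = c("perc", "bca"))
```

Resampling the differences (rather than the two samples separately) preserves the pairing, which is the bootstrap analogue of the paired t-test.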
Re: [R] Permutations
On Tue, 2004-07-13 at 14:07, Jordi Altirriba Gutirrez wrote: Dear R users, I'm a beginner user of R and I've a problem with permutations that I don't know how to solve. I've 12 elements in blocks of 3 elements and I want only to make permutations inter-blocks (no intra-blocks) (sorry if the terminology is not accurate), something similar to:

1 2 3 | 4 5 6 | 7 8 9 | 10 11 12  --1st permutation
1 3 2 | 4 5 6 | 7 8 9 | 10 11 12  NO
3 2 1 | 4 5 6 | 7 8 9 | 10 11 12  NO
1 2 4 | 3 5 6 | 7 8 9 | 10 11 12  YES-2nd permutation
4 5 6 | 1 2 3 | 7 8 9 | 10 11 12  YES-3rd permutation
4 5 6 | 2 1 3 | 7 8 9 | 10 11 12  NO

You can use the permutations() function in the 'gregmisc' package on CRAN:

# Assuming you installed 'gregmisc' and used library(gregmisc)
# First create 'groups' consisting of the four blocks
groups <- c("1 2 3", "4 5 6", "7 8 9", "10 11 12")

# Now create a 4 column matrix containing the permutations
# The call to permutations() here indicates the number of blocks in
# groups (4), the required length of the output (4) and the vector of
# elements to permute
perms <- matrix(permutations(4, 4, groups), ncol = 4)
perms
      [,1]       [,2]       [,3]       [,4]
 [1,] "1 2 3"    "10 11 12" "4 5 6"    "7 8 9"
 [2,] "1 2 3"    "10 11 12" "7 8 9"    "4 5 6"
 [3,] "1 2 3"    "4 5 6"    "10 11 12" "7 8 9"
 [4,] "1 2 3"    "4 5 6"    "7 8 9"    "10 11 12"
 [5,] "1 2 3"    "7 8 9"    "10 11 12" "4 5 6"
 [6,] "1 2 3"    "7 8 9"    "4 5 6"    "10 11 12"
 [7,] "10 11 12" "1 2 3"    "4 5 6"    "7 8 9"
 [8,] "10 11 12" "1 2 3"    "7 8 9"    "4 5 6"
 [9,] "10 11 12" "4 5 6"    "1 2 3"    "7 8 9"
[10,] "10 11 12" "4 5 6"    "7 8 9"    "1 2 3"
[11,] "10 11 12" "7 8 9"    "1 2 3"    "4 5 6"
[12,] "10 11 12" "7 8 9"    "4 5 6"    "1 2 3"
[13,] "4 5 6"    "1 2 3"    "10 11 12" "7 8 9"
[14,] "4 5 6"    "1 2 3"    "7 8 9"    "10 11 12"
[15,] "4 5 6"    "10 11 12" "1 2 3"    "7 8 9"
[16,] "4 5 6"    "10 11 12" "7 8 9"    "1 2 3"
[17,] "4 5 6"    "7 8 9"    "1 2 3"    "10 11 12"
[18,] "4 5 6"    "7 8 9"    "10 11 12" "1 2 3"
[19,] "7 8 9"    "1 2 3"    "10 11 12" "4 5 6"
[20,] "7 8 9"    "1 2 3"    "4 5 6"    "10 11 12"
[21,] "7 8 9"    "10 11 12" "1 2 3"    "4 5 6"
[22,] "7 8 9"    "10 11 12" "4 5 6"    "1 2 3"
[23,] "7 8 9"    "4 5 6"    "1 2 3"    "10 11 12"
[24,] "7 8 9"    "4 5 6"    "10 11 12" "1 2 3"

HTH, Marc Schwartz __ [EMAIL PROTECTED] mailing list 
https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Permutations
On Tue, 2004-07-13 at 14:29, Marc Schwartz wrote: On Tue, 2004-07-13 at 14:07, Jordi Altirriba Gutirrez wrote: Dear R users, I'm a beginner user of R and I've a problem with permutations that I don't know how to solve. I've 12 elements in blocks of 3 elements and I want only to make permutations inter-blocks (no intra-blocks) (sorry if the terminology is not accurate), something similar to:

1 2 3 | 4 5 6 | 7 8 9 | 10 11 12  --1st permutation
1 3 2 | 4 5 6 | 7 8 9 | 10 11 12  NO
3 2 1 | 4 5 6 | 7 8 9 | 10 11 12  NO
1 2 4 | 3 5 6 | 7 8 9 | 10 11 12  YES-2nd permutation
4 5 6 | 1 2 3 | 7 8 9 | 10 11 12  YES-3rd permutation
4 5 6 | 2 1 3 | 7 8 9 | 10 11 12  NO

You can use the permutations() function in the 'gregmisc' package on CRAN:

# Assuming you installed 'gregmisc' and used library(gregmisc)
# First create 'groups' consisting of the four blocks
groups <- c("1 2 3", "4 5 6", "7 8 9", "10 11 12")

# Now create a 4 column matrix containing the permutations
# The call to permutations() here indicates the number of blocks in
# groups (4), the required length of the output (4) and the vector of
# elements to permute
perms <- matrix(permutations(4, 4, groups), ncol = 4)

Ack...one correction. The use of matrix() here was actually redundant. 
You can use:

permutations(4, 4, groups)
      [,1]       [,2]       [,3]       [,4]
 [1,] "1 2 3"    "10 11 12" "4 5 6"    "7 8 9"
 [2,] "1 2 3"    "10 11 12" "7 8 9"    "4 5 6"
 [3,] "1 2 3"    "4 5 6"    "10 11 12" "7 8 9"
 [4,] "1 2 3"    "4 5 6"    "7 8 9"    "10 11 12"
 [5,] "1 2 3"    "7 8 9"    "10 11 12" "4 5 6"
 [6,] "1 2 3"    "7 8 9"    "4 5 6"    "10 11 12"
 [7,] "10 11 12" "1 2 3"    "4 5 6"    "7 8 9"
 [8,] "10 11 12" "1 2 3"    "7 8 9"    "4 5 6"
 [9,] "10 11 12" "4 5 6"    "1 2 3"    "7 8 9"
[10,] "10 11 12" "4 5 6"    "7 8 9"    "1 2 3"
[11,] "10 11 12" "7 8 9"    "1 2 3"    "4 5 6"
[12,] "10 11 12" "7 8 9"    "4 5 6"    "1 2 3"
[13,] "4 5 6"    "1 2 3"    "10 11 12" "7 8 9"
[14,] "4 5 6"    "1 2 3"    "7 8 9"    "10 11 12"
[15,] "4 5 6"    "10 11 12" "1 2 3"    "7 8 9"
[16,] "4 5 6"    "10 11 12" "7 8 9"    "1 2 3"
[17,] "4 5 6"    "7 8 9"    "1 2 3"    "10 11 12"
[18,] "4 5 6"    "7 8 9"    "10 11 12" "1 2 3"
[19,] "7 8 9"    "1 2 3"    "10 11 12" "4 5 6"
[20,] "7 8 9"    "1 2 3"    "4 5 6"    "10 11 12"
[21,] "7 8 9"    "10 11 12" "1 2 3"    "4 5 6"
[22,] "7 8 9"    "10 11 12" "4 5 6"    "1 2 3"
[23,] "7 8 9"    "4 5 6"    "1 2 3"    "10 11 12"
[24,] "7 8 9"    "4 5 6"    "10 11 12" "1 2 3"

Sorry about that. Marc __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Permutations
On Tue, 2004-07-13 at 15:02, Rolf Turner wrote: Marc Schwartz wrote (in response to a question from Jordi Altirriba): snip This does not solve the problem that was posed. It only permutes the blocks, and does not allow for swapping between blocks. For instance it does not produce the ``acceptable'' permutation

1 2 4 | 3 5 6 | 7 8 9 | 10 11 12  YES-2nd permutation

I would guess that a correct solution is likely to be pretty difficult. I mean, one ***could*** just generate all 12! permutations of 1 to 12 and filter out the unacceptable ones. But this is getting unwieldy (12! is close to half a billion) and is inelegant. And the method does not ``generalize'' worth a damn.

Rolf, You are correct. I missed that (not so subtle) change in the line above. I mis-read the inter-blocks (no intra-blocks) requirement as simply permuting the blocks, rather than allowing for the swapping of values between blocks. Time for new bi-focals... As Robert has also pointed out in his reply, this gets quite unwieldy. One of the follow up questions might be: is it only allowable that one value at a time can be swapped between blocks, or can multiple values be swapped between blocks simultaneously? I am not sure that it makes a substantive impact on the problem or its solution, however. The question is what is to be done with the resultant set of permutations? FWIW, on a 3.2 GHz P4 with 2 GB of RAM:

system.time(perms <- permutations(12, 12, 1:12))
Error: cannot allocate vector of size 1403325 Kb
Timing stopped at: 2274.27 54.58 2787.76 0 0

Marc __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
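Rolf's generate-and-filter idea is at least tractable on a smaller instance. A sketch with 6 elements in 2 blocks of 3, using one possible formalization of the "no intra-blocks" rule (reject any permutation in which a block of positions holds exactly its original members, in any internal order); the block sizes and the rule's exact form are assumptions for illustration:

```r
library(gtools)  # permutations(); this function also shipped in 'gregmisc'

perms <- permutations(6, 6, 1:6)     # all 720 orderings of 1:6
blocks <- list(1:3, 4:6)

# keep a permutation only if no block of positions contains exactly
# its original set of members (regardless of internal order)
ok <- apply(perms, 1, function(p)
  !any(sapply(blocks, function(b) setequal(p[b], b))))

restricted <- perms[ok, ]
nrow(restricted)   # acceptable permutations out of 720
```

The same filter applied to all 12! permutations is what becomes unwieldy, as the thread notes; a smarter enumeration would need to generate only the acceptable cases directly.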
Re: [R] Permutations
] 941.52 sd(unlist(lapply(r, nrow))) [1] 6.494079 There are likely to be some efficiencies in the function that can be brought to bear, but it is a start. In either case, the restricted permutations appear to be around 94%, if all of the assumptions are correct. HTH, Marc Schwartz __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] MASS package?
On Wed, 2004-07-14 at 17:05, Johanna Hardin wrote: Did the MASS package disappear? Specifically, I'm looking for a function to find the MCD (robust measure of shape and location) for a multi-dimensional data matrix. Anyone know anything about this? Try: library(MASS) ?cov.rob It's there, unless you have a corrupted/incomplete installation. HTH, Marc Schwartz __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Evaluating the Yield of Medical Tests
On Mon, 2004-07-19 at 14:37, Lisa Wang wrote: Hello, I'm a biostatistician in Toronto. I would like to know if there is anything in survival analysis developed in R for the method Evaluating the Yield of Medical Test (JAMA. May 14,1982--Vol 247, No.18 Frank E. Harrell, Jr,PhD; Robert M. Califf, MD; David B. Pryor, MD;Kerry L.Lee, PhD; Robert A. Rosait,MD.) Hope to hear from you and thanks I do not have access to the full text of Frank's article, however I read the brief abstract on Medline and cross-referenced the citation of the article with content in Frank's book (Regression Modeling Strategies - http://biostat.mc.vanderbilt.edu/twiki/bin/view/Main/RmS). Thus, I am going to take a (hopefully well considered) guess that what you are looking for will be in the combination of the Hmisc and Design packages, which Frank has kindly made available for R. These are available for installation from CRAN. More information on Hmisc and Design is available at: http://biostat.mc.vanderbilt.edu/twiki/bin/view/Main/RS which is the link to Frank's site at Vanderbilt. Looking at the authors' names, Frank was at Duke when the cited article was written. I suspect that Frank will reply (RSN) with the acceptance or rejection of my guess, however... ;-) HTH, Marc Schwartz __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] --max-vsize and --max-nsize linux?
On Tue, 2004-07-20 at 07:55, Christian Schulz wrote: Hi, sometimes i have trivial recodings like this:

dim(tt)
[1] 252382     98
system.time(for(i in 2:length(tt)){
  tt[,i][is.na(tt[,i])] <- 0
})

...and a win2000 (XP2000+, 1GB) machine makes it in several minutes, but my linux notebook (XP2.6GHZ, 512MB) doesn't succeed after some hours. I notice that the cpu load is most of the time relatively small, but the harddisk has a lot of work. Is this a problem of --max-vsize and --max-nsize and should i play with them, because i can't believe that the difference in RAM is the reason? Does anybody have experience with what an optimal setting is with e.g. 512 MB RAM in Linux? Many thanks for help and comments regards, christian

Christian, I am unclear as to the nature of your loop above. Note that:

length(tt)
[1] 24733436

which is 252382 * 98. Your looping approach is both inefficient and incorrect. Note that when trying to run your loop 'as is', I get:

system.time(for(i in 2:length(tt)){
+   tt[,i][is.na(tt[,i])] <- 0
+ })
Error: subscript out of bounds
Timing stopped at: 3.54 1.81 5.5 0 0

This is because 'i' eventually exceeds the number of columns (98) in 'tt', since you have 'i' going from 2 to 24733436. I am presuming that you simply want to set any 'NA' values in 'tt' to 0? Take note of using a vectorized approach:

tt <- matrix(sample(c(1:10, NA), 252382 * 98, replace = TRUE), ncol = 98)
dim(tt)
[1] 252382     98
table(is.na(tt))
   FALSE     TRUE
22484834  2248602

Now use:

system.time(tt[is.na(tt)] <- 0)
[1] 1.56 0.73 2.42 0.00 0.00
table(is.na(tt))
   FALSE
24733436

This is on a 3.2 GHz system with 2 GB of RAM. However, this is not a memory issue; it is an inefficient use of loops. HTH, Marc Schwartz __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Precision in R
On Tue, 2004-07-20 at 12:13, Duncan Murdoch wrote: On Tue, 20 Jul 2004 11:55:32 -0400, [EMAIL PROTECTED] wrote : Does anyone know where I can find specifications for R's type double? As far as I know, all platforms use the IEEE-754 standard double precision numbers. Google will give you a description; here's one: http://research.microsoft.com/~hollasch/cgindex/coding/ieeefloat.html This isn't relevant to your question, but I found the history of the development of the standard interesting: http://http.cs.berkeley.edu/~wkahan/ieee754status/754story.html Duncan Murdoch Duncan, The standard is there, but not all applications stick to it faithfully. A good example being how certain cough spreadsheets \cough deal with numbers close to zero. For example, Excel will round numbers close to zero to zero. You may recall this thread from last year covered this topic http://maths.newcastle.edu.au/~rking/R/help/03a/6597.html More information on Excel's varied compliance with the IEEE 754 standard is available here http://support.microsoft.com/default.aspx?scid=kb;en-us;78113 The official IEEE 754 page is at http://grouper.ieee.org/groups/754/ and there are some good reading materials and FAQ's there. This above is beyond the scope of SAS in particular, but I suspect that the difference that Aaron is experiencing, as Andy has noted, is methodologic and not precision related. Aaron, one other source for information on the precision of R on your particular machine is the use of .Machine, which will provide you with a list of specifications. See ?.Machine for additional information here. HTH, Marc Schwartz __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
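Following up the .Machine pointer, a few of the fields one typically inspects; the values noted in the comments are the usual IEEE-754 double-precision figures, which could differ on unusual platforms:

```r
# Floating point characteristics of the local machine
.Machine$double.eps      # machine epsilon; 2^-52 under IEEE-754
.Machine$double.digits   # mantissa bits; 53 under IEEE-754
.Machine$double.xmax     # largest finite double; roughly 1.8e308
```

Comparing .Machine output between the two systems being benchmarked is a quick way to rule out (or confirm) a genuine precision difference before looking for methodological ones.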
Re: [R] Error: subscript out of bounds
On Tue, 2004-07-20 at 13:12, Marie-Pierre Sylvestre wrote: Hi I am running a simulation that involves a loop calling three 2 functions that I have written. Everything works fine when the inside of the loop is performed up to 1000 times (for (i in 1:750)). However, I sometimes get : ''Error: subscript out of bounds'' if I try to increase the loop 'size' to 1000. I am thinking it has to to with memory but I am not sure. I have increased my memory size to 512M but it does not solve my problem. It would take to much place to copy and paste my code here. It would be helpful if you could tell me whether my problem may or may not be related to memory size. Beside, what's the difference between Error: subscript out of bounds Error: subscript out of range ? Regards M-P Sylvestre If this was a memory error, you would probably get a cannot allocate ... type of error message. More than likely, the object upon which you are using the loop has dimensions which are smaller than the value(s) that your loops are using for indexing into the object. The use of either dim(object) or str(object) will give you more information here. When you increase the loop size, presumably, you have not increased the size of your underlying object in kind. For example, if your object (say a matrix) has dimensions of 500 rows and 10 columns, your loop is trying to index object[510, 12], which is 'out of bounds' for your object. A search of the R source code using grep suggests that the 'out of bounds' message is generally used when trying to index (subset) an object with a value or values that are not correct as I have above. This could also be a single dimension vector, BTW. For example, trying to index object[100] when your vector is only 50 elements in size. In the case of the 'out of range' message, that appears to be typically used when an argument to a function or other constrained parameter is above or below the valid range that the argument or parameter may have. 
A scan of where and how the messages are used indicates some variability, probably as a result of the multiple authors involved. HTH, Marc Schwartz __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
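The indexing distinction above can be reproduced in a couple of lines on toy objects:

```r
m <- matrix(1:50, nrow = 5)      # 5 rows, 10 columns

m[5, 10]                         # within bounds: fine
res <- try(m[6, 11], silent = TRUE)    # indexing past the dimensions errors

v <- 1:50
v[100]                           # single-bracket indexing past the end: NA
res2 <- try(v[[100]], silent = TRUE)   # double-bracket indexing errors
```

Note the asymmetry for vectors: `[` silently returns NA past the end, while `[[` raises the out-of-bounds error, so a loop bug can surface either as an error or as silent NAs depending on the indexing form used.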
Re: [R] dumpClass, hasSlot in R?
On Wed, 2004-07-21 at 15:53, hadley wickham wrote: There are a few notes about difference between the R implementation and the book at http://developer.r-project.org/methodsPackage.html I found the hardest thing to get to grips in R was method calling - using multiple dispatch (totally different to what I'm used to from Java, Python etc.). I found this tutorial (http://www.gwydiondylan.org/gdref/tutorial.html, the sections on generic functions and multiple-dispatch) very useful. However, it is for another programming language, and although the method and class creation process feels very similar to R, the syntax is quite different. There is definitely scope for a similarly structured introduction to S4 classes in R. Hadley I have not done any S4 coding yet, but two references that may be of interest are: Converting Packages to S4 by Doug Bates R News Vol 3, No. 1, June 2003 http://cran.r-project.org/doc/Rnews/Rnews_2003-1.pdf and S4 Classes and Methods by Fritz Leisch useR! 2004 Keynote Lecture Slides available at: http://www.ci.tuwien.ac.at/Conferences/useR-2004/Keynotes/Leisch.pdf HTH, Marc Schwartz __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] viewing Postscript file
On Thu, 2004-07-22 at 16:50, Bickel, David wrote: Is there any R function that can display a Postscript file that is already in the working directory? For example, if 'graph.ps' is such a file, I'd like to type something like this: plot.postscript.file(file = 'graph.ps') If no such function exists, I'd be interested in a way to use existing R functions to do this under UNIX or Windows, preferably without a system call to GhostView (gv). Thanks, David

I am not entirely sure what your expectations are here. As you probably know, Postscript files (like PDF files) are text files that describe how to draw an image. It requires a Postscript interpreter (typically Ghostscript) to read the contents of the PS file and then something like GSview (or gv or ggv or ...) as a front end to render the image. It is illusory in the sense that R does none of the rendering itself, but you could create an R wrapper function and call it plot.postscript.file():

plot.postscript.file <- function(file = "Rplots.ps")
{
  # define viewer for UNIX/Linux or Windows
  viewer <- ifelse(.Platform$OS.type == "unix", "gv", "GSview")
  system(paste(viewer, file, sep = " "))
}

So:

postscript("graph.ps")
barplot(1:5)
dev.off()
plot.postscript.file("graph.ps")

HTH, Marc Schwartz __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] retrieve rows from frame assuming criterion
On Fri, 2004-07-23 at 08:36, Luis Rideau Cruz wrote: Hi all, I have a data frame in which one column (PUNTAR) is of character type. What I want is to retrieve the frame, but only with those rows matching elements of PUNTAR against a list of characters (e.g. c("IX49","IX48")):

Year  TUR  STODNR   PUNTAR
1994 9412 94020061  IX49
1994 9412 94020062  IX48
1994 9412 94020063  X32
1994 9412 94020065  X23
1994 9412 94020066  X27
1994 9412 94020067  XI19
1994 9412 94020068  XI16
1994 9412 94020069  XI14
1994 9412 94020070  XI8
1994 9412 94020071  X25
1994 9412 94020072  X18
1994 9412 94020073  II23
1994 9412 94020074  XII33
1994 9412 94020075  XII31

my.function(frame) should then be equal to

Year TURNR  STODNR   M_PUNTAR
1994  9412 94020061  IX49
1994  9412 94020062  IX48

Thank you in advance

For a simple subset like this, something like the following, presuming that your data frame is called MyData:

MyData[MyData$PUNTAR %in% c("IX49", "IX48"), ]
  Year  TUR   STODNR PUNTAR
1 1994 9412 94020061   IX49
2 1994 9412 94020062   IX48

This basically says to select only those rows where the value of MyData$PUNTAR is in c("IX49", "IX48"). If you need to engage in more complex boolean comparisons for subsetting, especially on multiple columns, then the function subset() would be better suited. HTH, Marc Schwartz __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
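To illustrate the subset() suggestion on a hand-reconstructed fragment of the posted data:

```r
MyData <- data.frame(Year = 1994, TUR = 9412,
                     STODNR = c(94020061, 94020062, 94020063, 94020065),
                     PUNTAR = c("IX49", "IX48", "X32", "X23"))

# the same selection as the %in% indexing shown above
subset(MyData, PUNTAR %in% c("IX49", "IX48"))

# compound conditions and column selection read naturally with subset()
subset(MyData, PUNTAR %in% c("IX49", "IX48") & STODNR > 94020061,
       select = c(STODNR, PUNTAR))
```

subset() evaluates its condition inside the data frame, so the columns can be named without the MyData$ prefix.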
Re: [R] merge, cbind, or....?
On Fri, 2004-07-23 at 10:07, Bruno Cutayar wrote:

Hi, i have two data.frame x and y like:

x <- data.frame(num = c(1:10), value = runif(10))
y <- data.frame(num = c(6:10), value = runif(5))

and i want to obtain something like:

num.x    value.x num.y   value.y
    1 0.38423828    NA 0.2911089
    2 0.17402507    NA 0.8455208
    3 0.54443465    NA 0.8782199
    4 0.04540406    NA 0.3202252
    5 0.46052426    NA 0.7560559
    6 0.61385464     6 0.2911089
    7 0.48274968     7 0.8455208
    8 0.11961778     8 0.8782199
    9 0.64531394     9 0.3202252
   10 0.92052805    10 0.7560559

with NA in case of missing value for y to x. { for this example, i write simply:

data.frame(num.x = c(1:10), value.x = x[[2]],
           num.y = c(rep(NA, 5), 6:10), value.y = y[[2]])

} I didn't find a solution in merge(x, y, by = "num"): missing rows are not kept. Can you help me? thanks, Bruno

The use of merge(), with the argument 'all' set to TRUE, will get you the following (note my values are different due to not using the same 'seed' value for runif()):

merge(x, y, by = "num", all = TRUE)

   num    value.x   value.y
1    1 0.14057955        NA
2    2 0.60850644        NA
3    3 0.63410731        NA
4    4 0.07196253        NA
5    5 0.51869503        NA
6    6 0.57042428 0.3340535
7    7 0.85874426 0.9340489
8    8 0.03608417 0.5417780
9    9 0.24422205 0.2214993
10  10 0.03383263 0.4947865

The use of 'all = TRUE' will fill in non-matching rows. The default is FALSE. Note here however, that the value.y column is not replicated for the first five rows, as you have above. If that is what you want, you could do something like the following:

cbind(x, y$value)

   num      value   y$value
1    1 0.14057955 0.3340535
2    2 0.60850644 0.9340489
3    3 0.63410731 0.5417780
4    4 0.07196253 0.2214993
5    5 0.51869503 0.4947865
6    6 0.57042428 0.3340535
7    7 0.85874426 0.9340489
8    8 0.03608417 0.5417780
9    9 0.24422205 0.2214993
10  10 0.03383263 0.4947865

which takes advantage of the recycling of y$value, since it is shorter than the number of rows in 'x'. In this case, y$value is repeated twice.
HTH, Marc Schwartz
Re: [R] installing problems repeated.tgz linux
On Mon, 2004-07-26 at 09:34, Christian Schulz wrote:

Hi, i tried several possibilities and looked in the archive, but didn't have success installing J. Lindsey's useful library 'repeated' on my linux (SuSE 9.0 with kernel 2.6.7, R 1.9.1). P.S. Windows works fine. Many thanks for help, Christian

[EMAIL PROTECTED]:/space/downs> R CMD INSTALL - l /usr/lib/R/library repeated
WARNING: invalid package '-'
WARNING: invalid package 'l'
WARNING: invalid package '/usr/lib/R/library'
* Installing *source* package 'repeated' ...
** libs
/usr/lib/R/share/make/shlib.mk:5: *** Target-Muster enthält kein %. Schluss.
ERROR: compilation failed for package 'repeated'
** Removing '/usr/lib/R/library/repeated'

Christian,

There is a space (' ') between the '-' and the 'l', which will be parsed as two separate arguments. Hence the initial WARNING messages. You need to use:

R CMD INSTALL -l /usr/lib/R/library repeated

Also note that you need to have 'root' privileges in order to install the packages into the /usr/lib/R tree. Thus, you should 'su' to root before running the command. You should also verify that your R tree is in /usr/lib, as the default is /usr/local/lib, for which you would not require the '-l /usr/lib/R/library' argument.

Presumably Windows worked fine because you typically do not require administrator privileges to install the package locally on Windows, or your account has administrative privileges, which is typical (and bad) on Windows NT/XP.

HTH,

Marc Schwartz
RE: [R] installing problems repeated.tgz linux
I echo Andy's experience on FC2. I was able to install the package here and got the same warning messages. Despite trying to use some web sites to translate the German text, I am unsure of the 'true' meaning. I think it is something pertaining to target patterns not being found, which leads me to think that this might be a locale/character encoding issue in the package. Anyone?

Marc

On Mon, 2004-07-26 at 14:16, Liaw, Andy wrote:

I downloaded repeated.tgz and tried it myself on one of our AMD Opterons running SLES8, and it worked (R-1.9.1 compiled as 64-bit). Notice that I do get a couple of warnings from gcc about labels, and from g77 about the use of the `sum' function. Andy

SNIPPED

From: Liaw, Andy
Sorry, Christian. I have no idea what those error messages in German say. Andy

From: [EMAIL PROTECTED]
Hello, thanks for your and Marc's hint, but it seems that is not the problem!? Is there any problem with my make? many thanks and regards, christian

[EMAIL PROTECTED]:/usr/lib/R> R CMD INSTALL -l /usr/lib/R/library /space/downs/repeated.tgz
* Installing *source* package 'repeated' ...
** libs
/usr/lib/R/share/make/shlib.mk:5: *** Target-Muster enthält kein %. Schluss.
ERROR: compilation failed for package 'repeated'
** Removing '/usr/lib/R/library/repeated'

[EMAIL PROTECTED]:/usr/lib/R> R CMD INSTALL -l /usr/lib/R/library /space/downs/repeated
* Installing *source* package 'repeated' ...
** libs
/usr/lib/R/share/make/shlib.mk:5: *** Target-Muster enthält kein %. Schluss.
ERROR: compilation failed for package 'repeated'
** Removing '/usr/lib/R/library/repeated'

SNIPPED
Re: [R] ghyper package
On Tue, 2004-07-27 at 13:54, Román Padilla Lizbeth wrote:

Hello, I am searching for the ghyper package (generalized hypergeometric distributions). Can anyone send it to me? Regards from Mexico, Lizbeth Román

You will find that _function_ in Bob Wheeler's SuppDists package on CRAN:

http://cran.us.r-project.org/src/contrib/Descriptions/SuppDists.html

So use:

install.packages("SuppDists")
library(SuppDists)
?ghyper

HTH,

Marc Schwartz
Re: [R] RE: [S] tree function in R language
Shouldn't the URL be (for R 1.8.1 on Windows):

http://cran.r-project.org/bin/windows/contrib/1.8/PACKAGES

There is no URL as listed below, which is presumably why the error message. Was options()$CRAN changed improperly, or is there some other Windows specific issue that is escaping me at the moment?

BTW, you should upgrade to R 1.9.1, as you are two versions behind at this point.

HTH, Marc Schwartz

On Wed, 2004-07-28 at 23:08, Liaw, Andy wrote:

1. Could it be that your computer is behind a firewall? If so, try reading the R for Windows FAQ.
2. Please ask R-related questions on R-help instead of S-news.

Andy

From: cheng wu
Hi, Andy. Thank you for your answer. Why can't I load CRAN packages? The error message is:

> {a <- CRAN.packages()
+ install.packages(select.list(a[, 1], , TRUE), .libPaths()[1], available = a)}
trying URL `http://cran.r-project.org/bin/windows/contrib/PACKAGES'
unable to connect to 'cran.r-project.org'.
Error in download.file(url = paste(contriburl, "PACKAGES", sep = "/"), :
  cannot open URL `http://cran.r-project.org/bin/windows/contrib/PACKAGES'

From: Chushu Gu [EMAIL PROTECTED] To: cheng wu [EMAIL PROTECTED] Subject: Fw: [S] tree function in R language Date: Wed, 28 Jul 2004 09:14:48 -0400

----- Original Message ----- From: Liaw, Andy [EMAIL PROTECTED] To: 'chushu Gu' [EMAIL PROTECTED]; [EMAIL PROTECTED] Sent: Tuesday, July 27, 2004 11:22 PM Subject: RE: [S] tree function in R language

Have you read the (latest edition of the) book which the package you are using supports? There are differences in S-PLUS and R (and the 4th edition of MASS supports both, and thus ought to tell you this particular difference between the two). tree() in S-PLUS was written originally by Clark and Pregibon. If you want that functionality in R, you need to load the `tree' package (available on CRAN), which is an independent implementation by one of the co-authors of MASS. Another hint: Look in the `scripts' subdirectory of where the `MASS' package is installed.
Andy

From: chushu Gu
Hi all, I am using R 1.8.1 and I have the following code:

library(MASS)
data(iris)
ir.tr <- tree(Species ~ ., iris)
ir.tr
summary(ir.tr)

I got the following message:

Error: couldn't find function "tree"

I don't know the reason, as I have already loaded the library MASS. Could anyone tell me the possible reasons? Thanks, Chushu Gu
Re: [R] Editing Strings in R
On Thu, 2004-07-29 at 15:56, Bulutoglu Dursun A Civ AFIT/ENC wrote:

I was wondering if there is a way of editing strings in R. I have a set of strings and each set is a row of numbers and parentheses. For example, the first row is:

(0 2)(3 4)(7 9)(5 9)(1 5)

and I have a thousand or so such rows. I was wondering how I could get the corresponding string obtained by adding 1 to all the numbers in the string above. Dursun

I don't know if this is the most efficient approach, but working on a few hours of sleep, here goes:

NewRow <- function(x)
{
  TempRow <- as.numeric(unlist(strsplit(x, "[\\(\\) ]"))) + 1
  TempMat <- matrix(TempRow[!is.na(TempRow)], ncol = 2, byrow = TRUE)
  paste("(", TempMat[, 1], " ", TempMat[, 2], ")", sep = "", collapse = "")
}

Basically, the first line splits the character vector into its components using "(", ")" and " " as regex based delimiters. It coerces the result to a numeric vector and adds 1. The second line takes the adjusted non-NA values and converts them into a two column matrix, to make it easier to do the paste in line 3. Line 3 returns the adjusted character vector reconstructed.

So:

MyRow <- "(0 2)(3 4)(7 9)(5 9)(1 5)"
NewRow(MyRow)
[1] "(1 3)(4 5)(8 10)(6 10)(2 6)"

So, if you have a bunch of these rows, you could use this function with apply:

MyData <- matrix(c("(0 2)(3 4)(7 9)(5 9)(1 5)",
                   "(1 6)(4 5)(3 7)(4 8)(9 0)",
                   "(3 5)(8 1)(4 7)(2 7)(6 1)"))

MyData
     [,1]
[1,] "(0 2)(3 4)(7 9)(5 9)(1 5)"
[2,] "(1 6)(4 5)(3 7)(4 8)(9 0)"
[3,] "(3 5)(8 1)(4 7)(2 7)(6 1)"

matrix(apply(MyData, 1, NewRow))
     [,1]
[1,] "(1 3)(4 5)(8 10)(6 10)(2 6)"
[2,] "(2 7)(5 6)(4 8)(5 9)(10 1)"
[3,] "(4 6)(9 2)(5 8)(3 8)(7 2)"

Somebody may come up with an approach that is more efficient, I suspect. For 1,200 rows:

system.time(apply((matrix(rep(MyData, 400))), 1, NewRow))
[1] 0.29 0.00 0.33 0.00 0.00

(Gabor? ;-)

HTH,

Marc Schwartz
Re: [R] Editing Strings in R
On Thu, 2004-07-29 at 21:08, Gabor Grothendieck wrote:

Bulutoglu Dursun A Civ AFIT/ENC Dursun.Bulutoglu at afit.edu writes: I was wondering if there is a way of editing strings in R. I have a set of strings and each set is a row of numbers and parentheses. For example, the first row is: (0 2)(3 4)(7 9)(5 9)(1 5) and I have a thousand or so such rows. I was wondering how I could get the corresponding string obtained by adding 1 to all the numbers in the string above.

First do the 1 character translations simultaneously using chartr and then use gsub for the remaining one to two character translation:

gsub("0", "10", chartr("0123456789", "1234567890",
     "(0 2)(3 4)(7 9)(5 9)(1 5)"))

Gabor,

One problem: multi-digit numbers in the source string:

gsub("0", "10", chartr("0123456789", "1234567890",
     "(10 99)(3 4)(7 9)(5 9)(1 5)"))
[1] "(21 1010)(4 5)(8 10)(6 10)(2 6)"

Note that the first number 10 gets transformed to 21 and the 99 goes to 1010.

I made a quick update to NewRow, which is not faster, but gets it to two lines instead of three, and is a bit cleaner:

NewRow <- function(x)
{
  TempMat <- matrix(as.numeric(unlist(strsplit(x, "[\\(\\) ]"))),
                    ncol = 3, byrow = TRUE) + 1
  paste("(", TempMat[, 2], " ", TempMat[, 3], ")", sep = "", collapse = "")
}

Note that with multi digit numbers, it gives a correct result:

NewRow("(10 99)(101 4)(7 9)(5 9)(1 5)")
[1] "(11 100)(102 5)(8 10)(6 10)(2 6)"

HTH,

Marc Schwartz
Re: [R] Transparent backgrounds in png files
On Thu, 2004-07-29 at 19:24, Patrick Connolly wrote:

On Thu, 29-Jul-2004 at 08:38AM +0100, Prof Brian Ripley wrote:

| The bitmap() device does not support transparency. The png() device does.

Unfortunately, though png() does a fine job at a transparent background, it's rather lumpy even on a screen.

| On Thu, 29 Jul 2004, Patrick Connolly wrote:
| [...]
| Mine is the reverse (and I'm using standard graphics, not Lattice).
| I'm trying to get a transparent background but it always comes out
| white. Setting bg = "transparent", I've tried using a bitmap device
| to create a png file. I've also tried creating a postscript file and
| converting it to a PNG file using the Gimp. I've always used a
| resolution of 300 dpi in bitmaps since the default is far too low.
|
| Really? You want PNG files of 2000+ pixels in each dimension?

Well, 300 dpi is somewhat excessive for onscreen, but not for printing (more below). For a screen at 1600 by 1200 resolution, a bitmap of over 1000 pixels in either direction is not excessive. Using a screen rated at .25mm dot pitch, 75 dpi is rather a lot less than sufficient. According to my calculations, .25mm dot pitch corresponds to over 100 dpi, and a .27mm screen is over 90 dpi, so I don't get this 72 business. Perhaps there's something I need to know.

Evidently, there's something others know that I don't, since png() generated files always turn out lumpy for me. It's worse than the unsatisfactory result of using PowerPoint's "turning colours to transparent" method I mentioned. People who are used to looking at TV screens might not think it's low resolution, so perhaps I'm too fussy. Maybe I should be more fussy about getting an exact ratio between the number of pixels in the plotting device and the size of the image in PowerPoint. I'm somewhat confused by the fact that PP scales to fit to the slide PNG files that I produce using the Gimp, but not ones made using the png() method directly. What is the essential difference?
| -- and you should not really be using bitmapped files for other
| uses.)

Unfortunate as it may be, many people wish to put graphics in Word files and don't like being unable to see their graphics on their screen, even if they have a postscript printer that could print them perfectly. That's where I use 300 dpi PNGs, which print at least as well as WMFs I've seen. There was a recent discussion on this list about graphics using OSX which covers most of the same thinking. Nothing in that discussion indicated to me a better way to get graphic files from Linux to Word. If there are any, I'd like to know about them.

Patrick,

Are the Windows recipients of the R graphics involved in creating/editing the resultant documents, or do they simply require read-only access to a final document?

If the latter, then let me suggest that you generate EPS based graphics in R (for which you can specify height and width arguments in inches as required). Import those EPS graphics into OO.org's Impress (PP-alike) or Writer (Word-alike). Then print the file to a PS file and then use ps2pdf to create a PDF version of the document that the recipients can view in Acrobat Reader.

If the former, as I believe Frank Harrell noted here some time back, the recent versions of Word and Powerpoint will create bitmapped previews of the EPS files upon import. While they are not a high quality image (and do add to filesize notably), they at least enable the users of the documents to preview the image for placement/sequencing. They can then print them to a PS file or, if they have the purchased Adobe add-ins, could print them to a PDF file on their own for viewing in Acrobat.

The major problem with bitmapped images (as has been mentioned here ad nauseam) is that they do not resize well and what you see on the screen does not always translate into a quality image when enlarged or sent to a printer. This is why vector based graphics (such as WMF/EMF, EPS, PDF and SVG) are preferred.
Bitmapped image files also end up being quite large, whereas EPS files (since they are text files) are relatively small.

It is not a solution today, but as SVG based graphics become more available on multiple platforms, that format will probably emerge as the preferred means of sharing such files. WMF/EMF are limited to Windows as a realistic option. There is the libEMF library available under Linux, but from personal experience, it is not a viable option.

HTH,

Marc Schwartz
Re: [R] How to put multiple plots in the same window? (not par(mfrow=))
On Fri, 2004-07-30 at 10:41, F Duan wrote:

Dear All, I am sorry if this question has been asked before. Below is my question: I want to put several plots in the same window, but I don't want the blank space between plots (like par(mfrow=)) --- that makes the plots too small. Could anyone tell me how to do it? Thanks a lot. Frank

It is not clear if you want a matrix of plots or if you want plots that actually overlap (i.e. inset plots). For example, for a matrix using par(mfrow), the actual figure regions for each plot fill up the full plotting device:

par(mfrow = c(2, 2))
plot(1:5)
box(which = "figure")
plot(1:5)
box(which = "figure")
plot(1:5)
box(which = "figure")
plot(1:5)
box(which = "figure")

Each of the four plots takes up one quarter of the overall device. The outer four boxes represent the figure region for each of the four plots. Within each figure region is the plot region and the axes, labels, etc. for each individual plot.

You can use par("mar") to reduce the amount of space between the plot region and the figure region. As an extreme example:

par(mfrow = c(2, 2))
par(mar = c(0, 0, 0, 0))
plot(1:5)
box(which = "figure")
plot(1:5)
box(which = "figure")
plot(1:5)
box(which = "figure")
plot(1:5)
box(which = "figure")

In this case, you now would need to play around with the axis tick marks, labels, etc.

Can you clarify which space you are referring to?

Marc Schwartz
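A sketch of that tick-mark cleanup — shrinking the margins with par("mar") and then drawing compact axes by hand. The specific margin sizes and the cex.axis value are only illustrative choices, not a recommendation from the thread:

```r
par(mfrow = c(2, 2), mar = c(2, 2, 1, 1))
for (i in 1:4) {
  plot(1:5, axes = FALSE, xlab = "", ylab = "")  # suppress default axes
  axis(1, at = 1:5, cex.axis = 0.8)              # compact x axis labels
  axis(2, cex.axis = 0.8)                        # compact y axis labels
  box()                                          # frame each plot region
}
```

This keeps the four plots nearly touching while still leaving room for readable tick labels inside the reduced margins.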
Re: [R] Transparent backgrounds in png files
Patrick,

Here is one additional option for you. I happened to be doing some searching on the OO.org site today for some printing related issues in their Bugzilla equivalent. There was a reference to an MS Office PDF import filter available from ScanSoft that would enable you to create PDF vector based plot files in R (using pdf()) and then import them into MS Office. There is an RFE in the OO.org issues list for this feature, which won't appear before OO.org V2.0. If and when this becomes available, it would streamline some of the Linux-to-Windows issues that have been discussed in this thread.

More information on PDFConverter is available from the ScanSoft site at:

http://www.scansoft.com/pdfconverter/standard/

There is a standard version available for $49 (U.S.) and a professional version available for $99 (U.S.). Some example PDF-to-Word documents are available at http://www.scansoft.com/pdfconverter/demo/.

HTH,

Marc Schwartz
Re: [R] pairwise difference operator
On Fri, 2004-07-30 at 18:30, Adaikalavan Ramasamy wrote:

There was a BioConductor thread today where the poster wanted to find the pairwise difference between columns of a matrix. I suggested the slow solution below, hoping that someone might suggest a faster and/or more elegant solution, but no other response. I tried unsuccessfully with the apply() family. Searching the mailing list was not very fruitful either. The closest I got was a cryptic chunk of code in pairwise.table(). Since I do use something similar myself occasionally, I am hoping someone from the R-help list can suggest alternatives or past threads. Thank you.

### Code ###
pairwise.difference <- function(m){
  npairs  <- choose(ncol(m), 2)
  results <- matrix(NA, nc = npairs, nr = nrow(m))
  cnames  <- rep(NA, npairs)
  if(is.null(colnames(m))) colnames(m) <- paste("col", 1:ncol(m), sep = "")
  k <- 1
  for(i in 1:ncol(m)){
    for(j in 1:ncol(m)){
      if(j <= i) next;
      results[, k] <- m[, i] - m[, j]
      cnames[k] <- paste(colnames(m)[c(i, j)], collapse = ".vs.")
      k <- k + 1
    }
  }
  colnames(results) <- cnames
  rownames(results) <- rownames(m)
  return(results)
}

### Example using a matrix with 5 genes/rows and 4 columns ###
mat <- matrix(sample(1:20), nc = 4)
colnames(mat) <- LETTERS[1:4]
rownames(mat) <- paste("g", 1:5, sep = "")

mat
    A  B  C  D
g1 10 16  3 15
g2 18  5 12 19
g3  7  4  8 13
g4 14  2  6 11
g5 17  1 20  9

pairwise.difference(mat)
   A.vs.B A.vs.C A.vs.D B.vs.C B.vs.D C.vs.D
g1     -6      7     -5     13      1    -12
g2     13      6     -1     -7    -14     -7
g3      3     -1     -6     -4     -9     -5
g4     12      8      3     -4     -9     -5
g5     16     -3      8    -19     -8     11

How about this: I am taking advantage of the combinations() function in the 'gregmisc' package to define the pairwise column combinations based upon the input matrix colnames. Given that, perhaps Greg might want to add this function to the package if it holds up to scrutiny. Additional error checking would be required, as I note below.
pairwise.diffs <- function(x)
{
  if(is.null(colnames(x))) colnames(x) <- 1:ncol(x)
  col.diffs <- combinations(ncol(x), 2, colnames(x))
  result <- x[, col.diffs[, 1]] - x[, col.diffs[, 2]]
  colnames(result) <- paste(col.diffs[, 1], ".vs.", col.diffs[, 2],
                            sep = "")
  result
}

What I am essentially doing is creating the matrix 'col.diffs' to hold the combinations of the colnames in matrix 'x'. If 'x' does not have colnames, I set them to the column indices. Then in line 2, I do the pairwise subtractions. Line 3 simply sets up the colnames in the result as the combinations.

Note that the subtractions, as you have above, are the first column minus the second column in the pairwise combinations. You would also want to check for an input matrix of < 3 columns, since the 'result' in that case would be a vector, rather than a matrix. In that case, you could add code to coerce 'result' to a matrix, or simply not allow matrices with < 3 columns.

So, using your example matrix above (different seed value):

mat <- matrix(sample(1:20), nc = 4)
colnames(mat) <- LETTERS[1:4]
rownames(mat) <- paste("g", 1:5, sep = "")

mat
    A  B  C  D
g1  1 17 13 10
g2 12  5  7 16
g3  2 19  6 14
g4 20  4 11  8
g5  3 15 18  9

pairwise.diffs(mat)
   A.vs.B A.vs.C A.vs.D B.vs.C B.vs.D C.vs.D
g1    -16    -12     -9      4      7      3
g2      7      5     -4     -2    -11     -9
g3    -17     -4    -12     13      5     -8
g4     16      9     12     -7     -4      3
g5    -12    -15     -6     -3      6      9
Re: [R] pairwise difference operator
On Fri, 2004-07-30 at 20:28, Marc Schwartz wrote: On Fri, 2004-07-30 at 18:30, Adaikalavan Ramasamy wrote: There was a BioConductor thread today where the poster wanted to find the pairwise difference between columns of a matrix. I suggested the slow solution below, hoping that someone might suggest a faster and/or more elegant solution, but no other response. I tried unsuccessfully with the apply() family. Searching the mailing list was not very fruitful either. The closest I got was a cryptic chunk of code in pairwise.table(). Since I do use something similar myself occasionally, I am hoping someone from the R-help list can suggest alternatives or past threads. Thank you.

snip

In follow up to the posts on this last night, I created an updated version of my function (though I will point out that Gabor's is faster, as I will show below).

I realized that using the combinations() function had a potential limitation, which is the limit of R's recursion depth, as Greg mentions in the help for the function. It will require an adjustment when the number of columns is > about 45. Thus, I modified the creation of the column combinations as noted below.

I also added some code to verify the input data type and to ensure that the resultant structures remain matrices in the case of an input matrix with ncol = 2, in which case this function is, of course, overkill.

Thus:

pairwise.diffs <- function(x)
{
  stopifnot(is.matrix(x))

  # create column combination pairs
  prs <- cbind(rep(1:ncol(x), each = ncol(x)), 1:ncol(x))
  col.diffs <- prs[prs[, 1] < prs[, 2], , drop = FALSE]

  # do pairwise differences
  result <- x[, col.diffs[, 1]] - x[, col.diffs[, 2], drop = FALSE]

  # set colnames
  if(is.null(colnames(x))) colnames(x) <- 1:ncol(x)

  colnames(result) <- paste(colnames(x)[col.diffs[, 1]], ".vs.",
                            colnames(x)[col.diffs[, 2]], sep = "")
  result
}

Now to performance.
I created a large 1,000 column matrix:

mat <- matrix(sample(100, 10000, replace = TRUE), ncol = 1000)
colnames(mat) <- 1:1000

str(mat)
 int [1:10, 1:1000] 48 23 26 22 69 64 2 13 13 69 ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:1000] "1" "2" "3" "4" ...

Timing:

gc(); system.time(m <- pairwise.diffs(mat))
          used (Mb) gc trigger  (Mb)
Ncells 1541241 41.2    3094291  82.7
Vcells 7139074 54.5   17257300 131.7
[1] 1.14 0.19 1.39 0.00 0.00

gc(); system.time(g <- do.call(cbind, sapply(2:ncol(mat), f, mat)))
          used (Mb) gc trigger  (Mb)
Ncells 1541241 41.2    3094291  82.7
Vcells 7139074 54.5   17257300 131.7
[1] 0.81 0.02 0.92 0.00 0.00

Comparisons:

str(m)
 int [1:10, 1:499500] -47 -43 -35 -29 15 33 -53 -36 -17 57 ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:499500] "1.vs.2" "1.vs.3" "1.vs.4" "1.vs.5" ...

str(g)
 int [1:10, 1:499500] -47 -43 -35 -29 15 33 -53 -36 -17 57 ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:499500] "1-2" "1-3" "1-4" "1-5" ...

table(m == g)
   TRUE
4995000

HTH,

Marc Schwartz
Re: [R] Is k equivalent to k:k ?
On Mon, 2004-08-02 at 09:46, Georgi Boshnakov wrote:

Hi, I wonder if the following (apparent) inconsistency is a bug or feature. Since scalars are simply vectors of length one, I would think that a and a:a produce the same result. For example,

identical(4.01, 4.01:4.01)
[1] TRUE

However,

identical(4, 4:4)
[1] FALSE

and

identical(4.0, 4.0:4.0)
[1] FALSE

A closer look reveals that the colon operator produces objects of a different class, e.g.

class(4)
[1] "numeric"
class(4.0)
[1] "numeric"

but

class(4:4)
[1] "integer"
class(4.0:4.0)
[1] "integer"

Georgi Boshnakov

The ":" operator is the functional equivalent of seq(from = a, to = b). Note that the help for seq() indicates the following for the return value:

"The result is of mode integer if from is (numerically equal to an) integer and by is not specified."

Thus, when using the ":" operator, you get integers as the returned value(s), which is what is happening in your final pair of examples.

If you look at the final example under ?identical, you will see:

identical(1, as.integer(1)) ## FALSE, stored as different types

This is because the first 1 is a double by default. Thus, in the case of:

identical(4, 4:4)

the first 4 is of type double, while the 4:4 is of type single. Thus the result is FALSE.

Now, on the other hand, try:

typeof(seq(4, 4, by = 1))
[1] "double"

You see that the result of the sequence is of type double. Hence:

identical(4, seq(4, 4, by = 1))
[1] TRUE

So, to the question in your subject, no: k (a double by default) is not the same as k:k (an integer by default).

HTH,

Marc Schwartz
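The type distinction discussed above can be checked directly with typeof(), which reports storage type rather than class (expected results noted in comments):

```r
typeof(4)                  # "double"  - numeric literals default to double
typeof(4:4)                # "integer" - the colon operator returns integer
typeof(seq(4, 4, by = 1))  # "double"  - 'by' was specified, so not integer

identical(4, 4:4)              # FALSE: double vs. integer storage
identical(as.integer(4), 4:4)  # TRUE:  both stored as integer
```

Comparing via as.integer() (rather than by value with ==) makes the storage-type difference, which is what identical() tests, explicit.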
Re: [R] Is k equivalent to k:k ?
On Mon, 2004-08-02 at 10:09, Marc Schwartz wrote:

snip

Thus, in the case of:

identical(4, 4:4)

the first 4 is of type double, while the 4:4 is of type single. Thus the result is FALSE.

snip

Correction. The above sentence should read:

the first 4 is of type double, while the 4:4 is of type **INTEGER**. Thus the result is FALSE.

Sorry about that. Need more coffee...

Marc
Re: [R] R packages install problems linux - X not found (WhiteBox EL 3)
On Sun, 2004-08-08 at 12:32, Douglas Bates wrote:

Dr Mike Waters wrote: I am used to using R under Windows, but have done an install of 1.9.1 under WhiteBox linux 3 (based on RHEL 3). This all went without a hitch, along with most of the additional package installs. However, while trying to install car and rgl I hit a problem regarding the X environment not being found. As I was doing the install from a console *within* the X environment, this is obviously down to a missing environment variable or link. The X11 directories all seem to be in the usual places. I've checked as much as I can through the archives and googled around, but to no avail. Any help appreciated.

Or a missing development package. In many Linux distributions the include files for X11 are in a separate package from the run-time libraries. I have never used WhiteBox Linux but I imagine that will be the case for that distribution too. Check to see if there is a package with a name like xlibs-dev or x-dev.

Just to amplify on Doug's comments, the RPM in question should be something like:

XFree86-devel-...

where the ... is replaced by the version numbering schema. I am presuming that WhiteBox has not yet changed over to the use of X.org in place of XFree86 at this point. If it has, then the RPM would be something like:

xorg-x11-devel-...

An easy way to check for this would be to open a console window and use:

rpm -q XFree86-devel

in the first case, or:

rpm -q xorg-x11-devel

in the second case. If nothing is returned by the command, it would confirm that you are missing the requisite RPM.

In the case of the RGL package, you might want to review this recent thread:

https://www.stat.math.ethz.ch/pipermail/r-help/2004-August/thread.html

which indicates some issues related to the same devel libraries, including the XFree86-Mesa-libGL (or xorg-x11-Mesa-libGL) and XFree86-Mesa-libGLU (or xorg-x11-Mesa-libGLU) RPMs.
HTH, Marc Schwartz
Re: [R] R packages install problems linux - X not found (WhiteBox EL 3)
On Sun, 2004-08-08 at 12:53, Marc Schwartz wrote:

In the case of the RGL package, you might want to review this recent thread: https://www.stat.math.ethz.ch/pipermail/r-help/2004-August/thread.html

Correction on the above URL. I pasted the wrong one here. It should be:

https://www.stat.math.ethz.ch/pipermail/r-help/2004-August/053994.html

Marc
Re: [R] manipulating strings
On Sun, 2004-08-08 at 13:58, Stephen Nyangoma wrote:

Hi, I have a character vector called fil consisting of the following strings:

fil
 [1] " 102.2 639"  " 104.2 224"  " 105.1 1159" " 107.1 1148" " 108.1 1376"
 [6] " 109.2 1092" " 111.2 1238" " 112.2 349"  " 113.1 1204" " 114.1 537"
[11] " 115.0 303"  " 116.1 490"  " 117.2 202"  " 118.1 1864" " 119.0 357"

I want to get a data frame like:

Time  Obs
102.2 639
104.2 224
105.1 1159
107.1 1148
108.1 1376
109.2 1092
111.2 1238
112.2 349
113.1 1204
114.1 537

etc. Can anyone see an efficient way of doing this? Thanks. Stephen

Try this:

# Create strings
MyStrings <- c(" 102.2 639", " 104.2 224", " 105.1 1159", " 107.1 1148",
               " 108.1 1376", " 109.2 1092", " 111.2 1238", " 112.2 349",
               " 113.1 1204", " 114.1 537", " 115.0 303", " 116.1 490",
               " 117.2 202", " 118.1 1864", " 119.0 357")

MyStrings
 [1] " 102.2 639"  " 104.2 224"  " 105.1 1159" " 107.1 1148"
 [5] " 108.1 1376" " 109.2 1092" " 111.2 1238" " 112.2 349"
 [9] " 113.1 1204" " 114.1 537"  " 115.0 303"  " 116.1 490"
[13] " 117.2 202"  " 118.1 1864" " 119.0 357"

# Now convert to a data frame, by first using strsplit() to break up
# each of the vector elements into three components, using " " as a
# split character. This returns a list, which we then convert to a
# vector using unlist(). Then use matrix() to convert the vector into a
# two dimensional object with 3 cols. Use 'byrow = TRUE' so that we fill
# the matrix row by row. Then take only the second and third columns
# from the matrix and convert them into a data frame.

df <- as.data.frame(matrix(unlist(strsplit(MyStrings, split = " ")),
                           ncol = 3, byrow = TRUE)[, 2:3])

# Finally, set the colnames
colnames(df) <- c("Time", "Obs")

df
    Time  Obs
1  102.2  639
2  104.2  224
3  105.1 1159
4  107.1 1148
5  108.1 1376
6  109.2 1092
7  111.2 1238
8  112.2  349
9  113.1 1204
10 114.1  537
11 115.0  303
12 116.1  490
13 117.2  202
14 118.1 1864
15 119.0  357

Note that the above presumes that your strings (character vectors) have a leading " " in them and that the Time and Obs elements are also separated by a " " in each. See ?strsplit for more information.
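As a possible one-step alternative sketch for the same task (assuming the strings are whitespace-separated as above), read.table() on a text connection does the field splitting and the numeric coercion at the same time:

```r
# first three of the example strings; leading/extra whitespace is ignored
MyStrings <- c(" 102.2 639", " 104.2 224", " 105.1 1159")

# read.table() splits on whitespace and coerces each column to numeric
df <- read.table(textConnection(MyStrings),
                 col.names = c("Time", "Obs"))
df
#    Time  Obs
# 1 102.2  639
# 2 104.2  224
# 3 105.1 1159
```

Unlike the strsplit()/matrix() route, the resulting Time and Obs columns are numeric rather than factors/characters, so no further conversion is needed before plotting or computing.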
HTH, Marc Schwartz __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
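One follow-up worth noting (an addition, not part of the original exchange): the columns built from strsplit() start out as text, so if Time and Obs are to be used numerically they still need coercion with as.numeric(). A minimal sketch, using a shortened version of the same strings:

```r
# Shortened stand-in for the MyStrings vector in the post above
MyStrings <- c(" 102.2 639", " 104.2 224", " 105.1 1159")

# Split on " ", reshape, and keep the Time and Obs columns
m <- matrix(unlist(strsplit(MyStrings, split = " ")),
            ncol = 3, byrow = TRUE)[, 2:3]

# Keep the columns as character (not factor), then coerce to numeric
df <- as.data.frame(m, stringsAsFactors = FALSE)
colnames(df) <- c("Time", "Obs")
df$Time <- as.numeric(df$Time)
df$Obs  <- as.numeric(df$Obs)

str(df)
```

Without the as.numeric() step, operations such as sum(df$Obs) would fail or behave unexpectedly.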
RE: [R] R packages install problems linux - X not found (WhiteBoxEL 3)
On Sun, 2004-08-08 at 14:10, Dr Mike Waters wrote: snip Thanks for the responses guys. I used to have RH9 installed on this machine and I found out about the separate developer packages then. I thought that I had got the relevant XFree devel package installed, but although it showed up in the rpm database as being present, the required files were not present. I did a forced rpm upgrade from the WhiteBox updates directory and that problem is now fixed, at least for car. Marc, thanks for the pointer on the rgl problem. However, I have a slightly different problem with the install of this package. It gets through to the point where it tries to make rgl.so from the various .o files and fails then, as follows:

g++ -I/usr/lib/R/include -I/usr/X11R6/include -DHAVE_PNG_H -I/usr/include -I/usr/local/include -Wall -pedantic -fno-exceptions -fno-rtti -fPIC -O2 -g -march=i386 -mcpu=i686 -c glgui.cpp -o glgui.o
g++ -L/usr/local/lib -o rgl.so x11lib.o x11gui.o types.o math.o fps.o pixmap.o gui.o api.o device.o devicemanager.o rglview.o scene.o glgui.o -L/usr/X11R6/lib -L/usr/lib -lstdc++ -lX11 -lXext -lGL -lGLU -lpng
/usr/lib/gcc-lib/i386-redhat-linux/3.2.3/../../../crt1.o(.text+0x18): In function `_start':
: undefined reference to `main'
x11lib.o(.text+0x84): In function `set_R_handler':
/tmp/R.INSTALL.13414/rgl/src/x11gui.h:33: undefined reference to `R_InputHandlers'
x11lib.o(.text+0x92):/tmp/R.INSTALL.13414/rgl/src/x11gui.h:33: undefined reference to `addInputHandler'
x11lib.o(.text+0xfb): In function `unset_R_handler':
/tmp/R.INSTALL.13414/rgl/src/x11lib.cpp:52: undefined reference to `R_InputHandlers'
x11lib.o(.text+0x103):/tmp/R.INSTALL.13414/rgl/src/x11lib.cpp:52: undefined reference to `removeInputHandler'
collect2: ld returned 1 exit status
make: *** [rgl.so] Error 1
ERROR: compilation failed for package 'rgl'
** Removing '/usr/lib/R/library/rgl'

No doubt another failed dependency... DOH!
Regards I am concerned by your indications of previously having had RH9 on the same box and that you had to force an update of the XFree Devel RPM. Forcing the installation of an RPM is almost always a bad thing. When you installed WB on the system, did you do a clean installation or some type of upgrade? If the latter, it is reasonable to consider that there may be some level of mixing and matching of RPMS from the two distributions going on. This could result in a level of marginally or wholly incompatible versions of RPMS being installed. Could you clarify that point? Also, be sure that you have the same versions of the XFree series RPMS installed. Use: rpm -qa | grep XFree in a console and be sure that the RPMS return the same version schema. If not, it is possible that one of your problems is the mixing of versions. Take note of the output of the above and be sure that the XFree86-Mesa-libGL and XFree86-Mesa-libGLU RPMS are installed as well. Some of the messages above would also suggest a problem finding R related headers. How did you install R? This may be a red herring of sorts, given the other problems, but may be helpful. Marc __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] R packages install problems linux - X not found (WhiteBoxEL 3)
On Mon, 2004-08-09 at 08:13, Dr Mike Waters wrote: snip Marc, Sorry for the confusion yesterday - in my defence, it was very hot and humid here in Hampshire (31 Celsius at 15:00hrs and still 25 at 20:00hrs). What had happened was that I had done a clean install of WB Linux, including the XFree86 and other developer packages. However, the on-line updating system updated the XFree86 packages to a newer sub version. It seems that it didn't do this correctly for the XFree86 developer package, which was missing vital files. However it showed up in the rpm database as being installed (i.e. rpm -qa | grep XFree showed it thus). I downloaded another rpm for this manually and I only forced the upgrade because it was the same version as already 'installed' (as far as the rpm database was concerned). I assumed that all dependencies were sorted out through the install in the first place. OK, that helps. I still have a lingering concern that, given the facts above, there may be other integrity issues in the RPM database, if not elsewhere. From reading the WB web site FAQ's (http://www.whiteboxlinux.org/faq.html) , it appears that they are using up2date/yum for system updates. Depending upon the version in use, there have been issues especially with up2date (hangs, incomplete updates, etc.) which could result in other problems. I use yum via the console here (under FC2), though I note that a GUI version of yum has been created, including replacing the RHN/up2date system tray alert icon. A thought relative to this specifically: If there is or may be an integrity problem related to the rpm database, you should review the information here: http://www.rpm.org/hintskinks/repairdb/ which provides instructions on repairing the database. Note the important caveats regarding backups, etc. 
The two key steps there are to remove any residual lock files using (as root): rm -f /var/lib/rpm/__* and then rebuilding the rpm database using (also as root): rpm -vv --rebuilddb I think that there needs to be some level of comfort that this basic foundation for the system is intact and correct. I only mentioned RH9 to show that I had some familiarity with the RedHat policy of separating out the 'includes' etc into a separate developer package. Once all this had been sorted out, I was then left with a compilation error which pointed to a missing dependency or similar, which was not due to missing developer packages, but, as you and Prof Ripley correctly point out, from the R installation itself. Having grown fat and lazy on using R under the MS Windows environment, I was struggling to identify the precise nature of this remaining problem. As regards the R installation, I did this from the RH9 binary for version 1.9.1, as I did not think that the Fedora Core 2 binary would be appropriate here. Perhaps I should now compile from the source instead? I would not use the FC2 RPM, since FC2 has many underlying changes not the least of which includes the use of the 2.6 kernel series and the change from XFree86 to x.org. Both changes resulted in significant havoc during the FC2 testing phases and there was at least one issue here with R due to the change in X. According to the WB FAQs: If you cannot find a package built specifically for RHEL3 or WBEL3 you can try a package for RH9 since many of the packages in RHEL3 are the exact same packages as appeared in RH9. Thus, it would seem reasonable to use the RH9 RPM that Martyn has created. An alternative would certainly be to compile R from the source tarball. In either case, I would remove the current installation of R and after achieving a level of comfort that your RPM database is OK, reinstall R using one of the above methods. 
Pay close attention to any output during the installation process, noting any error or warning messages that may occur. If you go the RPM route, be sure that the MD5SUM of the RPM file matches the value that Martyn has listed on CRAN to ensure that the file has been downloaded in an intact fashion. These are my thoughts at this point. You need to get to a point where the underlying system is stable and intact, then get R to the same state before attempting to install new packages. HTH, Marc __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] R packages install problems linux - X not found (WhiteBoxEL 3)
On Tue, 2004-08-10 at 08:15, Dr Mike Waters wrote: snip From unpacking the tarball and running ./configure in the R source directory, I obtain the fact that crti.o is needed by ld.so and was not found. This file is not present on the system. This file, along with crtn.o, is usually installed by the GNU libc packages, I believe. However, I know that not all *nix distributions include these files among their packages. From a web search, I have not been able to ascertain whether this lack of a crti.o is due to there not being one in the distribution, or to another incomplete package install. So, I did a completely fresh installation of WhiteBox, followed by R built from source, checked that it ran and then installed the R packages. Only then did I run up2date. At least crti.o and crtn.o are still there this time, along with the XFree86 includes. A bit of a cautionary tale, all in all. Thanks for all the help and support. Regards M

Mike, From my FC2 system:

$ rpm -qf /usr/lib/crti.o
glibc-devel-2.3.3-27
$ rpm -qf /usr/lib/crtn.o
glibc-devel-2.3.3-27

So, you are correct relative to the source of these two files. A follow-up question might be: did you include the devel packages during your initial install? If not, that would explain the lack of these files. If you did, then it would add another data point to support the notion that your system was, to some level, compromised and a clean install was probably needed, rather than just trying to re-create the RPM database. Glad that you are up and running at this point. Given Martyn's follow-up messages, it looks like there may be an issue with the RH9 RPM, so for the time being using the source tarball would be appropriate. Best regards, Marc __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] barplot and names.arg
On Fri, 2004-08-13 at 09:22, Luis Rideau Cruz wrote: R-help, Is there any option to get the names.arg labels closer to the x-axis in barplot? Thank you

Using mtext() you can do something like the following:

data(VADeaths)

# Now place labels closer to the x axis. Set 'axisnames' to FALSE so
# the default labels are not drawn. Also note that barplot() returns
# the bar midpoints, so set 'mp' to the return values
mp <- barplot(VADeaths, axisnames = FALSE)

# Now use mtext() for the axis labels
mtext(text = colnames(VADeaths), side = 1, at = mp, line = 0)

# clean up
rm(VADeaths)

You can adjust the 'line = 0' argument to move the labels closer to or farther from the axis. HTH, Marc Schwartz __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Question from Newbie on PostScript and Multiple Plots
On Fri, 2004-08-13 at 11:28, Johnson, Heather wrote: Hi, As I'm pretty new to R I hope this question isn't too basic. I am currently looping through my dataset and for each iteration am producing three separate plots. When I output these plots to the screen they are nicely grouped as three plots per page; however, when I try to send them to a PostScript file I get one page for each plot. I have adjusted my postscript options so that my plots are the size that I want and the paper is set to portrait, I just can't figure out how to get all three plots on one page in the PostScript file. I've been through the archives on the list (albeit not exhaustively) and the manuals available on the R site and cannot figure out how to solve my problem. Thanks, -Heather

Either one of the following works for me:

# Do 3 plots in a 2 x 2 matrix
postscript(file = "ThreePlots.ps", horizontal = FALSE)
par(mfrow = c(2, 2))
plot(1:5)
barplot(1:5)
boxplot(rnorm(10))
dev.off()

# Do 3 x 1
postscript(file = "ThreePlots.ps", horizontal = FALSE)
par(mfrow = c(3, 1))
plot(1:5)
barplot(1:5)
boxplot(rnorm(10))
dev.off()

Can you provide an example of the code that you are using? Marc Schwartz __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] numerical accuracy, dumb question
Part of that decision may depend upon how big the dataset is and what is intended to be done with the ID's:

object.size(1011001001001)
[1] 36
object.size("1011001001001")
[1] 52
object.size(factor("1011001001001"))
[1] 244

They will by default, as Andy indicates, be read and stored as doubles. They are too large for integers, at least on my system:

.Machine$integer.max
[1] 2147483647

Converting to character might make sense, with only a minimal memory penalty. However, using a factor results in a notable memory penalty, if the attributes of a factor are not needed. If any mathematical operations are to be performed with the ID's, then leaving them as doubles makes the most sense. Dan, more information on the numerical characteristics of your system can be found by using .Machine. See ?.Machine and ?object.size for more information. HTH, Marc Schwartz

On Fri, 2004-08-13 at 21:02, Liaw, Andy wrote: If I'm not mistaken, numerics are read in as doubles, so that shouldn't be a problem. However, I'd try using factor or character. Andy

From: Dan Bolser I store an id as a big number, could this be a problem? Should I convert to a string when I use read.table(... Example id's:

1001001001001
1001001001002
...
1002001002005

Biggest is probably 1011001001001. Ta, Dan. __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
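As a quick check of the points above (a sketch added here, not part of the original thread): IDs of this magnitude overflow R's 32-bit integer type, but are held exactly as doubles since they are well below 2^53, and convert cleanly to character when no arithmetic is needed:

```r
# Dan's style of IDs
ids <- c(1001001001001, 1001001001002, 1011001001001)

# Read in as doubles by default
typeof(ids)                                # "double"

# Too large for a 32-bit integer: coercion yields NA (with a warning)
suppressWarnings(as.integer(1011001001001))

# But exactly representable as doubles, being well below 2^53
all(ids == round(ids))
1011001001001 < 2^53

# Character storage is the alternative when no arithmetic is needed
as.character(ids)
```

So the choice is really between doubles (exact here, and arithmetic-capable) and character/factor storage, as discussed above.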
RE: [R] numerical accuracy, dumb question
On Sat, 2004-08-14 at 08:42, Tony Plate wrote: At Friday 08:41 PM 8/13/2004, Marc Schwartz wrote: Part of that decision may depend upon how big the dataset is and what is intended to be done with the ID's:

object.size(1011001001001)
[1] 36
object.size("1011001001001")
[1] 52
object.size(factor("1011001001001"))
[1] 244

They will by default, as Andy indicates, be read and stored as doubles. They are too large for integers, at least on my system:

.Machine$integer.max
[1] 2147483647

Converting to character might make sense, with only a minimal memory penalty. However, using a factor results in a notable memory penalty, if the attributes of a factor are not needed.

That depends on how long the vectors are. The memory overhead for factors is per vector, with only 4 bytes used for each additional element (if the level already appears). The memory overhead for character data is per element -- there is no amortization for repeated values.

object.size(factor("1011001001001"))
[1] 244
object.size(factor(rep(c("1011001001001", "111001001001", "001001001001", "011001001001"), 1)))
[1] 308

# bytes per element in factor, for length 4:
object.size(factor(rep(c("1011001001001", "111001001001", "001001001001", "011001001001"), 1))) / 4
[1] 77

# bytes per element in factor, for length 1000:
object.size(factor(rep(c("1011001001001", "111001001001", "001001001001", "011001001001"), 250))) / 1000
[1] 4.292

# bytes per element in character data, for length 1000:
object.size(as.character(factor(rep(c("1011001001001", "111001001001", "001001001001", "011001001001"), 250)))) / 1000
[1] 20.028

So, for long vectors with relatively few different values, storage as factors is far more memory efficient (this is because the character data is stored only once per level, and each element is stored as a 4-byte integer). (The above was done on Windows 2000.) -- Tony Plate

Good point, Tony. I was making the perhaps incorrect assumption that the ID's were unique, or relatively so.
However, as it turns out, even that assumption is relevant only to a certain extent with respect to how much memory is required. What is interesting (and presumably I need to do some more reading on how R stores objects internally) is that the incremental amount of memory is not consistent on a per element basis for a given object, though there is a pattern. It is also dependent upon the size of the new elements to be added, as I note at the bottom. This all of course presumes that object.size() is giving a reasonable approximation of the amount of memory actually allocated to an object, for which the notes in ?object.size raise at least some doubt. This is a critical assumption for the data below, which is on FC2 on a P4. For example:

object.size("a")
[1] 44
object.size(letters)
[1] 340

In the second case, as Tony has noted, the size of letters (a character vector) is not 26 * 44. Now note:

object.size(c("a", "b"))
[1] 52
object.size(c("a", "b", "c"))
[1] 68
object.size(c("a", "b", "c", "d"))
[1] 76
object.size(c("a", "b", "c", "d", "e"))
[1] 92

The incremental sizes are a sequence of 8 and 16. Now for a factor:

object.size(factor("a"))
[1] 236
object.size(factor(c("a", "b")))
[1] 244
object.size(factor(c("a", "b", "c")))
[1] 268
object.size(factor(c("a", "b", "c", "d")))
[1] 276
object.size(factor(c("a", "b", "c", "d", "e")))
[1] 300

The incremental sizes are a sequence of 8 and 24. Using elements along the lines of Dan's:

object.size("1001001001001")
[1] 52
object.size(c("1001001001001", "1001001001002"))
[1] 68
object.size(c("1001001001001", "1001001001002", "1001001001003"))
[1] 92
object.size(c("1001001001001", "1001001001002", "1001001001003", "1001001001004"))
[1] 108
object.size(c("1001001001001", "1001001001002", "1001001001003", "1001001001004", "1001001001005"))
[1] 132

The sequence is 16 and 24. For factors:

object.size(factor("1001001001001"))
[1] 244
object.size(factor(c("1001001001001", "1001001001002")))
[1] 260
object.size(factor(c("1001001001001", "1001001001002", "1001001001003")))
[1] 292
object.size(factor(c("1001001001001", "1001001001002", "1001001001003", "1001001001004")))
[1] 308
object.size(factor(c("1001001001001", "1001001001002", "1001001001003", "1001001001004", "1001001001005")))
[1] 340

The sequence is 24 and 32. So, the incremental size seems to alternate as elements are added.
The behavior above would perhaps suggest that memory is allocated to objects to enable pairs of elements to be added. When the second element of the pair is added, only a minimal incremental amount of additional memory (and presumably time) is required. However, when I add a third element, there is additional memory required to store that new element because the object needs to be adjusted in a more fundamental way to handle this new element. There also appears to be some memory allocation adjustment at play here. Note:

object.size(factor("1001001001001"))
[1] 244
object.size(factor("1001001001001", "a"))
[1] 236
RE: [R] numerical accuracy, dumb question
On Sat, 2004-08-14 at 12:01, Marc Schwartz wrote: There also appears to be some memory allocation adjustment at play here. Note:

object.size(factor("1001001001001"))
[1] 244
object.size(factor("1001001001001", "a"))
[1] 236

Arggh. Negate that last comment. I had a typo in the second example. It should be:

object.size(factor(c("1001001001001", "a")))
[1] 252

which of course results in an increase in memory. Geez. Time for lunch. Marc __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] numerical accuracy, dumb question
On Sat, 2004-08-14 at 13:19, Prof Brian Ripley wrote: On Sat, 14 Aug 2004, Marc Schwartz wrote:

object.size("a")
[1] 44
object.size(letters)
[1] 340

In the second case, as Tony has noted, the size of letters (a character vector) is not 26 * 44.

Of course not. Both are character vectors, so have the overhead of any R object plus an allocation for pointers to the elements plus an amount for each element of the vector (see the end). These calculations differ on 32-bit and 64-bit machines. For a 32-bit machine storage is in units of either 28 bytes (Ncells) or 8 bytes (Vcells), so single-letter characters are wasteful, viz

object.size("aaaaaaa")
[1] 44

That is 1 Ncell and 2 Vcells, 1 for the string (7 bytes plus terminator) and 1 for the pointer. Whereas

object.size(letters)
[1] 340

has 1 Ncell and 39 Vcells, 26 for the strings and 13 for the pointers (which fit two to a Vcell). Note that repeated character strings may share storage, so for example

object.size(rep("a", 26))
[1] 340

is wrong (140, I think). And that makes comparisons with factors depend on exactly how they were created; for a character vector there probably is a lot of sharing. I have a feeling that these calculations are off for character vectors, as each element is a CHARSXP and so may have an Ncell not accounted for by object.size. (`May' because of potential sharing.) Would anyone who is sure like to confirm or deny this? It ought to be possible to improve the estimates for character vectors a bit, as we can detect sharing amongst the elements.

Prof. Ripley, Thanks for the clarifications. I'll need to spend some time reading through R-exts.pdf and Rinternals.h. Regards, Marc __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] Stacking Vectors/Dataframes
Archived versions of gregmisc (and other packages) are available from: http://cran.r-project.org/src/contrib/Archive/ Download one of the older versions (i.e. 0.8.5) and install it from a console using R CMD INSTALL. If you are restricted from installing packages to the main R tree (i.e. you do not have the requisite permissions), see R FAQ 5.2 regarding installing packages to alternate locations. HTH, Marc Schwartz

On Mon, 2004-08-16 at 08:33, Laura Quinn wrote: As our IT man is currently on holiday I am not able to upgrade to version 1.9.0 (or 1.9.1) at the moment, and I see that the gregmisc library will not work on earlier versions (I am using 1.8.0). Does anyone have any other suggestions how I might be able to achieve this? Thank you Laura Quinn Institute of Atmospheric Science School of the Environment University of Leeds Leeds LS2 9JT tel: +44 113 343 1596 fax: +44 113 343 6716 mail: [EMAIL PROTECTED]

On Sun, 15 Aug 2004, Liaw, Andy wrote: I believe interleave() in the `gregmisc' package can do what you want. Cheers, Andy

From: Laura Quinn Hello, Is there a simple way of stacking/merging two dataframes in R? I want to stack them piece-wise, not simply add one whole dataframe to the bottom of the other. I want to create as follows:

x.frame:
aX1  bX1  cX1  ... zX1
aX2  bX2  cX2  ... zX2
...  ...  ...  ... ...
aX99 bX99 cX99 ... zX99

y.frame:
aY1  bY1  cY1  ... zY1
aY2  bY2  cY2  ... zY2
...  ...  ...  ... ...
aY99 bY99 cY99 ... zY99

new.frame:
aX1  bX1  cX1  ... zX1
aY1  bY1  cY1  ... zY1
aX2  bX2  cX2  ... zX2
aY2  bY2  cY2  ... zY2
...  ...  ...  ... ...
aX99 bX99 cX99 ... zX99
aY99 bY99 cY99 ... zY99

I have tried to use a for loop (simply assigning and also with rbind) to do this but am having difficulty correctly assigning the destination in the new dataframe. Can anyone offer a quick and easy way of doing this (or even a long winded one if it works!!) Thank you in advance, __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html
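For completeness, the row-interleaving Laura describes can also be done in base R without the gregmisc package (a sketch added here, with small made-up stand-ins for x.frame and y.frame):

```r
# Two small example data frames (hypothetical stand-ins for the
# 99-row x.frame and y.frame in the post above)
x.frame <- data.frame(a = c("aX1", "aX2"), b = c("bX1", "bX2"))
y.frame <- data.frame(a = c("aY1", "aY2"), b = c("bY1", "bY2"))

# Stack the frames, then reorder the rows so that they alternate:
# X row 1, Y row 1, X row 2, Y row 2, ...
n <- nrow(x.frame)
new.frame <- rbind(x.frame, y.frame)[order(rep(seq_len(n), 2)), ]

new.frame
```

The trick is that order(rep(seq_len(n), 2)) yields the index vector 1, n+1, 2, n+2, ..., which interleaves the stacked rows without any explicit loop.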
Re: [R] Bug in colnames of data.frames?
On Tue, 2004-08-17 at 09:01, Arne Henningsen wrote: Hi, I am using R 1.9.1 on an i686 PC with SuSE Linux 9.0. I have a data.frame, e.g.:

myData <- data.frame( var1 = c( 1:4 ), var2 = c( 5:8 ) )

If I add a new column by

myData$var3 <- myData[ , "var1" ] + myData[ , "var2" ]

everything is fine, but if I omit the commas:

myData$var4 <- myData[ "var1" ] + myData[ "var2" ]

the name shown above the 4th column is not var4:

myData
  var1 var2 var3 var1
1    1    5    6    6
2    2    6    8    8
3    3    7   10   10
4    4    8   12   12

but names() and colnames() return the expected name:

names( myData )
[1] "var1" "var2" "var3" "var4"
colnames( myData )
[1] "var1" "var2" "var3" "var4"

And it is even worse: I am not able to change the name shown above the 4th column:

names( myData )[ 4 ] <- "var5"
myData
  var1 var2 var3 var1
1    1    5    6    6
2    2    6    8    8
3    3    7   10   10
4    4    8   12   12

I guess that this is a bug, isn't it? Arne

Here is a hint:

# This returns an integer vector
str(myData[ , "var1" ] + myData[ , "var2" ])
 int [1:4] 6 8 10 12

# This returns a data.frame
str(myData[ "var1" ] + myData[ "var2" ])
`data.frame':   4 obs. of  1 variable:
 $ var1: int  6 8 10 12

str(myData)
`data.frame':   4 obs. of  4 variables:
 $ var1: int  1 2 3 4
 $ var2: int  5 6 7 8
 $ var3: int  6 8 10 12
 $ var4:`data.frame':  4 obs. of  1 variable:
  ..$ var1: int  6 8 10 12

Take a look at the details, value and coercion sections of ?.data.frame HTH, Marc Schwartz __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
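To make the hint concrete (an added sketch, not part of the original reply): single-bracket indexing of a data frame without a comma returns a one-column data frame, so the sum of two such results is itself a data frame, and assigning it with $ embeds a data frame as a column. Double brackets (or the comma form) return plain vectors and avoid the problem:

```r
myData <- data.frame(var1 = 1:4, var2 = 5:8)

# Single brackets without a comma return one-column data frames,
# so their sum is a data frame, which gets embedded as a "column"
myData$var4 <- myData["var1"] + myData["var2"]
embedded <- is.data.frame(myData$var4)   # TRUE: a data frame inside

# Using [[ ]] (or myData[, "var1"]) returns plain vectors instead
myData$var4 <- myData[["var1"]] + myData[["var2"]]
plain <- is.integer(myData$var4)         # TRUE: an ordinary column

myData
```

The embedded data frame carries its own column name (var1), which is what the print method shows above the fourth column in Arne's example.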
Re: [R] Bug in colnames of data.frames?
On Tue, 2004-08-17 at 09:34, Marc Schwartz wrote: Take a look at the details, value and coercion sections of ?.data.frame

This must be my week for typos. That should be:

?[.data.frame (in ESS) or ?"[.data.frame" (otherwise)

Marc __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] levels of factor
On Tue, 2004-08-17 at 09:30, Luis Rideau Cruz wrote: R-help, I have a data frame which I subset like:

a <- subset(df, df$column2 %in% c("factor1", "factor2") & df$column2 == 1)

But when I type levels(a$column2) I still get the same levels as in df (my original data frame). Why is that?

The default for "[.factor" is:

x[i, drop = FALSE]

Hence, unused factor levels are retained.

Is it right?

Yes. If you want to explicitly recode the factor based upon only those levels that are actually in use, you can do something like the following:

a <- factor(a)

However, I am a bit unclear as to the logic of the subset statement that you are using, perhaps b/c I don't know what your data is. You seem to be subsetting 'column2' on both the factor levels and a presumed numeric code. Is that really what you want to do? You might want to review the Warning section in ?factor. BTW, when using subset(), the evaluation takes place within the data frame, so you do not need to use df$column2 in the function call. You can just use column2, for example:

subset(df, column2 %in% c("factor1", "factor2"))

See ?factor and ?"[.factor" for more information. HTH, Marc Schwartz __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
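A small self-contained demonstration of the level-retention behaviour described above (added here; not part of the original exchange):

```r
f <- factor(c("a", "b", "c", "c"))

# Subsetting a factor keeps all of the original levels by default,
# because "[.factor" uses drop = FALSE
f2 <- f[f %in% c("a", "b")]
levels(f2)            # still "a" "b" "c"

# Re-applying factor() recodes using only the levels actually in use
f3 <- factor(f2)
levels(f3)            # "a" "b"

# The drop argument achieves the same thing at subset time
f4 <- f[f %in% c("a", "b"), drop = TRUE]
levels(f4)            # "a" "b"
```

Either form is fine; the factor(f2) idiom is the one suggested in the reply above.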
Re: [R] all.equal and names?
It is in the Description now (at least for 1.9.1 patched): "all.equal(x, y) is a utility to compare R objects x and y testing 'near equality'. If they are different, comparison is still made to some extent, and a report of the differences is returned. Don't use all.equal directly in if expressions; either use identical or combine the two, as shown in the documentation for identical." There is also a reference to attr.all.equal(target, current, ...) on the same help page, which returns the following using the example:

attr.all.equal(1, c(a = 1))
[1] "names for current but not for target"

Not quite the same message as S-PLUS, however. HTH, Marc

On Wed, 2004-08-18 at 11:02, Spencer Graves wrote: Hi, Duncan: Thanks much. I think I remember reading about both all.equal and identical in Venables and Ripley (2002) MASS. Unfortunately, I don't have MASS handy now, and I could not find it otherwise, so I asked. What needs to happen to upgrade the all.equal documentation to add identical to the see also? Best Wishes, Spencer

Duncan Murdoch wrote: On Wed, 18 Aug 2004 10:27:49 -0400, Spencer Graves [EMAIL PROTECTED] wrote: How can I compare two objects for structure, names, values, etc.? With R 1.9.1 under Windows 2000, the obvious choice all.equal ignores names and compares only values:

all.equal(1, c(a = 1))
[1] TRUE

Under S-PLUS 6.2, I get the comparison I expected:

all.equal(1, c(a = 1))
[1] "target, current classes differ: integer : named"
[2] "class of target is \"integer\", class of current is \"named\" (coercing current to class of target)"

If you want the explanation you're out of luck, but identical() does the test:

identical(1, c(a = 1))
[1] FALSE

Duncan Murdoch __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
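One quick way to confirm that only the names differ (an added illustration; note that recent R versions also report the name difference from all.equal() itself):

```r
x <- 1
y <- c(a = 1)

# identical() is strict: the names attribute makes these differ
identical(x, y)           # FALSE

# Strip the names and the values compare as identical
identical(x, unname(y))   # TRUE

# attr.all.equal() isolates the attribute difference
attr.all.equal(x, y)
```

This is the combination the help page suggests: identical() for strict tests in control flow, with all.equal()/attr.all.equal() to describe what differs.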
Re: [R] header line generated write.table
On Wed, 2004-08-18 at 16:42, Y C Tao wrote: I want to write the following data frame into a CSV file:

     Col1 Col2 Col3
Row1    1    1    1
Row2    2    2    2

where Row1, Row2 are the row names and Col1, Col2, Col3 are the column names. The correct CSV file should be:

,Col1,Col2,Col3
Row1,1,1,1
Row2,2,2,2

However, the one generated by R using write.table(x, file = "xyz.csv", sep = ",") has a header line that reads:

Col1,Col2,Col3

without the comma at the very beginning. As a result, if you open the file in Excel, the column names are not correct (shifted to the left by one column). Is there a way to get around this? Thanks!

The solution is on the help page for ?write.table. Details: "Normally there is no column name for a column of row names. If col.names = NA a blank column name is added. This can be used to write CSV files for input to spreadsheets." Also, the first example on that page gives you:

## To write a CSV file for input to Excel one might use
write.table(x, file = "foo.csv", sep = ",", col.names = NA)

Thus:

write.table(x, col.names = NA, sep = ",")
,Col1,Col2,Col3
Row1,1,1,1
Row2,2,2,2

HTH, Marc Schwartz __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
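As a round-trip check of the col.names = NA behaviour (an added sketch using a temporary file; quote = FALSE is added here so the header shows the bare leading comma):

```r
x <- data.frame(Col1 = c(1, 2), Col2 = c(1, 2), Col3 = c(1, 2),
                row.names = c("Row1", "Row2"))

f <- tempfile(fileext = ".csv")
write.table(x, file = f, sep = ",", col.names = NA, quote = FALSE)

# The header line now starts with a comma: an empty cell sits above
# the row-name column, so spreadsheet columns line up correctly
hdr <- readLines(f)[1]

# Reading back with row.names = 1 restores the original frame shape
y <- read.csv(f, row.names = 1)
unlink(f)

hdr
y
```

The same file opens in Excel with Row1/Row2 in the leftmost column and the data under the correct headers.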
Re: [R] Is R good for not-professional-statistician, un-mathematical clinical researchers?
On Thu, 2004-08-19 at 01:45, Jacob Wegelin wrote: Alternate title: How can I persuade my students that R is for them? Alternate title: Can R replace SAS, SPSS or Stata for clinicians? I am teaching introductory statistics to twelve physicians and two veterinarians who have enrolled in a Mentored Clinical Research Training Program. My course is the first in a sequence of three. We (the instructors of this sequence) chose to teach R rather than some other computing environment. My (highly motivated) students have never encountered anything like R. One frankly asked: Do you feel (honestly) that a group of physicians (with two vets) clinicians will be able to effectively use and actually understand R? If so, I will happily call this bookstore and order this book [Venables and Ripley] tomorrow. I am heavily biased toward R/S because I have used it since the first applied statistics course I took. But I would love to give these students some kind of objective information about the usability of R by non-statisticians--not just my own bias. Could anyone suggest any such information? Or does anyone on this list use R who is a clinician and not really mathematically savvy? For instance, someone who doesn't remember any math beyond algebra and doesn't think in terms of P(A|B)? Or have we done a disservice to our students by choosing to make them learn R, rather than making ourselves learn SAS, Stata or SPSS? Thank you for any ideas Jake Wegelin A couple of questions: 1. What is the intended goal of the series of classes? 2. What are the expectations of the clinicians for themselves and what is their likely career path? Possible answers to the questions: 1. Provide the clinicians a reasonable (and perhaps broad) foundation of statistical knowledge. 2. 
To be able to have a reasonable comprehension of statistical concepts and methods so that in the future, as they are busy with patients (animals for the vets) in a clinical practice, they can intelligently interact with formally trained statisticians when engaged in clinical research in a multi-disciplinary team environment. If the above is close to reality, then let me suggest that you consider Peter's book Introductory Statistics with R rather than MASS, at least for the first class in the series. I cannot think of a more gentle, broad and competent way to introduce clinicians to both statistics and R at the same time. If these clinicians are likely to move on to busy clinical practices, in my experience having come out of the clinical environment, they will not have the time to sit at a computer and grind out analyses, much less maintain their proficiency with a programming language (R, Stata or SAS) or the broad range of statistical methodologies that they would likely encounter over their careers. They will, however, need to be able to sit and interact with statisticians, bringing the significant value of their clinical training and knowledge to the process of designing clinical research projects, and effectively comprehend the multitude of issues in that endeavor. They will need to have an understanding of the complex processes by which data are collected, managed, manipulated and analyzed in the course of obtaining the resultant analyses. In other words, it is important that they realize that it is more than just a point and click process where, voilà, you have a logistic regression model. They need to appreciate both the subtleties and complexities of dealing with real world research, incomplete data, etc. Many clinicians do not, and this results in mismatched expectations in the future as they deal with real world situations.
There are certainly physicians who have made the decision to focus their careers on the statistical part of the research process, forsaking any significant clinical patient care role. They are few and far between, in my experience, though two or three immediately come to mind. They have also generally made the commitment to formal graduate-level education in math/statistics, securing advanced degrees. Short of that, there is typically a future dependence upon trained statisticians, either within an academic medical environment or via contracted services. The above is based upon my own experience, which is largely in sub-specialty clinical areas. Others may and perhaps will differ, based upon their own bias. HTH, Marc Schwartz __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] paired t-test vs pairwise t-test
On Thu, 2004-08-19 at 14:42, Liaw, Andy wrote: From: Duncan Murdoch On Thu, 19 Aug 2004 13:42:21 -0300 (ADT), Rolf Turner [EMAIL PROTECTED] wrote : You wrote: What's the difference between t.test(x, y) and pairwise.t.test()? Is it just that the former takes two vectors, whereas the latter takes a vector and a factor? No. The pairwise.t.test() function (according to the help file) does a multiplicity of t-tests, on more than two samples, adjusting the p-value to compensate for the multiplicity by various methods. IMHO the name of this function is bad, because to me it suggests doing ***paired*** t-tests, which would trip up the naive user, who probably wouldn't notice or would ignore the t tests with pooled SD message in the output. As one of the Ripley fortunes says ``It really is hard to anticipate just how silly users can be.'' But why go out of the way to give them a chance to be silly? And Jack wrote: But the documentation, which I valiantly tried to make sense of BEFORE asking my stupid question, is not clear enough for this particular idiot. Might I suggest that the documentation be altered? It could use an example (as in, real-life applied statistical problem) of when pairwise.t.test() ought to be used, and why t.test(paired=TRUE) would be inappropriate in that context; it could also use a reference to some published paper, website or some such that explains the rationale and correct procedure for using this test. I think it's unlikely that we would rename the function; it's been around a while with its current name so that's a bad idea. On the other hand, clearer documentation is always a plus: why not submit some? I guess this is sort of related to the thread on whether R is good for non-statisticians... The help pages in R are sort of like *nix man pages. They give the technical information about the topic, but not necessarily the background. 
E.g., the man page for `chmod' does not explain file permissions in detail: the user is expected to learn that elsewhere. Perhaps other stat packages do it differently? Do the SPSS manuals detail what its t-test procedure does, including which t-test(s) it does and when it's appropriate? That might make it easier on users, but I still think the users should learn the appropriate use of statistical procedures elsewhere... Best, Andy Andy, I don't know about SPSS, but SAS' documentation is available online at: http://support.sas.com/91doc/docMainpage.jsp The documentation specifically for PROC TTEST is at: http://support.sas.com/91doc/getDoc/statug.hlp/ttest_index.htm and the documentation for PROC MULTTEST is at: http://support.sas.com/91doc/getDoc/statug.hlp/multtest_index.htm Of course, to go along with the standard SAS documentation, there is the line of Books by Users, which parallels, in a fashion, the increasing number of books on R authored by members of this community. Best regards, Marc __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
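A small simulated sketch (invented data, not from this thread) makes the distinction concrete: t.test(..., paired = TRUE) performs a single test on the within-pair differences, while pairwise.t.test() performs all unpaired two-group comparisons among the levels of a factor, adjusting the p-values for multiplicity (Holm's method by default):

```r
set.seed(1)

## Paired design: one test on the 10 within-subject differences
before <- rnorm(10, mean = 100, sd = 10)
after  <- before + rnorm(10, mean = 5, sd = 2)
t.test(before, after, paired = TRUE)$p.value

## pairwise.t.test(): all two-group (unpaired) comparisons among
## more than two groups, p-values adjusted for multiplicity
values <- c(before, after, after + 5)
groups <- factor(rep(c("A", "B", "C"), each = 10))
pairwise.t.test(values, groups)$p.value  # matrix of adjusted p-values
```

Note the two functions answer different questions: the first tests the mean of paired differences; the second screens several group contrasts at once.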
Re: [R] How generate A01, A02, ..., A99?
On Fri, 2004-08-20 at 15:15, Peter Dalgaard wrote: Sundar Dorai-Raj [EMAIL PROTECTED] writes: Yao, Minghua wrote: Hi, Anyone can tell me how to generate A01, A02, ..., A99? paste("A", 1:99, sep="") generates "A1", "A2", ..., "A99". This is not what I want. Thanks for the help. -MY [[alternative HTML version deleted]] How about? sapply(1:99, function(i) sprintf("A%02d", i)) or just sapply(1:99, sprintf, fmt="A%02d") or yet another variation: paste("A", formatC(1:99, width = 2, format = "d", flag = "0"), sep = "") HTH, Marc Schwartz __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
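An aside worth noting: sprintf() is itself vectorized over its arguments, so the sapply() wrapper in the variants above can be dropped entirely, and the result matches the formatC() approach:

```r
## sprintf() recycles its arguments, so no explicit loop is needed
ids <- sprintf("A%02d", 1:99)
head(ids)  # "A01" "A02" "A03" "A04" "A05" "A06"

## identical to the formatC() variant
identical(ids, paste("A", formatC(1:99, width = 2, format = "d", flag = "0"), sep = ""))
```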
Re: [R] where is internal function of sample()?
On Mon, 2005-04-11 at 23:04 -0400, Weijie Cai wrote: Hi there, I am trying to write a c++ shared library for R. I need a function which has the same functionality as sample() in R, i.e., does permutation, sample with/without replacement. Does R have internal sample routine so that I can call it directly? I did not find it in R.h, Rinternal.h. Thanks A quick grep of the source code tree tells you that the function is in .../src/main/random.c A general pattern for C .Internal functions is to use a prefix of do_ in conjunction with the R function name. So in this case, the C function is called do_sample and begins at line 391 (for 2.0.1 patched) in the aforementioned C source file. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] removing characters from a string
On Tue, 2005-04-12 at 05:54 -0700, Vivek Rao wrote: Is there a simple way in R to remove all characters from a string other than those in a specified set? For example, I want to keep only the digits 0-9 in a string. In general, I have found the string handling abilities of R a bit limited. (Of course it's great for stats in general). Is there a good reference on this? Or should R programmers dump their output to a text file and use something like Perl or Python for sophisticated text processing? I am familiar with the basic functions such as nchar, substring, as.integer, print, cat, sprintf etc. Something like the following should work: x <- paste(sample(c(letters, LETTERS, 0:9), 50, replace = TRUE), collapse = "") x [1] "QvuuAlSJYUFpUpwJomtCir8TfvNQyV6O7W7TlXSXlLHocCdtnV" gsub("[^0-9]", "", x) [1] "8677" The use of gsub() here replaces any characters NOT in 0:9 with "", therefore leaving only the digits. See ?gsub for more information. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
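The same negated-character-class idea generalizes to any "keep only these characters" task; a small illustration with a made-up string:

```r
## Anything NOT in the bracketed set is deleted; everything else stays
s <- "Order #42-17, total $3.50"
gsub("[^0-9]", "", s)        # keep digits only:      "4217350"
gsub("[^0-9.]", "", s)       # keep digits and dots:  "42173.50"
gsub("[^[:alpha:]]", "", s)  # keep letters only:     "Ordertotal"
```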
Re: [R] Cumulative Points and Confidence Interval Manipulation in barplot2
On Tue, 2005-04-12 at 10:14 -0500, Bret Collier wrote: R-Users, I am working with gplots (in gregmisc bundle) plotting some posterior probabilities (using barplot2) of harvest bag limits for discrete data (x-axis from 0 to 12, data is counts) and I ran into a couple of questions whose solutions have evaded me. 1) When I create and include the confidence intervals, the lower bound of the confidence intervals for several of the posterior probabilities is below zero, and in those specific cases I only want to show the upper limit for those CI's so they do not extend below the x-axis (as harvest can not be < 0). Also, comments on a better technique for CI construction when the data is bounded to be >= 0 would be appreciated. 2) I would also like to show the cumulative probability (as say a point or line) across the range of the x-axis on the same figure at the top, but I have been unable to figure out how to overlay a set of cumulative points over the barplot across the same range as the x-axis. 
Below is some example code showing the test data I am working on (xzero): xzero <- table(factor(WWNEW[HUNTTYPE=="DOVEONLY"], levels=0:12)) xzero 0 1 2 3 4 5 6 7 8 9 10 11 12 179 20 9 2 2 0 1 0 0 0 0 0 0 n <- sum(xzero) k <- sum(table(xzero)) meantheta1 <- ((2*xzero + 1)/(2*n + k)) vartheta1 <- ((2*(((2*n)+k)-((2*xzero)+1)))*((2*xzero)+1))/((((2*n)+k)^2)*(((2*n)+k)+2)) stderr <- sqrt(vartheta1) cl.l <- meantheta1-(stderr*2) #Fake CI: Test cl.u <- meantheta1+(stderr*2) #Fake CI: Test barplot2(meantheta1, xlab="WWD HARVEST DOVE ONLY 2001", ylab="Probability", ylim=c(0, 1), xpd=F, col="blue", border="black", axis.lty=1, plot.ci=TRUE, ci.u = cl.u, ci.l = cl.l) title(main="WHITE WING DOVE HARVEST PROBABILITIES: DOVE HUNT ONLY") I would greatly appreciate any direction or assistance, Thanks, Bret Bret, If you replace the lower bound of your confidence intervals as follows, you can get just the upper bound plotted: cl.l.new <- ifelse(cl.l >= 0, cl.l, meantheta1) This will set the lower bound to meantheta1 in those cases, thus plotting the upper portion, and you can remove the 'xpd=F' argument. Use 'ci.l = cl.l.new' here: barplot2(meantheta1, xlab="WWD HARVEST DOVE ONLY 2001", ylab="Probability", ylim=c(0, 1), col="blue", border="black", axis.lty=1, plot.ci=TRUE, ci.u = cl.u, ci.l = cl.l.new) I would defer to others with more Bayesian experience on alternatives for calculating bounded CI's for the PP's. With respect to the cumulative probabilities, if I am picturing the same thing you are, you can use the cumsum() function and then add points and/or a line as follows: points(cumsum(meantheta1), pch = 19) lines(cumsum(meantheta1), lty = "solid") See ?cumsum, ?points and ?lines for more information. BTW, some strategically placed spaces would help make your code a bit more readable for folks. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
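One refinement on the overlay: barplot() (and barplot2() in gplots) invisibly returns the x-coordinates of the bar midpoints, which are not simply 1:n; capturing that return value keeps the overlaid points and line aligned with the bars. A minimal base-graphics sketch with made-up probabilities (not the poster's data):

```r
## Hypothetical probabilities for illustration only
p <- c(0.70, 0.15, 0.08, 0.05, 0.02)

## barplot() returns the bar midpoints; use them as the x-coordinates
mids <- barplot(p, names.arg = 0:4, ylim = c(0, 1))
points(mids, cumsum(p), pch = 19)
lines(mids, cumsum(p), lty = "solid")
```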
Re: [R] R in Windows
On Wed, 2005-04-13 at 10:51 -0400, George Kelley wrote: Has anyone tried to create dialog boxes for Windows in R so that one doesn't have to type in so much information but rather enter it in a menu-based format. If not, does anyone plan on doing this in the future if it's possible? Thanks. George (Kelley) There are a variety of GUI's being actively developed for R. More information is here: http://www.sciviews.org/_rgui/ I don't use it actively, but I might specifically suggest that you review John Fox' R Commander: http://socserv.mcmaster.ca/jfox/Misc/Rcmdr/ It is written in tcl/tk, which makes it cross-platform compatible if that is an issue for you. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] terminate R program when trying to access out-of-bounds arrayelement?
On Wed, 2005-04-13 at 15:03 -0700, Berton Gunter wrote: WHOA! Do not redefine R functions (especially [ !) in this way! That's what R classes and methods (either S3 or S4) are for. Same applies to print methods. See the appropriate sections of the R language definition and the book S PROGRAMMING by VR. Please do not offer advice of this sort if you are not knowledgeable about R/S Programming, as it might be taken seriously. I think that we have another entry for the fortunes package... :-) Best regards, Marc __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html