Re: [R] R-help Digest, Vol 36, Issue 21

2006-02-21 Thread Evgeniy Kachalin
Hello, dear R users.

I have already sent a question here, but I am not sure it was read.

I need to visualize a classification of my numerical data based on 2-3 
factors. I suppose the best way is a tree, with an arbitrary function at 
the ends (leaves), or at least with the means of my data at the ends.

What is the way to do it? As far as I can tell, ctree offers binary 
classification, but is that the only way? Of course, a tree is not the 
only option; maybe you could suggest other ways.

Thank you.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] R-help Digest, Vol 36, Issue 21

2006-02-21 Thread Evgeniy Kachalin
Evgeniy Kachalin wrote:
 Hello, dear R users.
 
 I have already sent a question here, but I am not sure it was read.
 
 I need to visualize a classification of my numerical data based on 2-3 
 factors. I suppose the best way is a tree, with an arbitrary function at 
 the ends (leaves), or at least with the means of my data at the ends.
 
 What is the way to do it? As far as I can tell, ctree offers binary 
 classification, but is that the only way? Of course, a tree is not the 
 only option; maybe you could suggest other ways.
 
Or perhaps the best way is something like a 'heatmap', but with 
means in the cells instead of colors, if that is possible.
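
A minimal sketch of the "means in the cells" idea, assuming a hypothetical data frame `d` with a numeric column `value` and two factors `f1` and `f2` (all of these names and numbers are made up for illustration). It only uses base R's tapply() and aggregate():

```r
# Hypothetical data: one numeric variable classified by two factors
d <- data.frame(
  value = c(10, 12, 20, 22, 30, 34, 40, 44),
  f1    = factor(c("a", "a", "b", "b", "a", "a", "b", "b")),
  f2    = factor(c("x", "x", "x", "x", "y", "y", "y", "y"))
)

# Matrix of cell means: rows are levels of f1, columns are levels of f2
cell_means <- tapply(d$value, list(d$f1, d$f2), mean)
print(cell_means)

# Any other summary function can be put "at the leaves" the same way
cell_sd <- tapply(d$value, list(d$f1, d$f2), sd)

# A flat (long) version; a third factor can be added to the formula
flat_means <- aggregate(value ~ f1 + f2, data = d, FUN = mean)
print(flat_means)
```

The matrix form reads like a heatmap with numbers in place of colors; the long form is probably easier to extend to three factors.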

Sorry for the second message.



Re: [R] R-help Digest, Vol 35, Issue 7

2006-01-08 Thread Evgeniy Kachalin
Uwe Ligges wrote:
 Evgeniy Kachalin wrote:
 
 Hello, dear participants!

 Could you tell me whether there is a simple and nice way to build a 
 scatter plot for three different types of data (with '.', 'o' and '*' 
 symbols, for example) with a legend?

 For now I can only think of this way:

 plot(x~y,data=subset(mydata,factor1=='1'), pch='.',col='blue')
 points(x~y,data=subset(mydata,factor1=='2'), pch='*',col='green')
 points( etc

 What is the simple and nice way?
 Thank you very much for your kindness and help.

 
 
 Example:
 
 
 with(iris,
   plot(Sepal.Length, Sepal.Width, pch = as.integer(Species)))
 with(iris,
   legend(7, 4.4, legend = unique(as.character(Species)),
          pch = unique(as.integer(Species))))
 

Uwe, sorry for my stupid question. You mean that when pch is derived 
from a factor, plot can take it point by point and use it to choose 
the marks.

Then pch=as.integer(Species) gives the codes 1, 2, 3 for the 3 factor 
levels. But I need symbols 15, 16, 17 and colors red, blue, green.

So then I do:
spec.symb <- iris$Species
spec.col <- iris$Species
levels(spec.symb) <- c(15,16,17)
levels(spec.col) <- c('red','green','blue')

Is that the only way?
What is more, 'plot' does not accept factors in 'pch'. So it has to be:
plot(x~y,data, pch=as.integer(as.character(spec.symb))).
That's totally crazy...
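
A shorter route may be to index small lookup vectors by the integer codes of the factor. This is only a sketch on the built-in iris data; the names `symbols`, `colors`, `point_pch` and `point_col` are made up here:

```r
# Lookup tables: one entry per level of Species, in level order
symbols <- c(15, 16, 17)
colors  <- c("red", "green", "blue")

# as.integer() on a factor gives the level index (1, 2 or 3) for every row,
# so indexing the lookup tables yields one symbol and one color per point
idx       <- as.integer(iris$Species)
point_pch <- symbols[idx]
point_col <- colors[idx]

plot(iris$Sepal.Length, iris$Sepal.Width, pch = point_pch, col = point_col)
legend("topright", legend = levels(iris$Species), pch = symbols, col = colors)
```

This avoids copying the factor and relabelling its levels, and pch stays numeric throughout.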

-- 
Evgeniy


Re: [R] R-help Digest, Vol 35, Issue 7

2006-01-07 Thread Evgeniy Kachalin
Hello, dear participants!

Could you tell me whether there is a simple and nice way to build a 
scatter plot for three different types of data (with '.', 'o' and '*' 
symbols, for example) with a legend?

For now I can only think of this way:

plot(x~y,data=subset(mydata,factor1=='1'), pch='.',col='blue')
points(x~y,data=subset(mydata,factor1=='2'), pch='*',col='green')
points( etc

What is the simple and nice way?
Thank you very much for your kindness and help.

-- 
Evgeniy Kachalin



Re: [R] Mass 'identify' on 2d-plot

2006-01-07 Thread Evgeniy Kachalin
Barry Rowlingson wrote:
 Evgeniy Kachalin wrote:
 
 What facility does R have for defining an area graphically (with the 
 mouse) and selecting all the cases that fall inside it?

 'identify' is OK for 5-10 cases, but what if cases=1000?
 
 
  You can use 'locator' to let the user click a number of points to 
 define a polygon, and then use one of the point-in-polygon functions 
 provided by one of the spatial packages to work out what's in your polygon.
 
  Look at splancs, spatstat, sp - pretty much anything beginning with 
 'sp' - on CRAN.
 
  In splancs you can just do:
 
  poly = getpoly()
 
  - which lets the user draw a polygon on screen, then:
 
  inPoly = inpip(xypts,poly)
  points(xypts[inPoly,], pch=19,col="red")
 
  and that will plot the selected points in solid red dots.
 
  I don't think there's a way to draw a freehand figure on an R plot, you 
 have to go click, click, click, and draw straight lines.
 

I don't understand what 'xypts' is in this case... One step earlier I 
plotted plot(y~x,data=dat). What is xypts?
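
My current guess, written as a sketch: 'xypts' is probably just a two-column matrix of the plotted coordinates. In the code below the data frame `dat` is invented, the fixed `poly` matrix stands in for what getpoly() or locator() would return interactively, and `in_polygon` is a hypothetical base-R replacement for a package point-in-polygon function (even-odd ray casting), so it runs without splancs:

```r
# Hypothetical data standing in for 'dat' from plot(y ~ x, data = dat)
dat <- data.frame(x = c(0.5, 2.0, 1.0, 3.5),
                  y = c(0.5, 2.0, 1.5, 0.5))

# xypts: two-column matrix of the plotted coordinates (x first, then y)
xypts <- cbind(dat$x, dat$y)

# Fixed polygon (one vertex per row) instead of an interactively drawn one
poly <- rbind(c(0, 0), c(2.5, 0), c(2.5, 2.5), c(0, 2.5))

# Even-odd ray casting: a point is inside if a horizontal ray from it
# crosses the polygon boundary an odd number of times
in_polygon <- function(pts, poly) {
  n <- nrow(poly)
  inside <- logical(nrow(pts))
  j <- n
  for (i in seq_len(n)) {
    xi <- poly[i, 1]; yi <- poly[i, 2]
    xj <- poly[j, 1]; yj <- poly[j, 2]
    crosses <- ((yi > pts[, 2]) != (yj > pts[, 2])) &
      (pts[, 1] < (xj - xi) * (pts[, 2] - yi) / (yj - yi) + xi)
    inside <- xor(inside, crosses)
    j <- i
  }
  inside
}

selected <- in_polygon(xypts, poly)
print(dat[selected, ])

# On an open plot the selected cases could then be highlighted with:
# points(xypts[selected, , drop = FALSE], pch = 19, col = "red")
```

With 1000 cases the same logical vector `selected` can be used to subset the data frame directly.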

-- 
Evgeniy Kachalin


[R] Mass 'identify' on 2d-plot

2005-12-05 Thread Evgeniy Kachalin
Hello, dear R-users.

I have a 2-d scatter plot of two variables, x and y. The points on this 
plot form groups that a human can recognise. These groups are neither 
round nor oval; they are free-form, but still recognisable by an operator.

What facility does R have for defining an area graphically (with the 
mouse) and selecting all the cases that fall inside it?

'identify' is OK for 5-10 cases, but what if there are 1000 cases?

Thank you very much for your advice.

-- 
Evgeniy Kachalin



Re: [R] Mass 'identify' on 2d-plot

2005-12-05 Thread Evgeniy Kachalin
Duncan Temple Lang wrote:
 Barry Rowlingson wrote:
 
Evgeniy Kachalin wrote:



What facility does R have for defining an area graphically (with the 
mouse) and selecting all the cases that fall inside it?

'identify' is OK for 5-10 cases, but what if cases=1000?


  You can use 'locator' to let the user click a number of points to 
define a polygon, and then use one of the point-in-polygon functions 
provided by one of the spatial packages to work out what's in your polygon.

  Look at splancs, spatstat, sp - pretty much anything beginning with 
'sp' - on CRAN.

  In splancs you can just do:

  poly = getpoly()

  - which lets the user draw a polygon on screen, then:

  inPoly = inpip(xypts,poly)
  points(xypts[inPoly,], pch=19,col="red")

  and that will plot the selected points in solid red dots.

  I don't think there's a way to draw a freehand figure on an R plot, 
you have to go click, click, click, and draw straight lines.
 
 
 
 FWIW, there is an experimental package on the
 Omegahat site (and repository) named
 
RGtkIPrimitives
 
 that works with the gtkDevice only to do
 rubber banding and free form region identification.
 

How do I get a GTK device on my WinXP? No way? ;)

-- 
Evgeniy


[R] Impaired boxplot functionality - mean instead of median

2005-12-01 Thread Evgeniy Kachalin
Hello to all users and wizards.

I regularly use the 'boxplot' function or its analogue, 'bwplot' from 
the 'lattice' library. But as far as I understand, they are badly 
limited in functionality: they lack the ability to choose what is 
drawn 'in the middle' (median or mean), what the box represents 
(standard error, 90% or something else), and what the whiskers 
represent (100%, 99% or something else).
Is there any way to achieve this? Or is there any other good data 
visualization function for comparing the means of several data groups? 
Ideally I would like a slightly more customisable function for doing 
that. For example, something like boxplot(a~b,data=d,mid='mean').


-- 
Evgeniy, ICQ 38317310.



Re: [R] Impaired boxplot functionality - mean instead of median

2005-12-01 Thread Evgeniy Kachalin
Martin Maechler wrote:
 Boxplots were invented by John W. Tukey and I think should be
 counted among the top small but smart achievements from the
 20th century.  Very wisely he did *not* use mean and standard deviations.
 
 Even though it's possible to draw boxplots that are not boxplots
 (and people only recently explained how to do this with R on this
  mailing list), I'm arguing very strongly against this.
 
 If I see a boxplot - I'd want it to be a boxplot and not have
 the silly (please excuse)  10%-90% whiskers  which
 declare 20% of the points as outliers {in the boxplot sense}.
 
 If you want the mean +/- sd plot, do *not* misuse boxplots
 for them, please! 
 

I analyze genetics data. I have a factor (gene variant, c(1,2,3))
and a quantitative variable corresponding to that factor. How do I
visualize this situation and compare the means of the samples
corresponding to the factor values?

If boxplot supported 'mean-in-the-middle', it would fit my needs
ideally. How do I make a mean +/- SD plot?

There is also the option of rewriting boxplot.stats and replacing
fivenum there with a self-made function. Then I would need to write a
self-made boxplot.formula (or boxplot.default?) function, and none of
this would be configurable. I'm still a novice in R, so I need a simple
way to pre-visualize my data and estimate an approximate result.
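
For reference, a mean +/- SD plot can be sketched in base R alone. The data frame `d`, its columns `variant` and `trait`, and the simulated values are all hypothetical; the error bars are drawn with arrows(), which is one common way to do it and not the only one:

```r
# Hypothetical data: a quantitative trait and a three-level gene variant
set.seed(1)
d <- data.frame(
  variant = factor(rep(1:3, each = 10)),
  trait   = c(rnorm(10, mean = 5), rnorm(10, mean = 6), rnorm(10, mean = 8))
)

# Group-wise mean and standard deviation
m <- tapply(d$trait, d$variant, mean)
s <- tapply(d$trait, d$variant, sd)
x <- seq_along(m)

# Means as points, with mean - SD and mean + SD as vertical error bars
plot(x, m, pch = 19, xaxt = "n",
     xlim = c(0.5, length(m) + 0.5),
     ylim = range(c(m - s, m + s)),
     xlab = "gene variant", ylab = "trait (mean +/- SD)")
axis(1, at = x, labels = names(m))
arrows(x, m - s, x, m + s, angle = 90, code = 3, length = 0.05)
```

Replacing `s` with a standard error or a confidence half-width changes what the bars mean without touching the plotting code.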



-- 
Evgeniy, ICQ 38317310.


Re: [R] Impaired boxplot functionality - mean instead of median

2005-12-01 Thread Evgeniy Kachalin
Marc Schwartz (via MN) wrote:
 On Thu, 2005-12-01 at 19:40 +0300, Evgeniy Kachalin wrote:
 
Martin Maechler wrote:

I analyze genetics data. I have a factor (gene variant, c(1,2,3))
and a quantitative variable corresponding to that factor. How do I
visualize this situation and compare the means of the samples
corresponding to the factor values?

If boxplot supported 'mean-in-the-middle', it would fit my needs
ideally. How do I make a mean +/- SD plot?

There is also the option of rewriting boxplot.stats and replacing
fivenum there with a self-made function. Then I would need to write a
self-made boxplot.formula (or boxplot.default?) function, and none of
this would be configurable. I'm still a novice in R, so I need a simple
way to pre-visualize my data and estimate an approximate result.
 
 
 If you want means and SDs, you might want to look at:
 
 1. plotCI() and plotmeans() in the gplots package

So plotmeans cannot handle a two-factor formula the way 
boxplot(numerical~fact1+fact2) does. Is there any way around this?

-- 
Evgeniy


Re: [R] Impaired boxplot functionality - mean instead of median

2005-12-01 Thread Evgeniy Kachalin
Marc Schwartz (via MN) wrote:
Marc Schwartz (via MN) wrote:

So plotmeans cannot handle a two-factor formula the way 
boxplot(numerical~fact1+fact2) does. Is there any way around this?
 
 
 I think that somehow we are talking past each other here.
 
 plotmeans() does what it is designed to do, which is to simplify the
 process of plotting group-wise point estimates and user defined error
 bars/intervals around the point estimates.
 
 In your case, these intervals would be standard deviations around each
 of the group means as you have indicated.
 
 Review the examples in ?plotmeans.
 
 As Martin and others have pointed out, you need to remove boxplots from
 the equation here, as they were not designed to plot means and standard
 deviations.
 

Again, what I'm talking about: plotmeans is incapable of handling such a
formula. For example, I have two factors: A (a, b, c) and B (d, e, f).

If I plot boxplot(num~A+B), what do I get? Nine boxes: ad, ae, af, bd,
be, bf, cd, ce, cf. If I plot plotmeans(num~A+B), what do I get?
Nothing, because plotmeans cannot combine two factors in their various
combinations. Is there a simple way to do it?

Anyway... That's the wrong way; all that is necessary is a boxplot
with the mean instead of the median. Is there a simple way to do it?

Statistical software like Statistica 7.0 offers any possible combination
of what a boxplot could mean. Is it possible to have just this one
modification to R's boxplot?
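
One middle ground, sketched below, leaves the standard boxplot untouched and overlays the group means as extra points instead of replacing the median. The data frame `d` and its simulated values are hypothetical; the sketch relies on boxplot(num ~ A + B) and interaction(A, B) ordering the nine combinations the same way:

```r
# Hypothetical data: two three-level factors, five observations per combination
set.seed(42)
d <- expand.grid(A = c("a", "b", "c"), B = c("d", "e", "f"), rep = 1:5)
d$num <- rnorm(nrow(d), mean = as.integer(d$A) + 2 * as.integer(d$B))

# Ordinary boxplot with one box per A/B combination (nine boxes)
b <- boxplot(num ~ A + B, data = d)

# Means for the same nine groups
m <- tapply(d$num, interaction(d$A, d$B), mean)

# Overlay each mean on its box as a red diamond
points(seq_along(m), m, pch = 18, col = "red")
```

The median line stays where a reader expects it, and the distance between line and diamond hints at asymmetry in each group.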

Thank you for your kind answers.
Also, please tell me where I should send replies: to the list address
or to those who answer me directly?

-- 
Evgeniy


Re: [R] Impaired boxplot functionality - mean instead of median

2005-12-01 Thread Evgeniy Kachalin
Frank E Harrell Jr wrote:
 Evgeniy Kachalin wrote:
 
 Marc Schwartz (via MN) wrote:

 Marc Schwartz (via MN) wrote:


 
 library(Hmisc)
 library(lattice)
 ?panel.bpplot
 
 bwplot(, panel=panel.bpplot)
 
 By default, panel.bpplot shows the mean (dot) and median (line) plus 
 several quantiles.  To bother Martin in a friendly way, I think that 
 means  can be useful additions - not that they are so useful by 
 themselves, but that when they differ a lot from the median, 
 non-statisticians gain further information about asymmetry.  Also, even 
 though the simple box plot is elegant, I sometimes think it has a high 
 ink to information ratio.  I have gained a lot from seeing outer 
 quantiles on the plot, and I don't like to show outer points for fear of 
 someone labeling them outliers.  For describing raw data distributions, 
 I never find standard deviations useful, however.
 

> fa
    doz fabp2
1   900     2
4  1500     2
6  1000     2
8   750     3
10  750     1
11 1750     2
12  500     3

> bwplot(doz~factor(fabp2),data=fa,panel=panel.bpplot)
Error in sort(x, partial = unique(c(lo, hi))) :
 unsupported options for partial sorting


That's NOT a simple way.

I need just one change.
Is there any good way?
$-(

-- 
Evgeniy


Re: [R] Impaired boxplot functionality - mean instead of median

2005-12-01 Thread Evgeniy Kachalin
Wiener, Matthew wrote:
 interaction(A, B) will create a single factor made up of the combinations of
 the two factors A and B.  Perhaps that would let you use plotmeans.
 
 Hope this helps,
 
 Matt Wiener

So you think plotmeans(num~interaction(A,B)) will work? How? There is NO
'num' data for a.d, a.e, a.f etc.
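
For what it is worth, interaction() does not need separate 'num' data per combination: it builds one combined label for every existing row, so the same 'num' vector is simply regrouped. A small sketch with invented vectors `A`, `B` and `num`:

```r
# Hypothetical data: two factors and one numeric vector of the same length
A   <- factor(c("a", "a", "b", "b", "a", "a", "b", "b"))
B   <- factor(c("d", "d", "d", "d", "e", "e", "e", "e"))
num <- c(1, 3, 5, 7, 2, 4, 6, 8)

# One combined label per observation, e.g. "a.d" for the first row
AB <- interaction(A, B)
print(data.frame(A, B, AB, num))

# Group means of the same 'num' vector, now split by the combined factor
group_means <- tapply(num, AB, mean)
print(group_means)

# With the gplots package installed, the combined factor could be used as
# plotmeans(num ~ AB)   # not run here
```

Combinations that never occur in the data remain as empty levels; droplevels(AB) removes them if they get in the way.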


Re: [R] Impaired boxplot functionality - mean instead of median

2005-12-01 Thread Evgeniy Kachalin
Austin, Matt wrote:
 Check your syntax on the bwplot call.
 
 fa <- data.frame(doz=sample(500:2000, size=500), fabp2=rep(1:20, 25))
 
 bwplot(factor(fabp2) ~ doz, data=fa, panel=panel.bpplot)

Yes, that's almost the same... But there is a huge amount of information 
on the plot, too much for assessing a rather simple and small 
dataset... And also too much for publication in journals, I think. :|

-- 
Evgeniy
