I got into R over most of the Christmas vacation and was about to ask for
helpful pointers on how to get a hold of R when I came across this thread.
I've read through most of it and would like to comment from a novice user's
point of view. I have a strong programming background but limited
statistical experience and no knowledge of competing packages. I work as a
senior engineer in electronics.
Yes, the learning curve is steep. Most of the documentation is extremely
terse. Learning is mostly from examples (a wiki was proposed in another
mail...), and the documentation uses no graphical elements at all. So, when
it comes to things like xyplot in lattice: where would I get the concepts
behind panels, superpanels, and the like?
OK, this is steep and terse, but after a while I'll get over it... That's
life. The general concept is great, and things can be expressed very
densely: the potential is there... I quickly had 200 lines of my own code
together, doing what it should - or so I believed.
Next I did:
matrix<-matrix(1:100, 10, 10)
image(matrix)
locator()
Great: I can interactively work with my graphs... But then:
filled.contour(matrix)
locator()
Oops - wrong coordinates returned. Bug. Apparently, locator() doesn't realize
that filled.contour() has a color bar to the right, and it scales x wrongly...
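For what it's worth, the help page for filled.contour() in current R
versions describes the output as two plots (the contour area and the key)
whose coordinate systems are used only internally and are lost once the
function returns; it suggests passing annotation commands through the
plot.axes argument instead. A minimal sketch under that reading follows. The
x and y grids on [0, 1], the null PDF device, and the name usr_inside are
assumptions introduced only for this illustration.

```r
# Same 10 x 10 matrix as above, with an explicit (made-up) grid on [0, 1]
z <- matrix(1:100, 10, 10)
x <- seq(0, 1, length.out = nrow(z))
y <- seq(0, 1, length.out = ncol(z))

pdf(NULL)  # null device: nothing is written to disk
filled.contour(x, y, z,
               plot.axes = {
                 axis(1)
                 axis(2)
                 # Runs while the contour area's coordinates are active
                 points(0.5, 0.5, pch = 19)
                 usr_inside <- par("usr")  # record the coordinate range
               })
dev.off()
```

Commands placed in plot.axes are evaluated while the contour area is the
active plot, so annotations given there use the contour area's coordinates.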
Here is what really shocked me:
> str(bar)
`data.frame': 206858 obs. of 12 variables:
...
> str(mean(bar[,6:12]))
Named num [1:7] 1.828 2.551 3.221 1.875 0.915 ...
...
> str(sd(bar[,6:12]))
Named num [1:7] 0.0702 0.1238 0.1600 0.1008 0.0465 ...
...
> prcomp(bar[,6:12])->foo
> str(foo$x)
num [1:206858, 1:7] -0.4187 -0.4015 0.0218 -0.4438 -0.3650 ...
...
> str(mean(foo$x))
num -1.07e-13
> str(sd(foo$x))
Named num [1:7] 0.32235 0.06380 0.02254 0.00337 0.00270 ...
...
So sd returns a vector regardless of whether the argument is a matrix or a
data.frame, but mean behaves differently and returns a vector only for a
data.frame? The problem here is not that this is difficult to learn - the
problem is the complete absence of a concept. Is a data.frame an 'extended'
matrix with columns of different types, or something different? Since the
scalar mean (where I expected a vector) is recycled nicely when used in a
vector context, this makes debugging code close to impossible. Since sd
returns a vector, things like mean + 4*sd vary sufficiently across the data
elements that I assume the code is working... I don't get any warning signal
that something is wrong here.
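One way to sidestep the ambiguity is to ask for column-wise results
explicitly. The following is a minimal sketch, not a statement of how the
session above should have been written: the toy data frame df stands in for
columns 6:12 of bar, and all variable names are made up for illustration. It
avoids calling mean() or sd() directly on a data frame, since that behavior
differs between R versions.

```r
# Toy data standing in for columns 6:12 of 'bar' (made-up values)
set.seed(1)
df <- data.frame(a = rnorm(50, 0), b = rnorm(50, 2), c = rnorm(50, 5))
m  <- as.matrix(df)

# Column-wise statistics, requested explicitly
df_means <- sapply(df, mean)   # one value per data.frame column
df_sds   <- sapply(df, sd)
m_means  <- colMeans(m)        # one value per matrix column
m_sds    <- apply(m, 2, sd)

# mean() on a numeric matrix averages over all elements: a single number
grand_mean <- mean(m)

# A length-1 value is recycled against a length-3 vector without a warning
silent <- grand_mean + 4 * m_sds   # length 3, but not column-wise limits
limits <- m_means + 4 * m_sds      # the column-wise limits that were meant
```

Both silent and limits have one entry per column, which is why the mistake
is easy to overlook; only limits uses a separate mean for each column.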
A case in point is the behavior of locator() on a filled.contour() plot:
things apparently have been programmed and debugged from example rather than
from concept.
Now, in another posting I read that all this is a feature to discourage
inexperienced users from statistics and force you to think before you do
things. Whilst I support this concept of thinking: did I miss something in
statistics? I was under the impression that mean and sd were relatively
close to each other conceptually... (here, they are even in different
packages...)
I will continue using R for the time being. But whether I can recommend it
to my work colleagues remains to be seen: how could I ever trust the results
returned? I'm still impressed by some of the efficiency, but my trust is
deeply shaken...
--------------------------------------------------------------------------------------------------------
Stefan Eichenberger mailto:[EMAIL PROTECTED]
--------------------------------------------------------------------------------------------------------