Re: [R] Latex editor recommendations

2006-10-23 Thread Allen S. Rout
Tom Backer Johnsen [EMAIL PROTECTED] writes:

> This question is not oriented towards R, but is posted here because
> I have the impression that there are at least some Latex users among
> the contributors.  The question is: What editors for Latex are to be
> recommended?  I have located one:


EMACS.  

It's not just an editor, it's a religion.


- Allen S. Rout

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Branching on 'grep' returns...

2006-07-26 Thread Allen S. Rout



Greetings, all.

I'm fiddling with some text manipulation in R, and I've found
something which feels counterintuitive to my PERL-trained senses; I'm
hoping that I can glean new R intuition about the situation.

Here's an example, as concise as I could make it. 


trg <- c("this", "that")

# these two work as I'd expected.
if ( grep("this",trg) ) { cat("Y\n") } else { cat("N\n") }
if ( grep("that",trg) ) { cat("Y\n") } else { cat("N\n") }

# These all fail with error 'argument is of length zero'
# if ( grep("other",trg) ) { cat("Y\n") } else { cat("N\n") }
# if ( grep("other",trg) == TRUE) { cat("Y\n") } else { cat("N\n") }
# if ( grep("other",trg) == 1) { cat("Y\n") } else { cat("N\n") }


# This says that the result is a numeric zero.   Shouldn't I be able
#  to if on that, or at least compare it with a number?
grep("other", trg)

# I eventually decided this worked, but felt odd to me.
if ( any(grep("other",trg))) { cat("Y\n") } else { cat("N\n") }


So, is the 'Wrap it in an any()' just normal R practice, and I'm too
new to know it?  Is there a more fundamental dumb move I'm making?
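
For reference, a minimal sketch of what seems to be going on, assuming current R: grep() returns the indices of the matching elements, so a miss yields a zero-length vector rather than the number zero, and if() cannot branch on a zero-length condition. The names hits, found_by_length and found_by_grepl are made up for illustration.

```r
trg <- c("this", "that")

hits <- grep("other", trg)   # indices of matching elements; none match here
length(hits)                 # the result has no elements at all

# Two ways to branch that do not depend on which indices come back.
found_by_length <- length(grep("other", trg)) > 0
found_by_grepl  <- any(grepl("other", trg))   # grepl() gives one TRUE/FALSE per element

if (found_by_length) cat("Y\n") else cat("N\n")
```

Testing the length (or using grepl(), where available) says "was there any match" directly, whereas if (grep(...)) only happens to work because a non-zero index counts as TRUE.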




- Allen S. Rout



[R] Sort of data.frame yields a thing which is not a data.frame...

2006-06-20 Thread Allen S. Rout


I've observed something I don't understand, and I was hoping someone
could point me to the right section of docs.

There's a portion in one of my analyses in which I am wont to sort a
data.frame so:

seriesS <- seriesS[order(as.Date(row.names(seriesS),format="%m/%d/%Y")),]

So, I've got row.names which are textual representations of dates, I'd
like to retain them as such, but order them datewise.


As long as seriesS has more than one column, this works nicely,
seriesS afterwards is a data.frame with columns similar to those it
had going in.  But if I encounter a case with only one column, the
result is _not_ a data frame, but instead an ? array? list? 

I can solve this with a kluge:

seriesS$stupid <- 0;
seriesS <- seriesS[order(as.Date(row.names(seriesS),format="%m/%d/%Y")),]
seriesS <- seriesS[,-c(which(names(seriesS)=="stupid")) ]

but this mostly tells me I've failed to understand something about how
the process should work.


Any good references to the Chapter and Verse of the Canon of R I
should hit?
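
A minimal sketch of the behaviour in question, using a made-up one-column seriesS: indexing a data.frame with [ simplifies a single-column result to a plain vector unless drop = FALSE is given (see ?"[.data.frame"). The column name bytes and the values are assumptions for illustration only.

```r
# Hypothetical one-column data frame with date strings as row names.
seriesS <- data.frame(bytes = c(30, 10, 20),
                      row.names = c("03/01/2006", "01/15/2006", "02/10/2006"))
ord <- order(as.Date(row.names(seriesS), format = "%m/%d/%Y"))

dropped <- seriesS[ord, ]                 # simplified to a bare numeric vector
kept    <- seriesS[ord, , drop = FALSE]   # stays a data.frame, row names intact

is.data.frame(dropped)
is.data.frame(kept)
```

With drop = FALSE the extra throwaway column should not be needed, whatever the number of columns.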



- Allen S. Rout



Re: [R] R usage for log analysis

2006-06-12 Thread Allen S. Rout
Gabriel Diaz [EMAIL PROTECTED] writes:

> I'm taking an overview to the project documentation, and seems the
> database is the way to go to handle log files of GB order (normally
> between 2 and 4 GB each 15 day dump).
>
> In this document http://cran.r-project.org/doc/manuals/R-data.html,
> says R will load all data into memory to process it when using
> read.table and such. Using a database will do the same? Well,
> currently i have no machine with  2 GB of memory.

Remember, swap too.  This means you're using more time, not running
into a hard limit.

If you're concerned about gross size, then preprocessing could be
useful; but consider: RAM is cheap.  Calibrate RAM purchases
w.r.t. hours of your coding time, -before- you start the project.
Then you can at least mutter to yourself when you waste more than the
cost of core trying to make the problem small. :)

It's entirely reasonable to do all your development work on a smaller
set, and then dump the real data into it and go home.  Unless you've
got something O(N^2) or so, you should be fine.
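
One way to develop on a smaller set, sketched under assumptions: read.table() takes an nrows argument, so the same call can be pointed at the first few lines while the analysis is being written. The file contents and the column names date and bytes below are invented for the example.

```r
# Hypothetical log: a comma-separated file of dates and byte counts.
log_file <- tempfile(fileext = ".csv")
writeLines(sprintf("2006-06-%02d,%d", 1:30, 1:30 * 100L), log_file)

# Develop against the first few rows only...
sample_rows <- read.table(log_file, sep = ",", nrows = 5,
                          col.names = c("date", "bytes"))

# ...then drop nrows to read everything once the analysis is settled.
all_rows <- read.table(log_file, sep = ",", col.names = c("date", "bytes"))

nrow(sample_rows)
nrow(all_rows)
```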


- Allen S. Rout



Re: [R] R usage for log analysis

2006-06-11 Thread Allen S. Rout
Gabriel Diaz [EMAIL PROTECTED] writes:

> and what is the correct path to do it?
>
> I mean, put logs files in a mysql or something like that, and then
> make R use that data, using the data from the files directly?

I haven't stuck anything in a DB yet.  I'm not sure how much of the DB
clue is used under the covers. 

> pre-parse the log files to accommodate them to R?
 
Probably not; a little familiarity with the reading functions will
obviate most needs to pre-parse.


> I need faqs, manuals, books, whatever to learn about this, can anyone
> give some advice?

[...]


Don't expect a warm welcome.  This community is like all open-source
communities, sharply focused on its own concerns and expertise.  And,
in an unusual experience for computer types, our core competencies
hold little or no sway here; they don't even give us much of a leg up.
Just wait 'till you want to do something nutso like produce a business
graphic. :)

I'm working on understanding enough of R packaging and documentation
to begin a 'task view' focused on systems administration, for humble
submission. That might end up being mostly log analysis; the term
can describe much of what we do, if it's stretched a bit.  I'm hoping
the task view will attract the teeming masses of sysadmins trapped in
the mire of Gnuplot and friends.


For starters, become familiar with read.table(); with a few
variations it will take care of all the 

while (<>) { @blah = split(/,/); etc. etc. etc. }

you've been accustomed to doing.  

Name columns;  this makes it easier to think about your data.  

names(my_data) <- c("column","names","can","be","assigned","to")
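
A small sketch of the two steps together, assuming a comma-separated log. The log lines and the column names are invented, and text= stands in for a file name so the example is self-contained.

```r
# Hypothetical comma-separated log lines; a real log would be read by file name.
log_lines <- c("2006-06-11,12:30:05,backup,1048576",
               "2006-06-11,12:31:10,restore,2048")

# One call does the splitting the while/split loop did by hand.
my_data <- read.table(text = log_lines, sep = ",", stringsAsFactors = FALSE)
names(my_data) <- c("date", "time", "action", "bytes")

my_data$bytes   # columns can now be referred to by name
```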

Start thinking of your data in generic sets, as opposed to specific
rows.  Situations which required iteration over specific rows in
PERL-land fall neatly to precise assignment in R.  For example, if
you've got records with dates and times and you want to work with time
structures:

in PERL you'd 

foreach (...)
{$foo->{pdate} = parsedate($foo->{date}." ".$foo->{time})}

or some such.  In R-land, the iteration is implicit.  Here's a snippet
from something I'm using 

a$pdate <- as.POSIXct(paste(format(a$dte,"%Y/%m/%d"),a$time))

You're really acting on logical columns all at once here.  This is
fantastically more efficient in terms of your thought processes.  
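
To make the snippet above runnable on its own, here is a sketch with a made-up two-row data frame a. The values are invented, and tz = "UTC" is an addition of mine so the result does not depend on the local time zone.

```r
# Hypothetical records: a Date column and a time-of-day column kept as text.
a <- data.frame(dte  = as.Date(c("2006-06-10", "2006-06-11")),
                time = c("23:59:00", "00:01:00"),
                stringsAsFactors = FALSE)

# One assignment converts every row; there is no explicit loop.
a$pdate <- as.POSIXct(paste(format(a$dte, "%Y/%m/%d"), a$time), tz = "UTC")

# The new column behaves as a time: differences come out as durations.
difftime(a$pdate[2], a$pdate[1], units = "mins")
```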



- Allen S. Rout



[R] Curve fitting tutorial / clue stick?

2005-11-14 Thread Allen S. Rout


Working through the R archives and webspace, I've mostly proved to myself that
I don't know enough about what statisticians call Curve Fitting to even
begin translating the basics.


I'm a sysadmin, and have collected a variety of measurements of my systems,
and I can draw pretty pictures in R showing what has happened.  People are
happy, customers feel empowered.  Whee!


Now, I want to take my corpus of data and make a prediction based on it.  In
statistics-moron speak, I want to draw a line or a simple curve across my
extant graph, and figure out where the predictive curve passes threshold 'T',
and then graph that too.


I thought I'd be telling R something like:  

- I think this is exponential.  Here's the data.  Give me the best function
  you can come up with, and tell me how good the fit is.

- I think this is quadratic.  Here's the data.  Give me the best function
  you can come up with, and tell me how good the fit is.
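
A rough sketch of what those two requests might look like with lm(), on invented data. The names day, usage and threshold are made up, the exponential guess is fitted as a straight line on the log scale (one common approach, not the only one), and R-squared is only one of several ways to judge a fit.

```r
# Hypothetical measurements: day number and a usage figure that grows over time.
day   <- 1:20
usage <- 5 + 0.3 * day^2 + rep(c(1, -1), 10)   # quadratic trend plus a little wobble

# "I think this is quadratic."
quad_fit <- lm(usage ~ day + I(day^2))
coef(quad_fit)                    # fitted intercept, linear and quadratic terms
summary(quad_fit)$r.squared       # one measure of how good the fit is

# "I think this is exponential."  Fit a straight line to log(usage).
exp_fit <- lm(log(usage) ~ day)
summary(exp_fit)$r.squared        # on the log scale, so not directly comparable

# Extrapolate the quadratic fit and find the first day it passes threshold T.
threshold <- 500
future    <- data.frame(day = 1:60)
predicted <- predict(quad_fit, newdata = future)
first_day <- future$day[which(predicted >= threshold)[1]]
first_day
```

lines(future$day, predicted) would draw the fitted curve over an existing plot of the data.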



Can someone point me at a spot in the docs which might be suitable for my
level of ignorance?  


- Allen S. Rout



[R] Dirty Rotten Hack. (reversing tickmarks on axes?)

2005-06-03 Thread Allen S. Rout

I feel dirty.


I have some graphs I'm building to communicate chargeback rates and service
usage for our backup system here at the University of Florida.  These come
down to daily data points on a graph of number-of-bytes transferred and
stored. 

Since we chargeback on the same basis (price per MB this, price per KB that)
the same chart with a different scale can be used to communicate bytes and
dollars. I set about trying to accomplish this like so:

http://nersp.nerdc.ufl.edu/~asr/media/r-foo/try1.png

Those axes are a little messy. I tried nudging them around

http://nersp.nerdc.ufl.edu/~asr/media/r-foo/try2.png

which is better but not good.  What I really want to do is tell my axis()
function to reverse the tick direction:  put your ticks and labels inside
the graph.  Something like

http://nersp.nerdc.ufl.edu/~asr/media/r-foo/dirty-hack.png

which I accomplished by telling axis() 'line=-47.7'. 

Eugh.

Note that the distance between the left side and the right is different
between the right side to the left. :)   I don't particularly object to this,
when you abuse a tool in this manner you need to expect oddities.

I've wandered through the mailing list logs, and haven't seen reference to
this particular desire.  Am I alone? :) 
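
A possible alternative to the negative line= trick, sketched on made-up numbers: axis() accepts the graphical parameters tcl and mgp, and a positive tcl draws the tick marks inward while a negative second element of mgp pulls the labels inside the plot region. The data, the price per MB and the exact offsets are assumptions, and the offsets would need tuning by eye.

```r
out_file <- tempfile(fileext = ".pdf")   # hypothetical output location
pdf(out_file)

bytes        <- c(120, 340, 560, 410, 690)   # made-up daily transfer figures, in MB
price_per_mb <- 0.002                        # made-up chargeback rate

plot(bytes, type = "b", xlab = "Day", ylab = "MB transferred")

# Second scale on the right: same tick positions, labelled in dollars.
# tcl > 0 points the ticks into the plot; negative mgp[2] moves the labels inside.
ticks <- axTicks(2)
axis(4, at = ticks, labels = sprintf("$%.2f", ticks * price_per_mb),
     tcl = 0.5, mgp = c(3, -1.5, 0))

invisible(dev.off())
```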



- Allen S. Rout
