Re: [R] Latex editor recommendations
Tom Backer Johnsen [EMAIL PROTECTED] writes: This question is not oriented towards R, but is posted here because I have the impression that there are at least some Latex users among the contributors. The question is: What editors for Latex are to be recommended? I have located one: EMACS. It's not just an editor, it's a religion. - Allen S. Rout __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Branching on 'grep' returns...
Greetings, all. I'm fiddling with some text manipulation in R, and I've found something which feels counterintuitive to my PERL-trained senses; I'm hoping that I can glean new R intuition about the situation.

Here's an example, as concise as I could make it.

    trg <- c("this", "that")

    # These two work as I'd expected.
    if ( grep("this", trg) ) { cat("Y\n") } else { cat("N\n") }
    if ( grep("that", trg) ) { cat("Y\n") } else { cat("N\n") }

    # These all fail with error 'argument is of length zero'
    # if ( grep("other", trg) ) { cat("Y\n") } else { cat("N\n") }
    # if ( grep("other", trg) == TRUE) { cat("Y\n") } else { cat("N\n") }
    # if ( grep("other", trg) == 1) { cat("Y\n") } else { cat("N\n") }

    # This says that the result is a numeric zero. Shouldn't I be able
    # to if on that, or at least compare it with a number?
    grep("other", trg)

    # I eventually decided this worked, but felt odd to me.
    if ( any(grep("other", trg)) ) { cat("Y\n") } else { cat("N\n") }

So, is the 'Wrap it in an any()' just normal R practice, and I'm too new to know it? Is there a more fundamental dumb move I'm making?

- Allen S. Rout
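For what it's worth, here is a minimal sketch of the usual ways to branch on a match, assuming the same `trg` vector as in the message above. `grep()` returns the positions of the matching elements, so "no match" comes back as a zero-length vector rather than the number 0, and `if` refuses a zero-length condition. Testing the length, or using `grepl()` (one logical per element), sidesteps that. The helper name `has_match` is made up for illustration.

```r
trg <- c("this", "that")

# grep() returns positions of matches; no match yields a zero-length vector.
hits <- grep("other", trg)
length(hits)

# Hypothetical helper: TRUE if any element of x matches pattern.
has_match <- function(pattern, x) {
  any(grepl(pattern, x))
}

# Either form gives if() a single TRUE or FALSE to work with.
if (has_match("other", trg)) cat("Y\n") else cat("N\n")
if (length(grep("this", trg)) > 0) cat("Y\n") else cat("N\n")
```

The first test prints N and the second prints Y. Wrapping the index vector in `any()` happens to work as well, but it relies on the indices being coerced to logical, which is why it feels odd.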
[R] Sort of data.frame yields a thing which is not a data.frame...
I've observed something I don't understand, and I was hoping someone could point me to the right section of docs.

There's a portion in one of my analyses in which I am wont to sort a data.frame so:

    seriesS <- seriesS[order(as.Date(row.names(seriesS), format="%m/%d/%Y")),]

So, I've got row.names which are textual representations of dates, I'd like to retain them as such, but order them datewise.

As long as seriesS has more than one column, this works nicely; seriesS afterwards is a data.frame with columns similar to those it had going in. But if I encounter a case with only one column, the result is _not_ a data frame, but instead an ? array? list?

I can solve this with a kluge:

    seriesS$stupid <- 0
    seriesS <- seriesS[order(as.Date(row.names(seriesS), format="%m/%d/%Y")),]
    seriesS <- seriesS[,-c(which(names(seriesS)=="stupid")) ]

but this mostly tells me I've failed to understand something about how the process should work. Any good references to the Chapter and Verse of the Canon of R I should hit?

- Allen S. Rout
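A small sketch of what is probably going on, using an invented one-column frame in place of the real `seriesS`: when a subscript leaves a single column, `[` on a data frame drops the result to a plain vector unless told otherwise. The `drop` argument is documented in `?"[.data.frame"`; passing `drop = FALSE` keeps the data frame and its row names.

```r
# One-column data frame whose row names are dates stored as text (made-up values).
seriesS <- data.frame(value = c(3, 1, 2),
                      row.names = c("03/15/2007", "01/02/2007", "02/10/2007"))

ord <- order(as.Date(row.names(seriesS), format = "%m/%d/%Y"))

as_vector <- seriesS[ord, ]                # dimension dropped: plain numeric vector
as_frame  <- seriesS[ord, , drop = FALSE]  # still a data.frame, row names kept

class(as_vector)
class(as_frame)
row.names(as_frame)
```

With `drop = FALSE` the same line works whether the frame has one column or twenty, so no throwaway column is needed.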
Re: [R] R usage for log analysis
Gabriel Diaz [EMAIL PROTECTED] writes:

> I'm taking an overview to the project documentation, and seems the database is the way to go to handle log files of GB order (normally between 2 and 4 GB each 15 day dump). In this document http://cran.r-project.org/doc/manuals/R-data.html, says R will load all data into memory to process it when using read.table and such. Using a database will do the same? Well, currently i have no machine with 2 GB of memory.

Remember, swap too. This means you're using more time, not running into a hard limit.

If you're concerned about gross size, then preprocessing could be useful; but consider: RAM is cheap. Calibrate RAM purchases w.r.t. hours of your coding time, -before- you start the project. Then you can at least mutter to yourself when you waste more than the cost of core trying to make the problem small. :)

It's entirely reasonable to do all your development work on a smaller set, and then dump the real data into it and go home. Unless you've got something O(N^2) or so, you should be fine.

- Allen S. Rout
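One way to get "a smaller set" while developing is to cap the number of rows read. A rough sketch, assuming a comma-separated log; the sample lines, the column names, and the helper `read_log` are all invented for illustration, and the text connection stands in for a real file.

```r
# Invented sample standing in for a large log file on disk.
log_lines <- c("2007-01-01,12:00:01,GET,200",
               "2007-01-01,12:00:02,GET,404",
               "2007-01-01,12:00:05,POST,200",
               "2007-01-02,08:30:00,GET,200")

# While developing, cap the rows read with nrows; n = -1 reads everything.
read_log <- function(con, n = -1) {
  read.table(con, sep = ",", nrows = n, stringsAsFactors = FALSE,
             col.names = c("dte", "time", "method", "status"))
}

small <- read_log(textConnection(log_lines), n = 2)  # development subset
full  <- read_log(textConnection(log_lines))         # the real run
nrow(small)
nrow(full)
```

The analysis code is written and debugged against `small`; the only change for the full run is dropping the cap.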
Re: [R] R usage for log analysis
Gabriel Diaz [EMAIL PROTECTED] writes:

> and what is the correct path to do it? I mean, put logs files in a mysql or something like that, and then make R use that data, using the data from the files directly?

I haven't stuck anything in a DB yet. I'm not sure how much of the DB clue is used under the covers.

> pre-parse the log files to accommodate them to R?

Probably not; a little familiarity with the reading functions will obviate most needs to pre-parse.

> I need faqs, manuals, books, whatever to learn about this, can anyone give some advice?

[...]

Don't expect a warm welcome. This community is like all open-source communities, sharply focused on its own concerns and expertise. And, in an unusual experience for computer types, our core competencies hold little or no sway here; they don't even give us much of a leg up. Just wait 'til you want to do something nutso like produce a business graphic. :)

I'm working on understanding enough of R packaging and documentation to begin a 'task view' focused on systems administration, for humble submission. That might end up being mostly log analysis; the term can describe much of what we do, if it's stretched a bit. I'm hoping the task view will attract the teeming masses of sysadmins trapped in the mire of Gnuplot and friends.

For starters, become familiar with read.table(); with a few variations it will take care of all the

    while (<>) { @blah = split(/,/); etc. etc. etc. }

you've been accustomed to doing.

Name columns; this makes it easier to think about your data.

    names(my_data) <- c("column","names","can","be","assigned","to")

Start thinking of your data in generic sets, as opposed to specific rows. Situations which required iteration over specific rows in PERL-land fall neatly to precise assignment in R. For example, if you've got records with dates and times and you want to work with time structures: in PERL you'd

    foreach (...) { $foo->{pdate} = parsedate($foo->{date}." ".$foo->{time}) }

or some such. In R-land, the iteration is implicit. Here's a snippet from something I'm using

    a$pdate <- as.POSIXct(paste(format(a$dte, "%Y/%m/%d"), a$time))

You're really acting on logical columns all at once here. This is fantastically more efficient in terms of your thought processes.

- Allen S. Rout
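To make the column-at-a-time idea concrete, here is a self-contained sketch that goes from raw text to a timestamp column. The column names `dte` and `time` follow the snippet above; the sample records, the `host` column, and the choice of `tz = "UTC"` are assumptions made only so the example runs on its own.

```r
# Invented sample log: date, time, host.
raw <- "01/15/2007,08:30:00,alpha
01/15/2007,09:45:10,beta
01/16/2007,23:59:59,alpha"

# read.table() does the splitting that the PERL loop would have done.
a <- read.table(text = raw, sep = ",", stringsAsFactors = FALSE)
names(a) <- c("dte", "time", "host")

# Whole-column operations: no explicit loop over rows.
a$dte   <- as.Date(a$dte, format = "%m/%d/%Y")
a$pdate <- as.POSIXct(paste(format(a$dte, "%Y/%m/%d"), a$time), tz = "UTC")

str(a)
```

Each assignment touches every row at once, which is the R counterpart of the `foreach` loop.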
[R] Curve fitting tutorial / clue stick?
Working through the R archives and webspace, I've mostly proved to myself that I don't know enough about what statisticians call Curve Fitting to even begin translating the basics.

I'm a sysadmin, and have collected a variety of measurements of my systems, and I can draw pretty pictures in R showing what has happened. People are happy, customers feel empowered. Whee!

Now, I want to take my corpus of data and make a prediction based on it; in statistics-moron speak, I want to draw a line or a simple curve across my extant graph, and figure out where the predictive curve passes threshold 'T', and then graph that too.

I thought I'd be telling R something like:

- I think this is exponential. Here's the data. Give me the best function you can come up with, and tell me how good the fit is.

- I think this is quadratic. Here's the data. Give me the best function you can come up with, and tell me how good the fit is.

Can someone point me at a spot in the docs which might be suitable for my level of ignorance?

- Allen S. Rout
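As a starting point, the two requests above map fairly directly onto `lm()`. This is only a sketch on synthetic data: the growth rate, the noise, the threshold value, and the names `T_limit`, `predicted`, and `plot_forecast` are all invented. The quadratic is fitted by least squares on `day` and `day^2`; the exponential is fitted as a straight line in `log(usage)`, which is one common shortcut and not the only way to fit an exponential (`nls()` is another). `uniroot()` then finds where the fitted curve crosses the threshold.

```r
# Synthetic measurements: day number and a usage figure growing about 8% a day.
set.seed(1)
day   <- 1:60
usage <- 100 * exp(0.08 * day) * exp(rnorm(60, sd = 0.05))

# "I think this is quadratic": least squares on day and day^2.
quad_fit <- lm(usage ~ day + I(day^2))

# "I think this is exponential": a straight line in log(usage).
exp_fit <- lm(log(usage) ~ day)

# R-squared as a rough "how good is the fit" figure, each on its own scale.
summary(quad_fit)$r.squared
summary(exp_fit)$r.squared

# Where does the exponential prediction cross the threshold?
T_limit   <- 50000
predicted <- function(d) exp(predict(exp_fit, newdata = data.frame(day = d)))
crossing  <- uniroot(function(d) predicted(d) - T_limit,
                     interval = c(1, 365))$root

# Draw the data, the fitted curve, and the threshold crossing.
plot_forecast <- function() {
  plot(day, usage, xlim = c(1, ceiling(crossing)), ylim = c(0, T_limit))
  curve(predicted(x), add = TRUE)
  abline(h = T_limit, lty = 2)
  abline(v = crossing, lty = 3)
}
```

Calling `plot_forecast()` draws the picture. Note that the two R-squared values are not directly comparable, since one model is fitted to `usage` and the other to `log(usage)`; `summary(exp_fit)` and `?lm` give the fuller story.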
[R] Dirty Rotten Hack. (reversing tickmarks on axes?)
I feel dirty.

I have some graphs I'm building to communicate chargeback rates and service usage for our backup system here at the University of Florida. These come down to daily data points on a graph of number-of-bytes transferred and stored. Since we charge back on the same basis (price per MB this, price per KB that), the same chart with a different scale can be used to communicate bytes and dollars.

I set about trying to accomplish this like so:

http://nersp.nerdc.ufl.edu/~asr/media/r-foo/try1.png

Those axes are a little messy. I tried nudging them around

http://nersp.nerdc.ufl.edu/~asr/media/r-foo/try2.png

which is better but not good. What I really want to do is tell my axis() function to reverse the tick direction: put your ticks and labels inside the graph. Something like

http://nersp.nerdc.ufl.edu/~asr/media/r-foo/dirty-hack.png

which I accomplished by telling axis() 'line=-47.7'. Eugh. Note that the distance from the left side to the right is different from the distance from the right side to the left. :) I don't particularly object to this; when you abuse a tool in this manner you need to expect oddities.

I've wandered through the mailing list logs, and haven't seen reference to this particular desire. Am I alone? :)

- Allen S. Rout
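A possible alternative to the negative `line=` trick, sketched under assumptions: `axis()` accepts the graphical parameters `tcl` and `mgp`, where a positive `tcl` points the ticks into the plot region and a negative second element of `mgp` moves the labels inward as well. The function name, the argument names, and the exact offsets below are invented and would need tuning for a real chart.

```r
# Hypothetical helper: bytes on the left axis, dollars on the right,
# with the right-hand ticks and labels drawn inside the plot region.
plot_bytes_and_dollars <- function(bytes, dollars_per_byte) {
  plot(seq_along(bytes), bytes, type = "l", xlab = "Day", ylab = "Bytes")

  # Choose round dollar values, then convert them back to byte positions.
  dollars <- pretty(range(bytes) * dollars_per_byte)
  axis(side = 4,
       at = dollars / dollars_per_byte,
       labels = dollars,
       tcl = 0.5,           # positive tick length: ticks point inward
       mgp = c(3, -2, 0))   # negative label offset: labels sit inside too
}
```

For example, `plot_bytes_and_dollars(cumsum(runif(30, 1e9, 5e9)), 2e-9)` draws thirty days of made-up transfer totals with a dollar scale on the inside of the right edge; the rate is likewise made up. Because the dollar ticks are computed from the byte range, the two scales stay in step without hand-tuned offsets.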