Re: [R] own TAB expansion
On Fri, Oct 8, 2010 at 6:19 AM, Sebastian Gibb li...@sebastiangibb.de wrote:
> Hello Duncan,
>
> thanks for your advice, but it doesn't work as expected:
>
> setClass(Class="A", representation=representation(slotA="numeric", slotB="numeric"));
> setMethod("$", "A", function(x, name) {return(slot(x, name));})
> setGeneric(".DollarNames")
> setMethod(".DollarNames", signature(x="A"), function(x, pattern) grep(pattern=pattern, x=c("slotA", "slotB"), value=T))
> a <- new("A", slotA=1, slotB=2)
> a$sl<TAB> # doesn't print slotA/slotB
> a$
>
> What am I doing wrong?

There is a namespace issue with making .DollarNames() generic; basically, the completion code in the utils namespace never sees the new S4 generic. See a previous discussion at

http://www.mail-archive.com/r-de...@r-project.org/msg20553.html

Defining an S3 method should work (without the need for a dummy S3 class, even with inheritance, if you are working with R 2.12):

  .DollarNames.A <- function(x, pattern) {
      grep(pattern=pattern, x=c("slotA", "slotB"), value=T)
  }

-Deepayan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
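For readers who want to try the S3 route end to end, here is a minimal sketch. It assumes a reasonably recent R (S3 dispatch on S4 instances, as mentioned above for R 2.12), and it uses slotNames() in place of the hard-coded slot names, which is a variation and not what was posted in the thread.

```r
library(methods)

# S4 class with two numeric slots, as in the original question.
setClass("A", representation(slotA = "numeric", slotB = "numeric"))

# Let `$` read slots, so a$slotA works.
setMethod("$", "A", function(x, name) slot(x, name))

# Plain S3 method for the utils generic; no setGeneric() involved.
# slotNames(x) is an assumption made here to avoid repeating the names.
.DollarNames.A <- function(x, pattern = "") {
  grep(pattern, slotNames(x), value = TRUE)
}

a <- new("A", slotA = 1, slotB = 2)

# This is the call the completion code makes when you type a$sl<TAB>.
utils::.DollarNames(a, "^sl")
```

Because the method is an ordinary function named .DollarNames.A, the generic in the utils namespace can find it through S3 dispatch, which is the point of the advice above.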
Re: [R] point characters THICKER in xyplot()
On Fri, Oct 8, 2010 at 5:19 PM, array chip arrayprof...@yahoo.com wrote:
> Hi, how can I make the point characters thicker (NOT larger) in xyplot when the groups= argument is used?
>
> dat<-data.frame(x=1:100,y=1:100,group=rep(LETTERS[1:5],each=20))
>
> ### lwd=2 doesn't work here
> xyplot(y~x,groups=group,data=dat,col=1:4,pch=1:4,lwd=2)
>
> ### lwd=2 works with panel.points(), but grouping is messed up!
> xyplot(y~x,groups=group,data=dat,col=1:4,pch=1:4,
>        panel=function(...) {panel.points(...,lwd=2)})
>
> ### group is correct with panel.superpose(), but lwd=2 doesn't work!
> xyplot(y~x,groups=group,data=dat,col=1:4,pch=1:4,
>        panel=function(...) {panel.superpose(...,lwd=2)})
>
> Any suggestions?

  xyplot(y~x,groups=group,data=dat,col=1:4,pch=1:4,lwd=2,
         panel = panel.superpose, panel.groups = panel.points)

panel.xyplot() should also honor lwd at some point (but I haven't gotten around to it yet).

-Deepayan
Re: [R] how to retrieve user coordinates in xyplot
On Fri, Oct 8, 2010 at 3:52 PM, array chip arrayprof...@yahoo.com wrote:
> Hi, is there a way to retrieve the extremes of the user coordinates of the plotting region, like what par("usr") does in general graphics? I'd like to use them to print additional text at certain places inside each panel.
>
> Thanks

  ?current.panel.limits

-Deepayan
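A short sketch of how current.panel.limits() might be used inside a panel function to place text relative to the panel edges. The data frame, the label text and its position are made up for illustration only.

```r
library(lattice)

# Illustrative data, not from the original post.
dat <- data.frame(x = 1:100, y = 1:100,
                  group = rep(LETTERS[1:5], each = 20))

p <- xyplot(y ~ x | group, data = dat,
            panel = function(x, y, ...) {
              panel.xyplot(x, y, ...)
              # xlim and ylim of the current panel, in native (user) units
              lim <- current.panel.limits()
              # put a label just inside the top-left corner
              panel.text(lim$xlim[1], lim$ylim[2], "top-left",
                         adj = c(0, 1))
            })

# print(p) draws the plot on the active device
```

current.panel.limits() only makes sense while a panel is being drawn, which is why the call sits inside the panel function.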
Re: [R] R: Why this deosn't work?, matrix, rounding error?
On Fri, 2010-10-08 at 09:30 -0700, skan wrote:
> It's a much bigger problem. I use a matrix to store the results of a bigger problem. I loop through several variables and store the results of a computation in that matrix. At the beginning of the problem I initialize the matrix to zeros and I calculate its size from some input. And that seems not to work well, maybe because of some rounding error.

Several people have responded with a solution to your Q on stackoverflow:

  matrix(0, ncota*nslope, 4)

as the 0 will get recycled to the appropriate length.

G

--
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
Re: [R] font question on pdf device
On Fri, 2010-10-08 at 14:19 +0100, Ted Harding wrote:
> On 08-Oct-10 12:44:12, Kari Ruohonen wrote:
>> Hi, I wonder if this is something on my machine locally or R in general. When I do the following:
>>
>> plot(c(0,1),c(0,1),main=expression(paste(symbol("D"),"D",sep="")))
>>
>> I get a plot with a title having uppercase delta followed by D. But in the following
>>
>> pdf(file="deltaTest.pdf")
>> plot(c(0,1),c(0,1),main=expression(paste(symbol("D"),"D",sep="")))
>> dev.off()
>>
>> the uppercase delta looks like O with overstrike slash, i.e. Ø.
>
> <snip>
>
>> [1] stats graphics grDevices utils datasets methods base
>
> which is the same as yours (except that I'm using a slightly earlier version of R, and on i486 rather than x86_64. Debian Etch by the way).
>
> Ted.
>
> E-Mail: (Ted Harding) ted.hard...@wlandres.net
> Fax-to-email: +44 (0)870 094 0861
> Date: 08-Oct-10 Time: 14:19:48
> -- XFMail --

Hi, and thanks for the suggestions. Based on these I installed acroread and found that when viewed with acroread the Delta in the pdf file prints out OK, but when viewed with evince, the document viewer, I get the error. So it seems not to be an R issue at all. I am running 64-bit Ubuntu 9.10, for those who are interested in testing this.

Many thanks for all help.

Kari
Re: [R] R: Why this deosn't work?, matrix, rounding error?
Hello,

I've seen the answer at stackoverflow. They also said I must use zapsmall to avoid rounding problems. I didn't expect this behaviour when the division gives an integer number.

--
View this message in context: http://r.789695.n4.nabble.com/R-Why-this-deosn-t-work-matrix-rounding-error-tp2968527p2969459.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] R: Why this deosn't work?, matrix, rounding error?
On 2010-10-09 4:47, skan wrote:
> Hello
>
> I've seen the answer at stackoverflow. They also said I must use zapsmall to avoid roundup problems. I didn't expect this behaviour when division gives an integer number.

The trouble is that your expectations may not coincide with reality. That's why people refer you to FAQ 7.31.

Even replacing the rep(0, ) with just 0 will not necessarily give the expected result:

  eps1 <- 1e-16
  eps2 <- 1e-15

  ## try to generate a 3-by-4 matrix:
  matrix(0, nrow = 3 - eps1, ncol = 4)
  #     [,1] [,2] [,3] [,4]
  #[1,]    0    0    0    0
  #[2,]    0    0    0    0
  #[3,]    0    0    0    0

  matrix(0, nrow = 3 - eps2, ncol = 4)
  #     [,1] [,2] [,3] [,4]
  #[1,]    0    0    0    0
  #[2,]    0    0    0    0

  matrix(0, nrow = zapsmall(3 - eps2), ncol = 4)
  #     [,1] [,2] [,3] [,4]
  #[1,]    0    0    0    0
  #[2,]    0    0    0    0
  #[3,]    0    0    0    0

  ## Note that your calculation did _not_ yield an integer:
  1 + ((1.5 - 0.1) / 0.05) - 29
  #[1] -3.552714e-15

Such are the vagaries of floating-point arithmetic. Play it safe; use zapsmall.

-Peter Ehlers
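As a complement to the zapsmall() advice, the following sketch rounds the computed dimension explicitly before it is used as a row count. The name ncota is borrowed from the thread; the numbers are the ones in the calculation quoted above, and round() is offered here as an alternative, not as what the posters recommended.

```r
# Dimension computed from floating-point input, as in the thread.
ncota <- 1 + (1.5 - 0.1) / 0.05

# The value is only approximately 29, so using it directly as nrow
# would truncate towards the next lower integer.
ncota - 29

# Round to the nearest whole number and store it as an integer
# before using it as a dimension.
n <- as.integer(round(ncota))
m <- matrix(0, nrow = n, ncol = 4)
dim(m)
```

Rounding once, at the point where a count is derived from real-valued input, keeps the rest of the code free of floating-point surprises.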
[R] A competition to create a recommendation engine for R packages
Hello everyone.

There is a new competition, outlined on the blog dataists (http://www.dataists.com/2010/10/using-data-tools-to-find-data-tools-the-yo-dawg-of-data-hacking/), inviting us to analyse statistics on the use of R packages (collected from 52 R users) in order to create an R-package suggestion engine for ourselves.

Since I noticed that several bloggers have already written about it (as I have detailed here: http://www.r-statistics.com/2010/10/a-competition-to-recommend-relevant-r-packages-and-the-future-of-r/), I thought it fitting to notify the members of the R-help mailing list as well.

Best,
Tal

Contact Details:
Contact me: tal.gal...@gmail.com | 972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English)

[[alternative HTML version deleted]]
[R] Hausman test for endogeneity
Dear folks,

can anybody point me in the right direction on how to conduct a Hausman test for endogeneity in simultaneous equation models?

Best,
Holger

--
View this message in context: http://r.789695.n4.nabble.com/Hausman-test-for-endogeneity-tp2969522p2969522.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Package for converting R datasets into SQL Server (create table and insert statements)?
Thanks, but I'm not looking for a function to save dataframes into an RDBMS. I'm looking for a function which creates CREATE TABLE and INSERT statements from a dataframe.

-J

2010/10/5 Eric Lecoutre ericlecou...@gmail.com:
> Hi,
>
> You can have a look at RODBC and its function sqlSave.
>
> HTH,
> Eric
>
> 2010/10/3 johannes rara johannesr...@gmail.com
>> Hi,
>>
>> R contains many good datasets which would be valuable on other platforms as well. My intention is to use R datasets on SQL Server as sample tables. Is there a package that would do automatic conversion from the dataset schema into a SQL Server CREATE TABLE statement (and INSERT INTO statements)?
>>
>> For example,
>>
>> str(cars)
>> 'data.frame': 50 obs. of 2 variables:
>>  $ speed: num 4 4 7 7 8 9 10 10 10 11 ...
>>  $ dist : num 2 10 4 22 16 10 18 26 34 17 ...
>>
>> would become
>>
>> create table dbo.cars (
>>   id int identity(1,1) not null,
>>   speed int not null,
>>   dist int not null,
>>   constraint PK_id primary key clustered (id ASC) on [PRIMARY]
>> )
>>
>> insert into dbo.cars values (N'4', N'2'), (N'4', N'10'), (N'7', N'4'), etc.
>>
>> -J
>
> --
> Eric Lecoutre
> Consultant - Business Decision
> Business Intelligence Customer Intelligence
Re: [R] Package for converting R datasets into SQL Server (create table and insert statements)?
On Sat, Oct 9, 2010 at 9:02 AM, johannes rara johannesr...@gmail.com wrote:
> Thanks, but I'm not looking for a function to save dataframes into a RDBMS. I'm looking for a function which creates CREATE TABLE and INSERT statements from a dataframe.

If the reason you want that is so you can manipulate R data frames in SQL, then the sqldf package does that. There are no create statements to issue and no insert statements to issue (although you can). The database is automatically created, the create and insert statements are automatically generated and executed, your SQL statement is run, the result is automatically retrieved and the database is automatically destroyed afterwards. You just specify a select or other SQL statement with the data frame name(s) replacing the table name(s). It works with built-in data frames that ship with R and with data frames you create yourself.

See http://sqldf.googlecode.com for more.

--
Statistics Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
Re: [R] Memory management in R
Hi David,

I am replying to you and to the other people who provided some insight into my problems with grepl. Well, at least we now know that the bug is reproducible. The sequence I am postprocessing is indeed a strange one, probably pathological to some extent; nevertheless, the problem of grepl crashing when a long (but not huge) chunk of repeated data is loaded has to be acknowledged.

Now, my problem is the following: given a potentially long string (or, before that, a sequence where every element has been generated via the hash function, algo='crc32', of the digest package), how can I, starting from an arbitrary position i along the list, calculate the shortest substring in the future of i (i.e. the interval i:end of the series) that has not occurred in the past of i (i.e. [1:i-1])?

Efficiency is not the main point here, I need to run this code only once to get what I need, but it cannot crash on a 2000-entry string.

Cheers
Lorenzo

On 10/09/2010 01:30 AM, David Winsemius wrote:
>> What puzzles me is that the list is not really long (less than 2000 entries) and I have not experienced the same problem even with longer lists.
>
> But maybe your loop terminated in them earlier?
>
> Someplace between 11*225 and 11*240 the grepping machine gives up:
>
> > eprs <- paste(rep("aa", 225), collapse="#")
> > grepl(eprs, eprs)
> [1] TRUE
>
> > eprs <- paste(rep("aa", 240), collapse="#")
> > grepl(eprs, eprs)
> Error in grepl(eprs, eprs) :
>   invalid regular expression 'aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#a
> In addition: Warning message:
> In grepl(eprs, eprs) : regcomp error: 'Out of memory'
>
> The complexity of the problem may depend on the distribution of values. You have a very skewed distribution, with the vast majority being the same value that appeared in your error message:
>
> > table(x)
> x
>  12653a6 202fbcc4 48bef8c3 4e084ddc 51f342a4 5d64d58a 78087f5e abddf3d1
>     1419      299        1        1        1        3        1        1
> ac76183b b955be36 c600173a e96f6bbd e9c56275
>        1       30        5        1        9
>
> And you have 1159 of them in one clump (which would seem to be somewhat improbable under a random null hypothesis):
>
> > max(rle(x)$lengths)
> [1] 1159
> > which(rle(x)$lengths == 1159)
> [1] 123
> > rle(x)$values[123]
> [1] 12653a6
>
> HTH (although I think it means you need to construct a different implementation strategy);
>
> David.
>
>> Many thanks
>> Lorenzo
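The question above can be approached without any regular expression by comparing the hash values as vector elements. The sketch below is one possible reading of the problem statement (the shortest run starting at position i that never appears as a contiguous run in x[1:(i-1)]); the function names occurs_in() and shortest_new_run() are hypothetical, and the toy vector stands in for the real data.

```r
# Does `needle` appear as a contiguous run somewhere inside `hay`?
occurs_in <- function(needle, hay) {
  k <- length(needle)
  n <- length(hay)
  if (k > n) return(FALSE)
  # candidate start positions: wherever the first element matches
  starts <- which(hay[seq_len(n - k + 1)] == needle[1])
  for (s in starts) {
    if (all(hay[s:(s + k - 1)] == needle)) return(TRUE)
  }
  FALSE
}

# Length of the shortest run starting at i that has not occurred
# in the past x[1:(i-1)]; NA if every run up to the end has occurred.
shortest_new_run <- function(x, i) {
  past <- x[seq_len(i - 1)]
  for (len in seq_len(length(x) - i + 1)) {
    if (!occurs_in(x[i:(i + len - 1)], past)) return(len)
  }
  NA_integer_
}

# Toy example with short tokens in place of crc32 hashes.
x <- c("a", "b", "a", "b", "c")
shortest_new_run(x, 3)
```

This is a brute-force search, which matches the stated priority (robustness over speed): it never builds a pattern string, so the regcomp memory limit shown in the quoted error does not come into play.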
Re: [R] Package for converting R datasets into SQL Server (create table and insert statements)?
Package RSQLite has a function, dbBuildTableDefinition, that creates the CREATE TABLE statement for a given data.frame. I think other db-related packages for MySQL and PostgreSQL also have such a function.

Michael
Re: [R] Package for converting R datasets into SQL Server (create table and insert statements)?
Have you considered the DBI and RODBC packages? I'm trying to do something like this myself right now, and a post of my own (to R-SIG-DB) produced recommendations for these two packages. Both have vignettes.

Hope this helps.
Spencer

On 10/9/2010 6:52 AM, Michael Bedward wrote:
> Package RSQLite has a dbBuildTableDefinition that creates the CREATE TABLE statement for a given a data.frame. I think other db related packages for MySQL and PostgreSQL also have such a function.
>
> Michael

--
Spencer Graves, PE, PhD
President and Chief Operating Officer
Structure Inspection and Monitoring, Inc.
751 Emerson Ct.
San José, CA 95126
ph: 408-655-4567
Re: [R] Package for converting R datasets into SQL Server (create table and insert statements)?
On Oct 9, 2010, at 9:02 AM, johannes rara wrote:
> Thanks, but I'm not looking for a function to save dataframes into a RDBMS. I'm looking for a function which creates CREATE TABLE and INSERT statements from a dataframe.

(My first comment is speculation that Eric was intending that you look at the _code_ of sqlSave rather than at its output. My reading of the code (at the console rather than the source) suggests that it is constructing the code and passing it to the external drivers.)

Looking at the web documentation linked from sqldf()'s help page, it appears that at least part of this could also be addressed by example 9 of the current full documentation:

http://code.google.com/p/sqldf/

BOD is a built-in dataframe:

  require(sqldf)
  ?sqldf
  # Portion of example 9:

  > sqldf("pragma table_info(BOD)")
    cid   name type notnull dflt_value pk
  1   0   Time REAL       0         NA  0
  2   1 demand REAL       0         NA  0

  > sqldf(c("select * from BOD", "select * from sqlite_master"))
     type name tbl_name rootpage
  1 table  BOD      BOD        2
                                                        sql
  1 CREATE TABLE `BOD` \n( Time REAL,\n\tdemand REAL \n)

There is integration with a variety of SQL databases, although the act of table creation may be limited to SQLite, since the primary advertised activity is SELECT statements and it does its access through the SQLite driver in memory ... at least as I understand it.

--
David.

David Winsemius, MD
West Hartford, CT
Re: [R] Possible Bug in Effects Package
On 2010-10-02 11:47, Luciano Selzer wrote:
> Dear List,
>
> I find the Effects package very useful, but I believe I have found a bug in the allEffects function. Please consider the following code:
>
> test <- data.frame(tries = round(runif(40, 5, 300)),
>                    tra = gl(4, 10, labels = c("V", "D", "C", "L")),
>                    prop = runif(40, 0, 1))
> test$success <- round(with(test, tries*prop))
> test$prop <- with(test, success/tries)
> model <- glm(cbind(success, tries) ~ -1 + tra, data = test, family = binomial)
> allEffects(model)
> #Error en eval(expr, envir, enclos) : objeto 'tra' no encontrado
> #[i.e. object 'tra' not found]
> model2 <- glm(prop ~ -1 + tra, weights = tries, data = test, family = binomial)
> allEffects(model2)
> #Works
>
> On a quick search on the internet I've found nothing about this. Is this a bug?

I think that this is indeed a bug, probably due to the use of the all.vars() function in effects:::analyze.model().

The obvious workaround is to specify your model as in model2 above or, if you want to use the matrix-response version, then give the matrix a name and use that in your model:

  respmat <- with(test, cbind(success, tries - success))
  ##[correcting your cbind]
  mod <- glm(respmat ~ )

-Peter Ehlers

> Thanks for your time
> Luciano
>
> [[alternative HTML version deleted]]
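To make the named-matrix workaround concrete, here is a self-contained sketch that stops short of calling the effects package. The seed and the right-hand side "-1 + tra" are filled in from the original poster's model as an assumption, since the formula in the reply above was left incomplete.

```r
set.seed(1)  # arbitrary seed, only for reproducibility of the sketch

test <- data.frame(tries = round(runif(40, 5, 300)),
                   tra   = gl(4, 10, labels = c("V", "D", "C", "L")),
                   prop  = runif(40, 0, 1))
test$success <- round(with(test, tries * prop))

# Two-column response: successes and failures (not successes and tries).
respmat <- with(test, cbind(success, tries - success))

# Right-hand side assumed to match the original model.
mod <- glm(respmat ~ -1 + tra, data = test, family = binomial)

# Coefficients are on the logit scale; plogis() maps them back
# to fitted proportions per level of tra.
plogis(coef(mod))

# For this one-way model they should agree with the pooled
# success proportions within each group.
with(test, tapply(success, tra, sum) / tapply(tries, tra, sum))
```

Naming the response matrix means the model formula contains a single variable on the left-hand side, which is what the suggested workaround relies on.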
[R] Unsubscribe me from mailing list
Please unsubscribe me from this list. Thank you.

Marine Biologist
Elasmobranch Bycatch Reduction Scientist
SharkDefense Technologies, LLC
(845) 702-7087
Re: [R] Memory management in R
On Oct 9, 2010, at 9:45 AM, Lorenzo Isella wrote:
> Hi David, I am replying to you and to the other people who provided some insight into my problems with grepl. Well, at least we now know that the bug is reproducible. Indeed it is a strange sequence the one I am postprocessing, probably pathological to some extent, nevertheless the problem is given by grepl crushing when a long (but not huge) chunk of repeated data is loaded has to be acknowledged. Now, my problem is the following: given a potentially long string (or before that a sequence, where every element has been generated via the hash function, algo='crc32' of the digest package), how can I, starting from an arbitrary position i along the list, calculate the shortest substring in the future of i (i.e. the interval i:end of the series) that has not occurred in the past of i (i.e. [1:i-1])?

Maybe you should work on a less convoluted explanation of the test? Or perhaps a couple of compact examples, preferably in R-copy-paste format?

> Efficiency is not the main point here, I need to run this code only once to get what I need, but it cannot crush on a 2000-entry string.

My suggestion is to explore other alternatives. (I will admit that I don't yet fully understand the test that you are applying.) The two that have occurred to me are Biostrings, which I have already mentioned, and rle(), which I have illustrated the use of but not referenced as an avenue.

The Biostrings package is part of Bioconductor (part of the R universe), although you should be prepared for a coffee break when you install it if you haven't got at least biocLite already installed. When I installed it last night, 54 other package dependencies were also downloaded and installed. It seems to me that taking advantage of the coding resources in the molecular biology domain that are currently directed at decoding the information storage mechanism of life might be a smart strategy. You have not described the domain you are working in, but I would guess that the digest package might be biological in primary application? So forgive me if I am preaching to the choir.

The rle option also occurred to me, but it might take a smarter coder than I to fully implement it. (But maybe Holtman would be up to it. He's a _lot_ smarter than I.) In your example the long x string is faithfully represented by two aligned vectors, each of length 197. The long repeat sequence that broke the grepl mechanism is just one pair of values.

  > rle(x)
  Run Length Encoding
    lengths: int [1:197] 1 1 2 1 1 4 1 9 1 1 ...
    values : chr [1:197] 5d64d58a ac76183b 202fbcc4 78087f5e ...

So maybe as soon as you got to a bundle that was greater than 1/2 the overall length (as happened in the x case) you could stop, since it could not have occurred before.

--
David.
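For readers unfamiliar with rle(), the following small sketch shows the structure it returns and how the longest run and its starting position can be located. The vector is a toy stand-in for the hash sequence discussed above.

```r
# Toy sequence with one dominant run.
x <- c("a", "b", "b", "c", "c", "c", "c", "b")

r <- rle(x)
r$lengths   # run lengths, one entry per run
r$values    # the value repeated in each run

# Index of the longest run, and where it starts in the original vector.
longest <- which.max(r$lengths)
start   <- sum(r$lengths[seq_len(longest - 1)]) + 1

# The encoding is lossless: inverse.rle() recovers the input.
identical(inverse.rle(r), x)
```

Working on the pair (lengths, values) keeps a long clump of repeated values as a single entry, which is the property the message above suggests exploiting.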
Re: [R] Unsubscribe me from mailing list
On Oct 9, 2010, at 10:45 AM, cr...@sharkdefense.com wrote:
> Please unsubscribe me from this list. Thank you.

You need to do that yourself ... none of us can do that for you. Log in and unsubscribe through the web page where you subscribed. You can also just leave yourself subscribed but turn off mailings or convert to once-daily digests.

https://stat.ethz.ch/mailman/listinfo/r-help

(At the bottom of the page.)

--
David Winsemius, MD
West Hartford, CT
Re: [R] Package for converting R datasets into SQL Server (create table and insert statements)?
Thanks Michael!

dbBuildTableDefinition is something I was looking for, but it does not seem to support SQL Server table definitions (CREATE TABLE statements may vary between different RDBMS).

Thanks anyway,
-J

2010/10/9 Michael Bedward michael.bedw...@gmail.com:
> Package RSQLite has a dbBuildTableDefinition that creates the CREATE TABLE statement for a given a data.frame. I think other db related packages for MySQL and PostgreSQL also have such a function.
>
> Michael
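Since none of the packages mentioned in this thread was confirmed to emit SQL Server flavoured statements, a hand-rolled generator is one option. The sketch below uses only base R; the function names (sql_type, sql_literal, make_create_table, make_inserts) and the R-to-SQL type mapping are assumptions made for illustration, and it does not add the identity column or primary key constraint shown in the original request.

```r
# Map an R column to a SQL Server column type (assumed mapping).
sql_type <- function(col) {
  if (is.integer(col)) {
    "int"
  } else if (is.numeric(col)) {
    "float"
  } else if (is.logical(col)) {
    "bit"
  } else {
    "nvarchar(255)"
  }
}

# Build a CREATE TABLE statement from the column names and types.
make_create_table <- function(df, table) {
  cols <- sprintf("  [%s] %s", names(df), vapply(df, sql_type, character(1)))
  paste0("create table ", table, " (\n", paste(cols, collapse = ",\n"), "\n)")
}

# Turn one column into SQL literals; strings are N'...' with quotes doubled.
sql_literal <- function(col) {
  out <- if (is.logical(col)) {
    as.character(as.integer(col))
  } else if (is.numeric(col)) {
    as.character(col)
  } else {
    paste0("N'", gsub("'", "''", as.character(col), fixed = TRUE), "'")
  }
  out[is.na(col)] <- "null"
  out
}

# One INSERT statement per row of the data frame.
make_inserts <- function(df, table) {
  vals <- do.call(paste, c(unname(lapply(df, sql_literal)), sep = ", "))
  cols <- paste(sprintf("[%s]", names(df)), collapse = ", ")
  sprintf("insert into %s (%s) values (%s);", table, cols, vals)
}

cat(make_create_table(cars, "dbo.cars"), "\n")
cat(head(make_inserts(cars, "dbo.cars"), 3), sep = "\n")
```

Emitting one statement per row is the simplest form to check by eye; batching rows into a single multi-row insert, as in the original example, would be a straightforward variation.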
Re: [R] Possible Bug in Effects Package
Dear Peter and Luciano,

I agree that this is a bug, and I'll try to fix it as soon as I have a chance -- probably the week after next.

I was rather surprised that effect() works in a model without a constant, but it does seem to:

    model2 <- glm(prop ~ -1 + tra, weights = tries, data = test, family = binomial)
    allEffects(model2)
     model: prop ~ -1 + tra

     tra effect
    tra
            V         D         C         L
    0.4129073 0.4731815 0.5454545 0.4548451

    model3 <- glm(prop ~ tra, weights = tries, data = test, family = binomial)
    allEffects(model3)
     model: prop ~ tra

     tra effect
    tra
            V         D         C         L
    0.4129073 0.4731815 0.5454545 0.4548451

    1/(1 + exp(-coef(model2)))
         traV      traD      traC      traL
    0.4129073 0.4731815 0.5454545 0.4548451

I expect that this is peculiar to the one-way classification, and that effect() will not work in general for a model without a constant (which will violate marginality).

Thanks for bringing the problem to my attention. I'm afraid that I've been so busy this fall that I've been unable to monitor the r-help list.

John

John Fox
Senator William McMaster Professor of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
web: socserv.mcmaster.ca/jfox

-Original Message-
From: Peter Ehlers [mailto:ehl...@ucalgary.ca]
Sent: October-09-10 10:20 AM
To: Luciano Selzer
Cc: r-help@r-project.org; John Fox
Subject: Re: [R] Possible Bug in Effects Package

> On 2010-10-02 11:47, Luciano Selzer wrote:
> > Dear List,
> > I find the effects package very useful, but I believe I have found a bug in the allEffects function. Please consider the following code:
> >
> >     test <- data.frame(tries = round(runif(40, 5, 300)),
> >                        tra = gl(4, 10, labels = c("V", "D", "C", "L")),
> >                        prop = runif(40, 0, 1))
> >     test$success <- round(with(test, tries*prop))
> >     test$prop <- with(test, success/tries)
> >     model <- glm(cbind(success, tries) ~ -1 + tra, data = test, family = binomial)
> >     allEffects(model)
> >     # Error in eval(expr, envir, enclos) : object 'tra' not found
> >     model2 <- glm(prop ~ -1 + tra, weights = tries, data = test, family = binomial)
> >     allEffects(model2)  # Works
> >
> > A quick search on the internet turned up nothing about this. Is this a bug?
>
> I think that this is indeed a bug, probably due to the use of the all.vars() function in effects:::analyze.model().
>
> The obvious workaround is to specify your model as in model2 above or, if you want to use the matrix-response version, then give the matrix a name and use that in your model:
>
>     respmat <- with(test, cbind(success, tries - success))  ## [correcting your cbind]
>     mod <- glm(respmat ~ ...)
>
> -Peter Ehlers
>
> > Thanks for your time
> > Luciano
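As a quick check of the workaround Peter describes, the following sketch fits the named-matrix form and the proportion-with-weights form on simulated data and compares the coefficients. It uses base R only; the object names mod_matrix and mod_prop are assumptions for illustration, and the call to allEffects() is left as a comment because it needs the effects package.

```r
set.seed(1)
test <- data.frame(tries = round(runif(40, 5, 300)),
                   tra = gl(4, 10, labels = c("V", "D", "C", "L")),
                   prop = runif(40, 0, 1))
test$success <- round(with(test, tries * prop))
test$prop <- with(test, success / tries)

# Named response matrix: successes and failures (not successes and tries)
respmat <- with(test, cbind(success, tries - success))
mod_matrix <- glm(respmat ~ -1 + tra, data = test, family = binomial)

# Equivalent specification: observed proportion weighted by number of tries
mod_prop <- glm(prop ~ -1 + tra, weights = tries, data = test, family = binomial)

# Fitted probabilities per level of tra, on the response scale
fitted_probs <- plogis(coef(mod_matrix))

# With the effects package installed one could then try:
# effects::allEffects(mod_matrix)
```

Both specifications describe the same binomial likelihood, so the coefficients should agree up to numerical tolerance.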
[R] StrSplit
Newbie question ...

I am looking for something equivalent to read.delim but which accepts a text line as a parameter instead of a file input.

Below is my problem. I'm unable to get the exact output, which is a simple data frame of the data split at the delimiter ... coming quite close though.

I have a character vector with 10 lines called MF_Data:

    MF_Data[1:10]
     [1] "Scheme Code;Scheme Name;Net Asset Value;Repurchase Price;Sale Price;Date"
     [2] ""
     [3] "Open Ended Schemes ( Liquid )"
     [4] ""
     [5] ""
     [6] "AIG Global Investment Group Mutual Fund"
     [7] "106506;AIG India Liquid Fund-Institutional Plan-Daily Dividend Option;1001.;1001.;1001.;02-Oct-2010"
     [8] "106511;AIG India Liquid Fund-Institutional Plan-Growth Option;1210.4612;1210.4612;1210.4612;02-Oct-2010"
     [9] "106507;AIG India Liquid Fund-Institutional Plan-Weekly Dividend Option;1001.8765;1001.8765;1001.8765;02-Oct-2010"
    [10] "106503;AIG India Liquid Fund-Retail Plan-DailyDividend Option;1001.;1001.;1001.;02-Oct-2010"

The lines from [7] onwards are delimited by ";". I am using:

    tempTxt <- MF_Data[7]
    MF_Data_F <- unlist(strsplit(tempTxt, ";", fixed = TRUE))
    tempTxt <- MF_Data[8]
    MF_Data_F1 <- unlist(strsplit(tempTxt, ";", fixed = TRUE))
    MF_Data_F <- rbind(MF_Data_F, MF_Data_F1)

But MF_Data_F is not a simple 2x6 data frame, which is what I want.
Re: [R] Hausman test for endogeneity
On Saturday 09 October 2010 14:37:35 Holger Steinmetz wrote:
> Dear folks,
> can anybody point me in the right direction on how to conduct a Hausman test for endogeneity in simultaneous equation models?
> Best, Holger

hausman.systemfit [1] should be what you are looking for.

Cheers
Giuseppe

[1] http://cran.r-project.org/web/packages/systemfit/systemfit.pdf
Re: [R] StrSplit
Is this what you are after:

    x <- c("Scheme Code;Scheme Name;Net Asset Value;Repurchase Price;Sale Price;Date"
      , ""
      , "Open Ended Schemes ( Liquid )"
      , ""
      , ""
      , "AIG Global Investment Group Mutual Fund"
      , "106506;AIG India Liquid Fund-Institutional Plan-Daily Dividend Option;1001.;1001.;1001.;02-Oct-2010"
      , "106511;AIG India Liquid Fund-Institutional Plan-Growth Option;1210.4612;1210.4612;1210.4612;02-Oct-2010"
      , "106507;AIG India Liquid Fund-Institutional Plan-Weekly Dividend Option;1001.8765;1001.8765;1001.8765;02-Oct-2010"
      , "106503;AIG India Liquid Fund-Retail Plan-DailyDividend Option;1001.;1001.;1001.;02-Oct-2010")
    myData <- read.table(textConnection(x[7:10]), sep=';')
    closeAllConnections()
    str(myData)
    'data.frame':   4 obs. of  6 variables:
     $ V1: int  106506 106511 106507 106503
     $ V2: Factor w/ 4 levels "AIG India Liquid Fund-Institutional Plan-Daily Dividend Option",..: 1 2 3 4
     $ V3: num  1001 1210 1002 1001
     $ V4: num  1001 1210 1002 1001
     $ V5: num  1001 1210 1002 1001
     $ V6: Factor w/ 1 level "02-Oct-2010": 1 1 1 1
    myData
          V1                                                              V2       V3       V4       V5          V6
    1 106506  AIG India Liquid Fund-Institutional Plan-Daily Dividend Option 1001.000 1001.000 1001.000 02-Oct-2010
    2 106511          AIG India Liquid Fund-Institutional Plan-Growth Option 1210.461 1210.461 1210.461 02-Oct-2010
    3 106507 AIG India Liquid Fund-Institutional Plan-Weekly Dividend Option 1001.876 1001.876 1001.876 02-Oct-2010
    4 106503          AIG India Liquid Fund-Retail Plan-DailyDividend Option 1001.000 1001.000 1001.000 02-Oct-2010

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem that you are trying to solve?
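A variant of the same idea, sketched under the assumption of a more recent R (2.14.0 or later), where read.table() accepts the lines directly through its text argument so no connection has to be opened or closed. The sample lines are shortened, made-up stand-ins for the fund data, and the regular expression used to pick the data rows is likewise an assumption about their shape.

```r
# Made-up, shortened stand-ins for the lines of the fund file
lines <- c("Scheme Code;Scheme Name;Net Asset Value;Date",
           "",
           "Open Ended Schemes ( Liquid )",
           "101;Fund A - Growth Option;1210.4612;02-Oct-2010",
           "102;Fund B - Daily Dividend;1001.8765;02-Oct-2010")

# Keep only the rows that start with a numeric scheme code
keep <- grepl("^[0-9]+;", lines)

# Parse them in one call; the header line supplies the column names
funds <- read.table(text = lines[keep], sep = ";", quote = "",
                    stringsAsFactors = FALSE,
                    col.names = strsplit(lines[1], ";", fixed = TRUE)[[1]])
```

Because check.names is TRUE by default, the column names come out as Scheme.Code, Scheme.Name and so on.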
Re: [R] Counting unique items in a list of matrices
On 2010-10-07 10:10, Jim Silverton wrote:
> Hello,
> I have a list of 2 x 2 matrices called matlist. I have about 5000 2 x 2 matrices. I would like to count how many of each unique 2 x 2 matrix I have. So I am thinking that I need a list of the unique 2 x 2 matrices and their counts. Can anyone help?

Here's one way, using the plyr package:

    require(plyr)

    ## make a list of 2X2 matrices
    L <- vector('list', 5000)
    set.seed(4321)
    for(i in 1:5000) L[[i]] <- matrix(round(runif(4), 1), 2, 2)

    ## convert each matrix to a string of 4 numbers, then
    ## form dataframe
    dL <- ldply(L, function(.x) toString(unlist(.x)))

    ## add an index vector
    dL$ind <- seq_len(5000)

    ## count unique strings; return string, frequency, indices
    result <- ddply(dL, .(V1), summarize, freq=length(V1), idx=toString(ind))

-Peter Ehlers
Re: [R] StrSplit
Jim's solution is the ideal way to read in the data: using the sep=";" argument in read.table.

However, if you do for some reason have a vector of strings like the following (maybe someone gives you an Rdata file instead of the raw data file):

    MF_Data <- c("106506;AIG India Liquid Fund-Institutional Plan-Daily Dividend Option;1001.;1001.;1001.;02-Oct-2010",
                 "106511;AIG India Liquid Fund-Institutional Plan-Growth Option;1210.4612;1210.4612;1210.4612;02-Oct-2010")

Then you can use this to get a data frame:

    as.data.frame(do.call(rbind, lapply(MF_Data, function(x) unlist(strsplit(x, ';')))))

Cheers,
Jeff.
Re: [R] StrSplit
On Oct 9, 2010, at 12:46 PM, Jeffrey Spies wrote:
> Jim's solution is the ideal way to read in the data: using the sep=";" argument in read.table.
>
> However, if you do for some reason have a vector of strings like the following (maybe someone gives you an Rdata file instead of the raw data file):
>
>     MF_Data <- c("106506;AIG India Liquid Fund-Institutional Plan-Daily Dividend Option;1001.;1001.;1001.;02-Oct-2010",
>                  "106511;AIG India Liquid Fund-Institutional Plan-Growth Option;1210.4612;1210.4612;1210.4612;02-Oct-2010")
>
> Then you can use this to get a data frame:
>
>     as.data.frame(do.call(rbind, lapply(MF_Data, function(x) unlist(strsplit(x, ';')))))

If you are suggesting that Jim's solution would not work here, then I would disagree and suggest you try offering your vector (without the line breaks inserted by our mail clients) to his code. It should work just fine and be far more readable.

On the other hand, if you were offering this with an explanation that strsplit's split argument is more flexible than the sep argument in the read functions, because it accepts regular expressions and so can handle situations where multiple separators exist in the same line, then I would applaud you.

--
David.
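To illustrate David's point about regular expressions, here is a small sketch in which the same line mixes several separators. The input strings are made up for the example; only strsplit(), do.call() and rbind() from base R are used.

```r
# Made-up records that mix ';', ',' and '|' as field separators
mixed <- c("106506;Fund A|1001.5,02-Oct-2010",
           "106511,Fund B;1210.4612|02-Oct-2010")

# A character class matches any one of the three separators;
# sep= in read.table() only accepts a single character
parts <- strsplit(mixed, "[;,|]")

# Bind the pieces row-wise and convert the numeric column
fields <- as.data.frame(do.call(rbind, parts), stringsAsFactors = FALSE)
fields$V3 <- as.numeric(fields$V3)
```

This only works cleanly when every line splits into the same number of fields; otherwise rbind() recycles values with a warning.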
Re: [R] problem with colors
Hi Phil and Thomas,

Thanks for your helpful feedback. I must admit my solution for creating the vector of colors lacked your elegance. In brief, I saved the output of colors() into a text file, deleted all but 47 colours from that file, read it back as a data frame, and used the first column of the data frame as a vector of 47 colours. This roundabout method may have caused the problem, because when I chose colours according to the commands sent by both of you things seemed to work just fine.

Thank you very much for your feedback.
Anjan

On Thu, Oct 7, 2010 at 3:25 PM, Thomas Stewart tgstew...@gmail.com wrote:
> It would be helpful if you provided a more complete, reproducible example.
>
> Consider the following code. It colors the boxes according to the first 47 colors listed in the colors() vector.
>
> -tgs
>
>     data <- as.data.frame(matrix(rnorm(47*23), ncol=47))
>     boxplot(data, col=colors()[1:47])
>
> On Thu, Oct 7, 2010 at 2:22 PM, ANJAN PURKAYASTHA anjan.purkayas...@gmail.com wrote:
> > Hi,
> > I have a data set of 47 columns. I would like to create a boxplot for each column, each boxplot of a different colour. So I created a vector col1. This vector has a subset of the colors returned by colors(): red, cyan, green etc. Now I use the command boxplot(dataset, col=col1), expecting to see 47 boxplots, each of a different colour. Here is the problem: the boxplots are drawn correctly, but it seems that only the first few colours in col1 are being used in a repeated pattern. Does anybody have any ideas on how to tackle this?
> > Thanks in advance,
> > Anjan

--
anjan purkayastha, phd.
research associate
fas center for systems biology, harvard university
52 oxford street
cambridge ma 02138
phone-703.740.6939
Re: [R] Counting unique items in a list of matrices
If you just want a list of matrices and their counts, you can use Peter's list of matrices, L, and then:

With plyr:

    require(plyr)
    count(unlist(lapply(L, toString)))

Without plyr:

    as.data.frame(table(unlist(lapply(L, toString))))

Cheers,
Jeff.
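The string-key approach above returns counts keyed by text. If the unique matrices themselves are wanted alongside the counts, as the original question asks, one possible base-R sketch is the following; the names keys, unique_mats and unique_counts are assumptions for illustration.

```r
# Same kind of test data as in Peter's example
set.seed(4321)
L <- replicate(5000, matrix(round(runif(4), 1), 2, 2), simplify = FALSE)

# One string key per matrix, e.g. "0.3, 0.9, 0.4, 0"
keys <- vapply(L, toString, character(1))

# Frequency of each key
counts <- table(keys)

# Keep the first occurrence of each distinct matrix, in order of appearance
first <- !duplicated(keys)
unique_mats <- L[first]

# Counts aligned with unique_mats (looked up by key name)
unique_counts <- as.vector(counts[keys[first]])
```

unique_mats[[k]] then occurs unique_counts[k] times in L. The key comparison relies on the matrices holding values that print identically when equal, which holds here because the entries are rounded.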
Re: [R] Hausman test for endogeneity
Hello

On Sat, Oct 9, 2010 at 2:37 PM, Holger Steinmetz holger.steinm...@web.de wrote:
> can anybody point me in the right direction on how to conduct a Hausman test for endogeneity in simultaneous equation models?

Try

    install.packages('sos')
    require(sos)
    findFn('hausman')

Here I get these results:

    findFn('hausman')
    found 22 matches; retrieving 2 pages
    2

Liviu
Re: [R] own TAB expansion
On Saturday, 9 October 2010, 08:39:36, Deepayan Sarkar wrote:
> On Fri, Oct 8, 2010 at 6:19 AM, Sebastian Gibb li...@sebastiangibb.de wrote:
> > Hello Duncan,
> > thanks for your advice, but it doesn't work as expected:
> >
> >     setClass(Class="A", representation=representation(slotA="numeric", slotB="numeric"));
> >     setMethod("$", "A", function(x, name) {return(slot(x, name));})
> >     setGeneric(".DollarNames")
> >     setMethod(".DollarNames", signature(x="A"),
> >               function(x, pattern) grep(pattern=pattern, x=c("slotA", "slotB"), value=T))
> >     a <- new("A", slotA=1, slotB=2)
> >     a$sl TAB  # doesn't print slotA/slotB
> >     a$
> >
> > What am I doing wrong?
>
> There is a namespace issue with making .DollarNames() generic; basically, the completion code in the utils namespace never sees the new S4 generic. See a previous discussion at
>
> http://www.mail-archive.com/r-de...@r-project.org/msg20553.html
>
> Defining an S3 method should work (without the need for a dummy S3 class, even with inheritance, if you are working with R 2.12):
>
>     .DollarNames.A <- function(x, pattern) {
>         grep(pattern=pattern, x=c("slotA", "slotB"), value=T)
>     }
>
> -Deepayan

Hello Deepayan,

thanks for the link. This solution works for R 2.12.

Bye
Sebastian
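For reference, the S3 approach confirmed above can be exercised without pressing TAB by calling the completion generic directly. This is a sketch assuming a reasonably recent R; using slotNames() instead of a hard-coded vector and giving pattern a default are small variations, not part of Deepayan's suggestion.

```r
library(methods)
library(utils)

# S4 class with two slots, plus "$" access to the slots
setClass("A", representation(slotA = "numeric", slotB = "numeric"))
setMethod("$", "A", function(x, name) slot(x, name))

# S3 method for the completion generic utils::.DollarNames;
# slotNames() avoids repeating the slot names by hand
.DollarNames.A <- function(x, pattern = "") {
  grep(pattern, slotNames(x), value = TRUE)
}

a <- new("A", slotA = 1, slotB = 2)

# What the completion code would offer after typing a$sl and TAB
candidates <- .DollarNames(a, "^sl")
```

In an interactive session the same method is what makes a$sl followed by TAB offer slotA and slotB.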
Re: [R] StrSplit
Obviously Jim's solution does work, and I did not intend to imply it didn't. In fact, his read.table solution would work both if the OP had a semicolon-delimited file to begin with (which I was trying to say was ideal from a workflow standpoint) and if the OP had a vector of strings (when paired with textConnection). Using strsplit is merely another solution for the latter situation. I thought the OP might appreciate seeing how to use the function that they indicated they were having problems with.

Plus, I have a penchant for R-ishly unreadable code. ;)

Thanks for clarifying,
Jeff.
[R] GPS data!
Hello R-experts,

I have some coordinates that look like this:

    lat         long
    32 31.85    59 48.74
    34 05.7     58 50.79
    34 05.7     58 50.79
    34 05.7     58 50.79

This was my GPS setting at the time of the field trip. I assume that the second number in each coordinate is minutes + seconds. Am I right? I am looking for a function to convert them to decimal degrees.

I would appreciate any help.

All the best,
Mehdi
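One plausible reading of values such as "32 31.85" is whole degrees followed by decimal minutes, a common GPS display format; that interpretation is an assumption and should be checked against the device settings. Under it, the conversion is degrees plus minutes divided by 60, as in this sketch (the function name dm_to_decimal is hypothetical):

```r
# Convert "degrees decimal-minutes" strings (e.g. "32 31.85") to decimal degrees.
# Assumes the second number is minutes with a decimal fraction, not mm.ss.
dm_to_decimal <- function(x) {
  parts <- strsplit(trimws(x), "[[:space:]]+")
  vapply(parts, function(p) {
    deg <- as.numeric(p[1])
    minutes <- as.numeric(p[2])
    # Carry the sign of the degrees over to the minutes
    sgn <- if (deg < 0) -1 else 1
    sgn * (abs(deg) + minutes / 60)
  }, numeric(1))
}

lat <- dm_to_decimal(c("32 31.85", "34 05.7"))
lon <- dm_to_decimal(c("59 48.74", "58 50.79"))
```

If the second number really were minutes and seconds packed together (mm.ss), the fractional part would have to be divided by 3600 separately, so it is worth confirming the format first.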
[R] Plot time range with rect or boxplot
Hi,

I am trying to use rect (R 2.11) to plot a set of data as follows:

    data
      Company Pt    Pri     Pub
    1 A       WO520 8/5/09  2/11/10
    2 B       WO893 7/30/03 2/24/05
    3 A       WO258 12/8/08 6/17/10
    4 C       WO248 1/13/09 9/2/10

    pri <- strptime(pri, "%m/%d/%y")
    pub <- strptime(pub, "%m/%d/%y")
    plot.new()
    plot.window(xlim=c(min(pri,pub),max(pri,pub)), ylim=c(0,length(company)-1))
    y <- seq(0, 0.5*(length(company)-1), 0.5)
    h <- 0.1
    rect(pri, y-h, pub, y+h, col=c("light blue","pink","yellow","red"))

Neither xlim nor rect/boxplot recognizes pri/pub in date format. I wonder if there is a good way to deal with date plotting so that the x-axis reflects the actual time range.

Thank you,
Eric
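One way to approach this, sketched with base graphics only: parse the dates with as.Date() rather than strptime(), let plot() set up a date axis from the overall range, and pass the dates to rect() as plain numbers. The data frame below retypes the four rows from the question; the bar height h and the use of a null PDF device are choices made for the example.

```r
dat <- data.frame(Company = c("A", "B", "A", "C"),
                  Pt  = c("WO520", "WO893", "WO258", "WO248"),
                  Pri = c("8/5/09", "7/30/03", "12/8/08", "1/13/09"),
                  Pub = c("2/11/10", "2/24/05", "6/17/10", "9/2/10"),
                  stringsAsFactors = FALSE)

# Date objects are stored as days since 1970-01-01, so they plot like numbers
pri <- as.Date(dat$Pri, format = "%m/%d/%y")
pub <- as.Date(dat$Pub, format = "%m/%d/%y")

y <- seq_along(pri)   # one horizontal bar per row
h <- 0.3              # half-height of each bar

pdf(NULL)             # null device for non-interactive runs; omit on screen
# Empty plot spanning the full date range; the x-axis is labelled with dates
plot(range(c(pri, pub)), c(0.5, length(y) + 0.5), type = "n",
     xlab = "Date", ylab = "", yaxt = "n")
axis(2, at = y, labels = dat$Pt, las = 1)
rect(as.numeric(pri), y - h, as.numeric(pub), y + h,
     col = c("lightblue", "pink", "yellow", "red"))
dev.off()
```

The same pattern should work with POSIXct values from as.POSIXct() if times of day matter.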
Re: [R] Memory management in R
> My suggestion is to explore other alternatives. (I will admit that I don't yet fully understand the test that you are applying.)

Hi,
I am trying to partially implement the Lempel-Ziv compression algorithm. The point is that the compressibility and the entropy of a time series are related, hence my final goal is to evaluate the entropy of a time series. You can find more at

http://bit.ly/93zX4T
http://en.wikipedia.org/wiki/LZ77_and_LZ78
http://bit.ly/9NgIFt

> The two that have occurred to me are Biostrings, which I have already mentioned, and rle(), which I have illustrated the use of but not referenced as an avenue. The Biostrings package is part of Bioconductor (part of the R universe), although you should be prepared for a coffee break when you install it if you haven't got at least biocLite already installed. When I installed it last night it had 54 other package dependencies also downloaded and installed. It seems to me that taking advantage of the coding resources in the molecular biology domain that are currently directed at decoding the information storage mechanism of life might be a smart strategy. You have not described the domain you are working in, but I would guess that the digest package might be biological in primary application? So forgive me if I am preaching to the choir.
>
> The rle option also occurred to me, but it might take a smarter coder than I to fully implement it. (But maybe Holtman would be up to it. He's a _lot_ smarter than I.) In your example the long x string is faithfully represented by two aligned vectors, each 197 elements in length. The long repeat sequence that broke the grepl mechanism is just one pair of values.
>
>     rle(x)
>     Run Length Encoding
>       lengths: int [1:197] 1 1 2 1 1 4 1 9 1 1 ...
>       values : chr [1:197] "5d64d58a" "ac76183b" "202fbcc4" "78087f5e" ...
>
> So maybe as soon as you got to a bundle that was greater than 1/2 the overall length (as happened in the x case) you could stop, since it could not have occurred before.

I doubt that rle() can be deployed to replace the Lempel-Ziv (LZ) algorithm in a trivial way. As a less convoluted example, consider the series

    x <- c("d","a","b","d","a","b","e","z")

If i=4, so that the i-th element is the second 'd' in the series, the shortest series starting from i=4 that does not occur in the past of 'd' is d,a,b,e, whose length is 4, and that is the value returned by the function below. The frustrating thing is that I already have the tools I need; they just crash for reasons beyond my control on relatively short series. If anyone can make the function below more robust, that would be a really big help for me.

Cheers
Lorenzo

    ###
    entropy_lz <- function(x, i){
      past <- x[1:i-1]
      n <- length(x)
      lp <- length(past)
      future <- x[i:n]
      go_on <- 1
      count_len <- 0
      past_string <- paste(past, collapse="#")
      while (go_on > 0){
        new_seq <- x[i:(i+count_len)]
        fut_string <- paste(new_seq, collapse="#")
        count_len <- count_len+1
        if (grepl(fut_string, past_string) != 1){
          go_on <- -1
        }
      }
      return(count_len)
    }

    x <- c("c","a","b","c","a","b","e","z")
    S <- entropy_lz(x, 4)
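One way to make the search more robust, offered as a sketch rather than a drop-in replacement: avoid pasting the series into long strings and handing them to the regular-expression engine, and instead compare elements directly while tracking which start positions in the past still match. The function name entropy_lz_match and the choice to return NA when the series ends before an unseen subsequence is found are assumptions made here; missing values in x are not handled.

```r
# Length of the shortest subsequence starting at position i that does not
# occur entirely within x[1:(i-1)]. No regular expressions are involved,
# so long repeated runs cannot overflow the pattern matcher.
entropy_lz_match <- function(x, i) {
  n <- length(x)
  stopifnot(i >= 1, i <= n)
  starts <- seq_len(i - 1)  # candidate start positions in the past
  len <- 0                  # number of elements matched so far
  while (i + len <= n) {
    # a candidate survives only if its next element is still in the past...
    starts <- starts[starts + len <= i - 1]
    # ...and equals the next element of the sequence starting at i
    starts <- starts[x[starts + len] == x[i + len]]
    len <- len + 1
    if (length(starts) == 0) return(len)
  }
  NA_real_  # ran off the end of the series while still matching
}

x <- c("c", "a", "b", "c", "a", "b", "e", "z")
S <- entropy_lz_match(x, 4)

# A long repeated run of the kind that troubled the regex-based version
long_x <- c(rep("5d64d58a", 3000), "ac76183b")
S_long <- entropy_lz_match(long_x, 1001)
```

Staying with the string-based version, passing fixed = TRUE to grepl() is another option worth trying, since it bypasses regular-expression compilation altogether.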
[R] Loss of precision in read.csv.
Given a csv file from this location:

Airports <- "http://www.ourairports.com/data/airports.csv"
download.file(Airports, basename(Airports))
airports <- read.csv("airports.csv", encoding="UTF-8")
airports[1,]
    id ident     type              name latitude_deg longitude_deg elevation_ft continent iso_country iso_region municipality scheduled_service
1 6523   00A heliport Total Rf Heliport      40.0708      -74.9336           11        NA          US      US-PA     Bensalem                no
  gps_code iata_code local_code home_link wikipedia_link keywords
1      00A                  00A

And the precision is lost, which we can show by using readLines:

fred <- readLines("airports.csv")
fred[2]
[1] "6523,\"00A\",\"heliport\",\"Total Rf Heliport\",40.07080078125,-74.9336013793945,11,\"NA\",\"US\",\"US-PA\",\"Bensalem\",\"no\",\"00A\",,\"00A\",,,"

I tried various approaches: using colClasses, switching to read.table, specifying dec=".". I tested read.csv and it does preserve precision on my test case, but not on this data. Ideas?
[R] same random numbers in different sessions
Dear all

I'm using Xubuntu Lucid and I keep getting the same random numbers whenever I start a new session of R. For example, I keep getting

sample(1:1000, 1)
[1] 87

or

rnorm(1:10)
 [1] -1.3618103  0.4241701  1.0720076  0.2208145 -0.5375314 -0.4846588
 [7]  0.7576768  0.6527407 -0.6868786  0.8718527

I expected that some set.seed() instruction would be present in a config file in /usr/lib/R/etc/, but after grepping, the only reference turned up in Rprofile.site and it was commented out:

# set.seed(1234)

What else could be causing this?

Regards
Liviu

sessionInfo()
R version 2.11.1 (2010-05-31)
x86_64-pc-linux-gnu

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] fortunes_1.4-0 sos_1.3-0 brew_1.0-3 IPSUR_1.1

--
Do you know how to read? http://www.alienetworks.com/srtest.cfm http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader
Do you know how to write? http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail
Re: [R] Loss of precision in read.csv.
Hi Steven,

As near as I can tell, no precision is lost. R is just being courteous and not excessively filling our consoles. Try:

print(airports[1, "latitude_deg"], digits = 22)

which is the most digits R will print (a double only carries about 15 to 17 significant digits, so the trailing ones are not meaningful). Alternatively, you can convert it to character class:

as.character(airports[1, ])

So in short, this is just a cosmetic feature of presenting the data, not its actual storage.

Cheers,
Josh

On Sat, Oct 9, 2010 at 1:33 PM, steven mosher mosherste...@gmail.com wrote:

Given a csv file from this location:

Airports <- "http://www.ourairports.com/data/airports.csv"
download.file(Airports, basename(Airports))
airports <- read.csv("airports.csv", encoding="UTF-8")
airports[1,]
    id ident     type              name latitude_deg longitude_deg elevation_ft continent iso_country iso_region municipality scheduled_service
1 6523   00A heliport Total Rf Heliport      40.0708      -74.9336           11        NA          US      US-PA     Bensalem                no
  gps_code iata_code local_code home_link wikipedia_link keywords
1      00A                  00A

And the precision is lost, which we can show by using readLines:

fred <- readLines("airports.csv")
fred[2]
[1] "6523,\"00A\",\"heliport\",\"Total Rf Heliport\",40.07080078125,-74.9336013793945,11,\"NA\",\"US\",\"US-PA\",\"Bensalem\",\"no\",\"00A\",,\"00A\",,,"

I tried various approaches: using colClasses, switching to read.table, specifying dec=".". I tested read.csv and it does preserve precision on my test case, but not on this data. Ideas?

--
Joshua Wiley Ph.D.
Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/
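The display-versus-storage point can be checked without downloading the airports file. A minimal sketch, assuming only the latitude value shown in the readLines() output in this thread; the variable names lat, short and long are illustrative and not from the original posts:

```r
# Latitude copied from the raw CSV line quoted in the thread.
lat <- 40.07080078125

# Default console printing uses 7 significant digits (getOption("digits")).
short <- format(lat, digits = 7)

# Requesting more digits shows the stored value still has them.
long <- format(lat, digits = 15)

print(short)
print(long)
```

If the two strings differ only in how many digits are shown, nothing was lost when the file was read; the shorter form is purely a printing default.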
Re: [R] same random numbers in different sessions
-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Liviu Andronic Sent: Saturday, October 09, 2010 2:15 PM To: r-help@r-project.org Help Subject: [R] same random numbers in different sessions Dear all I'm using Xubuntu Lucid and I keep getting the same random numbers whenever I start a new session of R. For example, I keep getting sample(1:1000, 1) [1] 87 or rnorm(1:10) [1] -1.3618103 0.4241701 1.0720076 0.2208145 -0.5375314 -0.4846588 [7] 0.7576768 0.6527407 -0.6868786 0.8718527 I expected that some set.seed() instruction woudl be present in a config file in /usr/lib/R/etc/ but after grepping the only reference came out in Rprofile.site and it was commented out: # set.seed(1234) What else could be causing this? Regards Liviu Could you be reloading a workspace at start-up that is setting the seed? What happens if you start R using the --vanilla option? Dan Daniel Nordlund Bothell, WA USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
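Dan's workspace suggestion can be illustrated in a single session. This is only a sketch of the mechanism, assuming the repeated numbers come from a saved .Random.seed being restored at start-up (the thread does not confirm that this was the cause); the names saved, a, b and fresh are illustrative:

```r
# Make sure a seed exists, then keep a copy of the generator state,
# much as a saved workspace (.RData) would.
set.seed(1)
saved <- get(".Random.seed", envir = globalenv())

a <- runif(3)

# Restoring the old state, as reloading a workspace does, replays the draws.
assign(".Random.seed", saved, envir = globalenv())
b <- runif(3)

print(identical(a, b))

# Removing the stored state makes R create a new seed on the next draw.
rm(".Random.seed", envir = globalenv())
fresh <- runif(3)
```

Starting R with --vanilla, as suggested, skips restoring the workspace and so avoids the replay in the first place.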
Re: [R] same random numbers in different sessions
Dear Liviu, On Sat, Oct 9, 2010 at 5:14 PM, Liviu Andronic landronim...@gmail.com wrote: Dear all I'm using Xubuntu Lucid and I keep getting the same random numbers whenever I start a new session of R. For example, I keep getting sample(1:1000, 1) [1] 87 or rnorm(1:10) [1] -1.3618103 0.4241701 1.0720076 0.2208145 -0.5375314 -0.4846588 [7] 0.7576768 0.6527407 -0.6868786 0.8718527 I expected that some set.seed() instruction woudl be present in a config file in /usr/lib/R/etc/ but after grepping the only reference came out in Rprofile.site and it was commented out: # set.seed(1234) What else could be causing this? Regards Liviu sessionInfo() R version 2.11.1 (2010-05-31) x86_64-pc-linux-gnu locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=en_US.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] fortunes_1.4-0 sos_1.3-0 brew_1.0-3 IPSUR_1.1 I notice that you have the IPSUR package loaded; you know, just a shot in the dark here, but did you try not loading it? I ask because the vignette is built by making a special choice for set.seed, and the workspace that ships with the package might be interacting in an unexpected way. Please let me know if IPSUR is the culprit. Regards, Jay __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] GPS data!
No need for a function; you can just write the expression yourself:

x <- read.table(textConnection("32 31.85
59 48.74
34 05.7
58 50.79
34 05.7
58 50.79
34 05.7
58 50.79"))
closeAllConnections()
x
  V1    V2
1 32 31.85
2 59 48.74
3 34  5.70
4 58 50.79
5 34  5.70
6 58 50.79
7 34  5.70
8 58 50.79
# convert
x$V1 + x$V2 / 60
[1] 32.53083 59.81233 34.09500 58.84650 34.09500 58.84650 34.09500 58.84650

On Sat, Oct 9, 2010 at 2:49 PM, Mehdi Zarrei gagzar...@yahoo.com wrote:

Hello R-experts, I have some coordinates that look like this:

lat        long
32 31.85   59 48.74
34 05.7    58 50.79
34 05.7    58 50.79
34 05.7    58 50.79

This was my GPS setting at the time of the field trip. I assume that the second column is minutes + seconds. Am I right? I am looking for a function to convert them to decimal degrees. I would appreciate any help.

All the best,
Mehdi

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem that you are trying to solve?
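For anyone who does prefer a reusable function, the same expression can be wrapped up. A minimal sketch: the name dm_to_decimal is hypothetical, and it assumes the second number is decimal minutes (as the division by 60 in the reply implies, rather than minutes plus seconds) and that the coordinates are non-negative:

```r
# Convert degrees plus decimal minutes to decimal degrees.
# Assumes non-negative coordinates; southern or western values
# would need the sign applied to the whole result.
dm_to_decimal <- function(deg, min) {
  deg + min / 60
}

lat  <- dm_to_decimal(c(32, 34), c(31.85, 5.7))
long <- dm_to_decimal(c(59, 58), c(48.74, 50.79))

print(round(lat, 5))
print(round(long, 5))
```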
Re: [R] GPS data!
Have you tried sos:

install.packages('sos') # if not already installed
library(sos)
(gps <- ???GPS)

This found 63 matches for me right now. The results open as a table in a web browser, with the package with the most matches first and with hot links to the help page for each match in the right-hand column. Hope this helps.

Spencer

On 10/9/2010 2:54 PM, jim holtman wrote:

No need for a function; you can just write the expression yourself:

x <- read.table(textConnection("32 31.85
59 48.74
34 05.7
58 50.79
34 05.7
58 50.79
34 05.7
58 50.79"))
closeAllConnections()
x
  V1    V2
1 32 31.85
2 59 48.74
3 34  5.70
4 58 50.79
5 34  5.70
6 58 50.79
7 34  5.70
8 58 50.79
# convert
x$V1 + x$V2 / 60
[1] 32.53083 59.81233 34.09500 58.84650 34.09500 58.84650 34.09500 58.84650

On Sat, Oct 9, 2010 at 2:49 PM, Mehdi Zarrei gagzar...@yahoo.com wrote:

Hello R-experts, I have some coordinates that look like this:

lat        long
32 31.85   59 48.74
34 05.7    58 50.79
34 05.7    58 50.79
34 05.7    58 50.79

This was my GPS setting at the time of the field trip. I assume that the second column is minutes + seconds. Am I right? I am looking for a function to convert them to decimal degrees. I would appreciate any help.

All the best,
Mehdi

--
Spencer Graves, PE, PhD
President and Chief Operating Officer
Structure Inspection and Monitoring, Inc.
751 Emerson Ct. San José, CA 95126
ph: 408-655-4567
Re: [R] same random numbers in different sessions
You need to set the set.seed yourself. There are some simulation where I do want the same numbers generated and can use the set.seed to set it to a know value. If you want something random each time, then use the time of day in the call to set.seed. On Sat, Oct 9, 2010 at 5:14 PM, Liviu Andronic landronim...@gmail.com wrote: Dear all I'm using Xubuntu Lucid and I keep getting the same random numbers whenever I start a new session of R. For example, I keep getting sample(1:1000, 1) [1] 87 or rnorm(1:10) [1] -1.3618103 0.4241701 1.0720076 0.2208145 -0.5375314 -0.4846588 [7] 0.7576768 0.6527407 -0.6868786 0.8718527 I expected that some set.seed() instruction woudl be present in a config file in /usr/lib/R/etc/ but after grepping the only reference came out in Rprofile.site and it was commented out: # set.seed(1234) What else could be causing this? Regards Liviu sessionInfo() R version 2.11.1 (2010-05-31) x86_64-pc-linux-gnu locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=en_US.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] fortunes_1.4-0 sos_1.3-0 brew_1.0-3 IPSUR_1.1 -- Do you know how to read? http://www.alienetworks.com/srtest.cfm http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader Do you know how to write? http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? 
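Both uses of set.seed() mentioned in this reply can be sketched briefly. The seed value 1234 is simply the one from the commented-out Rprofile.site line, and seeding from Sys.time() is one possible reading of "use the time of day"; neither is prescribed by the thread:

```r
# Fixed seed: the same draw is produced every time.
set.seed(1234)
first <- sample(1:1000, 1)
set.seed(1234)
second <- sample(1:1000, 1)
print(identical(first, second))

# Time-based seed: differs from one run to the next
# (second resolution, so two runs within the same second would match).
set.seed(as.integer(Sys.time()))
varying <- sample(1:1000, 1)
```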
Re: [R] Plot time range with rect or boxplot
Try this. You also had some typos in the names and weren't using the data frame correctly.

x <- read.table(textConnection("Company Pt Pri Pub
1 A WO520 8/5/09 2/11/10
2 B WO893 7/30/03 2/24/05
3 A WO258 12/8/08 6/17/10
4 C WO248 1/13/09 9/2/10"), header = TRUE, as.is = TRUE)
closeAllConnections()
x$Pri <- as.Date(x$Pri, format = '%m/%d/%y')
x$Pub <- as.Date(x$Pub, format = '%m/%d/%y')
y <- seq(0, 0.5*(length(x$Company)-1), 0.5)
h <- 0.1
plot(range(x$Pri, x$Pub), c(0, nrow(x) - 1), type = 'n')
rect(x$Pri, y-h, x$Pub, y+h, col=c("light blue","pink","yellow","red"))

On Sat, Oct 9, 2010 at 3:10 PM, Eric Hu eric...@gilead.com wrote:

Hi, I am trying to use rect (R 2.11) to plot a set of data as follows:

data
  Company Pt    Pri     Pub
1 A       WO520 8/5/09  2/11/10
2 B       WO893 7/30/03 2/24/05
3 A       WO258 12/8/08 6/17/10
4 C       WO248 1/13/09 9/2/10

pri <- strptime(pri, "%m/%d/%y")
pub <- strptime(pub, "%m/%d/%y")
plot.new()
plot.window(xlim=c(min(pri,pub),max(pri,pub)), ylim=c(0,length(company)-1))
%y <- seq(0,0.5*(length(company)-1),0.5)
%h <- 0.1
%rect(pri, y-h, pub, y+h, col=c("light blue","pink","yellow","red"))

Neither xlim nor rect/boxplot recognizes pri/pub in date format. I wonder if there is a good way to deal with date plotting so that the x-axis reflects the actual time range.

Thank you,
Eric

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem that you are trying to solve?
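One likely reason the switch from strptime() to as.Date() matters is how the two classes are stored. A small sketch using the Pri dates from the thread; the variable names raw, lt and d are illustrative, and this is offered as a plausible explanation rather than a diagnosis of the original error:

```r
raw <- c("8/5/09", "7/30/03", "12/8/08", "1/13/09")

# strptime() returns POSIXlt, which is a list of fields underneath.
lt <- strptime(raw, "%m/%d/%y")
print(is.list(unclass(lt)))

# as.Date() returns a plain number of days since 1970-01-01,
# which low-level functions such as rect() can use as coordinates.
d <- as.Date(raw, format = "%m/%d/%y")
print(is.numeric(unclass(d)))

# range() keeps the Date class, so it can feed the plot limits.
print(range(d))
```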
Re: [R] Memory management in R
On Oct 9, 2010, at 4:23 PM, Lorenzo Isella wrote: My suggestion is to explore other alternatives. (I will admit that I don't yet fully understand the test that you are applying.) Hi, I am trying to partially implement the Lempel Ziv compression algorithm. The point is that compressibility and entropy of a time series are related, hence my final goal is to evaluate the entropy of a time series. You can find more at http://bit.ly/93zX4T http://en.wikipedia.org/wiki/LZ77_and_LZ78 http://bit.ly/9NgIFt The two that have occurred to me are Biostrings which I have already mentioned and rle() which I have illustrated the use of but not referenced as an avenue. The Biostrings package is part of bioConductor (part of the R universe) although you should be prepared for a coffee break when you install it if you haven't gotten at least bioClite already installed. When I installed it last night it had 54 other package dependents also downloaded and installed. It seems to me that taking advantage of the coding resources in the molecular biology domain that are currently directed at decoding the information storage mechanism of life might be a smart strategy. You have not described the domain you are working in but I would guess that the digest package might be biological in primary application? So forgive me if I am preaching to the choir. The rle option also occurred to me but it might take a smarter coder than I to fully implement it. (But maybe Holtman would be up to it. He's a _lot_ smarter than I.) In your example the long x string is faithfully represented by two aligned vectors, each 197 characters in length. The long repeat sequence that broke the grepl mechanism are just one pair of values. rle(x) Run Length Encoding lengths: int [1:197] 1 1 2 1 1 4 1 9 1 1 ... values : chr [1:197] 5d64d58a ac76183b 202fbcc4 78087f5e ... 
So maybe as soon as you got to a bundle that was greater than 1/2 the overall length (as happened in the x case) you could stop, since it could not have occurred before. I doubt that rle() can be deployed to replace Lempel-Ziv (LZ) algorithm in a trivial way. As a less convoluted example, consider the series x - c(d,a,b,d,a,b,e,z) If i=4 and therefore the i-th element is the second 'd' in the series, the shortest series starting from i=4 that I do not see in the past of 'd' is d,a,b,e, whose length is equal to 4 and that is the value returned by the function below. The frustrating thing is that I already have the tools I need, just they crash for reasons beyond my control on relatively short series. If anyone can make the function below more robust, that is really a big help for me. I already offered the Biostrings package. It provides more robust methods for string matching than does grepl. Is there a reason that you choose not to? -- David. Cheers Lorenzo ### entropy_lz - function(x,i){ past - x[1:i-1] n - length(x) lp - length(past) future - x[i:n] go_on - 1 count_len - 0 past_string - paste(past, collapse=#) while (go_on0){ new_seq - x[i:(i+count_len)] fut_string - paste(new_seq, collapse=#) count_len - count_len+1 if (grepl(fut_string,past_string)!=1){ go_on - -1 } } return(count_len) } x - c(c,a,b,c,a,b,e,z) S - entropy_lz(x,4) David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
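As one way of making the function more robust without a regular-expression engine, the comparison can be done on the vectors themselves instead of on pasted strings. This is a sketch, not Lorenzo's or David's code: the helper occurs_in is hypothetical, the input is assumed to contain no NA values, and it is not claimed to fix the specific crash reported in the thread. Comparing elements directly also avoids partial matches across the "#" separators.

```r
# TRUE if `needle` appears as a contiguous run inside `haystack`.
occurs_in <- function(needle, haystack) {
  k <- length(needle)
  m <- length(haystack)
  if (k > m) return(FALSE)
  for (s in seq_len(m - k + 1)) {
    if (isTRUE(all(haystack[s:(s + k - 1)] == needle))) return(TRUE)
  }
  FALSE
}

# Length of the shortest run starting at position i that does not
# occur anywhere in x[1:(i - 1)].
entropy_lz <- function(x, i) {
  n <- length(x)
  past <- x[seq_len(i - 1)]
  len <- 1
  # Grow the candidate while it still fits in x and is found in the past.
  while (i + len - 1 <= n && occurs_in(x[i:(i + len - 1)], past)) {
    len <- len + 1
  }
  len
}

x <- c("c", "a", "b", "c", "a", "b", "e", "z")
S <- entropy_lz(x, 4)
print(S)
```

On the example from the thread this returns the same length as the original function; the explicit bound on the loop also stops it from indexing past the end of the series.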
[R] GC verbose=false still showing report
I must be reading the help file for gc() wrong. I thought it said that gc(verbose=FALSE) will run the garbage collection without printing the Ncells/Vcells summary. However, this is what I get:

gc(verbose = FALSE)
         used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells 267097 14.3     531268  28.4   531268  28.4
Vcells 429302  3.3   20829406 159.0 55923977 426.7

I'm embedding this in an Sweave/TeX file, so I *really* can't have this printing out. Suggestions other than manually editing the TeX file?

Robin Jeffries
MS, DrPH Candidate
Department of Biostatistics
UCLA
530-624-0428
Re: [R] GC verbose=false still showing report
Try invisible(gc()) ?

Robin Jeffries rjeffr...@ucla.edu wrote:

I must be reading the help file for gc() wrong. I thought it said that gc(verbose=FALSE) will run the garbage collection without printing the Ncells/Vcells summary. However, this is what I get:

gc(verbose = FALSE)
         used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells 267097 14.3     531268  28.4   531268  28.4
Vcells 429302  3.3   20829406 159.0 55923977 426.7

I'm embedding this in an Sweave/TeX file, so I *really* can't have this printing out. Suggestions other than manually editing the TeX file?

Robin Jeffries
MS, DrPH Candidate
Department of Biostatistics
UCLA
530-624-0428

---
Jeff Newmiller
DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.
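The reason invisible() helps is that the table is the return value of gc() being auto-printed, not the verbose report. A small sketch that checks this with capture.output(); the variable names shown and hidden are illustrative:

```r
# The summary matrix is gc()'s return value, so it auto-prints
# even with verbose = FALSE.
shown <- capture.output(gc(verbose = FALSE))

# Wrapping the call in invisible() suppresses the auto-print.
hidden <- capture.output(invisible(gc()))

print(length(shown) > 0)
print(length(hidden))
```

In an Sweave document, the results=hide chunk option is another way to keep printed output out of the TeX file.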
Re: [R] svg plot and dashed lines
Hi

On 29/09/2010 11:15 p.m., Ivan Calandra wrote:

Dear users, When I boxplot(), the lines of the whiskers are dashed. However, when I save to an svg file, the lines of the whiskers are no longer dashed. How can I get the dashed lines in the svg file? I don't have this problem with a ps file, but I cannot edit such a file as easily as an svg file. That's why I'd like to stick to the svg format.

Assuming you're on Windows, you could try something like ...

# Install the 'Cairo' package from CRAN
library(Cairo)
CairoSVG("test.svg")
boxplot(b~a, data=df)
dev.off()

... on a modern Linux system, this should simplify to ...

svg("test.svg")
boxplot(b~a, data=df)
dev.off()

Paul

Thanks in advance, Ivan

df <- structure(list(a = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L), .Label = c("A", "B"), class = "factor"), b = c(0.904439748839731, -0.855322875817714, -0.957288625102814, 0.130401502975395, -1.27765131101282, -2.08861064654457, 1.10234256081394, -2.05533035069656, -1.04529859053820, -0.0847903566670016, 1.02553030160793, 0.321170740199536, 1.87419854190502, -0.891404432182873, 0.968745913802415, -0.85229752730528, 0.641555656821046, 1.72455661053506, -0.523097596614304, 1.26729031187194)), .Names = c("a", "b"), row.names = c(NA, -20L), class = "data.frame")

library(RSvgDevice)
devSVG(file="test.svg")
boxplot(b~a, data=df)
dev.off()

--
Dr Paul Murrell
Department of Statistics
The University of Auckland
Private Bag 92019
Auckland New Zealand
64 9 3737599 x85392
p...@stat.auckland.ac.nz
http://www.stat.auckland.ac.nz/~paul/
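A self-contained version of the svg() route might look like the following. It is a sketch under some assumptions: the data are random stand-ins rather than Ivan's df, the file goes to a temporary directory, and the device is only opened when this build of R reports cairo support. Whether the whiskers come out dashed still has to be checked by opening the file.

```r
# Stand-in data with the same shape as the posted data frame.
df <- data.frame(a = factor(rep(c("A", "B"), 10)), b = rnorm(20))

out <- file.path(tempdir(), "test.svg")

# svg() from grDevices relies on cairo; skip quietly if unavailable.
if (capabilities("cairo")) {
  svg(out)
  boxplot(b ~ a, data = df)
  dev.off()
}
```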
Re: [R] StrSplit
Thanks Jim. Exactly what I needed! -Original Message- From: jim holtman [mailto:jholt...@gmail.com] Sent: 09 October 2010 22:01 To: Santosh Srinivas Cc: r-help@r-project.org Subject: Re: [R] StrSplit Is this what you are after: x - c(Scheme Code;Scheme Name;Net Asset Value;Repurchase Price;Sale Price;Date + , + ,Open Ended Schemes ( Liquid ) + , + , + , AIG Global Investment Group Mutual Fund + , 106506;AIG India Liquid Fund-Institutional Plan-Daily Dividend Option;1001.;1001.;1001.;02-Oct-2010 + , 106511;AIG India Liquid Fund-Institutional Plan-Growth Option;1210.4612;1210.4612;1210.4612;02-Oct-2010 + , 106507;AIG India Liquid Fund-Institutional Plan-Weekly Dividend Option;1001.8765;1001.8765;1001.8765;02-Oct-2010 + , 106503;AIG India Liquid Fund-Retail Plan-DailyDividend Option;1001.;1001.;1001.;02-Oct-2010) myData - read.table(textConnection(x[7:10]), sep=';') closeAllConnections() str(myData) 'data.frame': 4 obs. of 6 variables: $ V1: int 106506 106511 106507 106503 $ V2: Factor w/ 4 levels AIG India Liquid Fund-Institutional Plan-Daily Dividend Option,..: 1 2 3 4 $ V3: num 1001 1210 1002 1001 $ V4: num 1001 1210 1002 1001 $ V5: num 1001 1210 1002 1001 $ V6: Factor w/ 1 level 02-Oct-2010: 1 1 1 1 myData V1 V2 V3 V4 V5 V6 1 106506 AIG India Liquid Fund-Institutional Plan-Daily Dividend Option 1001.000 1001.000 1001.000 02-Oct-2010 2 106511 AIG India Liquid Fund-Institutional Plan-Growth Option 1210.461 1210.461 1210.461 02-Oct-2010 3 106507 AIG India Liquid Fund-Institutional Plan-Weekly Dividend Option 1001.876 1001.876 1001.876 02-Oct-2010 4 106503 AIG India Liquid Fund-Retail Plan-DailyDividend Option 1001.000 1001.000 1001.000 02-Oct-2010 On Sat, Oct 9, 2010 at 12:18 PM, Santosh Srinivas santosh.srini...@gmail.com wrote: Newbie question ... I am looking something equivalent to read.delim but which accepts a text line as parameter instead of a file input. 
Below is my problem, I'm unable to get the exact output which is a simple data frame of the data where the delimiter exists ... coming quite close though I have a data frame with 10 lines called MF_Data MF_Data [1:10] [1] Scheme Code;Scheme Name;Net Asset Value;Repurchase Price;Sale Price;Date [2] [3] Open Ended Schemes ( Liquid ) [4] [5] [6] AIG Global Investment Group Mutual Fund [7] 106506;AIG India Liquid Fund-Institutional Plan-Daily Dividend Option;1001.;1001.;1001.;02-Oct-2010 [8] 106511;AIG India Liquid Fund-Institutional Plan-Growth Option;1210.4612;1210.4612;1210.4612;02-Oct-2010 [9] 106507;AIG India Liquid Fund-Institutional Plan-Weekly Dividend Option;1001.8765;1001.8765;1001.8765;02-Oct-2010 [10] 106503;AIG India Liquid Fund-Retail Plan-DailyDividend Option;1001.;1001.;1001.;02-Oct-2010 Now for the lines below .. they are delimted by ; ... I am using tempTxt - MF_Data[7] MF_Data_F - unlist(strsplit(tempTxt,;, fixed = TRUE)) tempTxt - MF_Data[8] MF_Data_F1 - unlist(strsplit(tempTxt,;, fixed = TRUE)) MF_Data_F - rbind(MF_Data_F,MF_Data_F1) But MF_Data_F is not a simple 2X6 data frame which is what I want __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
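The two approaches in this thread can be put side by side. A sketch using two of the quoted data lines; it uses the text argument of read.table() rather than textConnection(), which is a substitution on my part, and the variable names are illustrative:

```r
mf_lines <- c(
  "106506;AIG India Liquid Fund-Institutional Plan-Daily Dividend Option;1001.;1001.;1001.;02-Oct-2010",
  "106511;AIG India Liquid Fund-Institutional Plan-Growth Option;1210.4612;1210.4612;1210.4612;02-Oct-2010"
)

# read.table() parses the strings directly and converts column types.
myData <- read.table(text = mf_lines, sep = ";", stringsAsFactors = FALSE)
print(dim(myData))

# strsplit() gives one character vector per line; rbind-ing the list
# elements yields a character matrix, which still needs type conversion.
parts <- do.call(rbind, strsplit(mf_lines, ";", fixed = TRUE))
print(dim(parts))
```

The read.table() route does the numeric conversion for free, which is probably why it was the suggested answer.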
Re: [R] GC verbose=false still showing report
invisible(gc()) worked perfectly. Thanks Jeff. @ Josh: I know how to toggle showing/hiding command echos, but I haven't figured out how to toggle on/off any printed output. On Sat, Oct 9, 2010 at 5:10 PM, Robin Jeffries rjeffr...@ucla.edu wrote: I must be reading the help file for gc() wrong. I thought it said that gc(verbose=FALSE) will run the garbage collection without printing the Ncells/Vcells summary. However, this is what I get: gc(verbose = FALSE) used (Mb) gc trigger (Mb) max used (Mb) Ncells 267097 14.3 531268 28.4 531268 28.4 Vcells 429302 3.3 20829406 159.0 55923977 426.7 I'm embedding this in an Sweave/TeX file, so I *really* can't have this printing out. Suggestions other than manually editing the TeX file? Robin Jeffries MS, DrPH Candidate Department of Biostatistics UCLA 530-624-0428 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Loss of precision in read.csv.
Ha Thanks, That was it. On Sat, Oct 9, 2010 at 2:38 PM, Joshua Wiley jwiley.ps...@gmail.com wrote: Hi Steven, As near as I can tell, no precision is lost. R is just being courteous and not excessively filling our consoles. Try: print(airports[1,latitude_deg], digits = 22) which is the most digits R will print (although internally it can store more I believe). Alternately, you can convert it to character class: as.character(airports[1, ]) So in short, this is just a cosmetic feature of presenting the data, not its actual storage. Cheers, Josh On Sat, Oct 9, 2010 at 1:33 PM, steven mosher mosherste...@gmail.com wrote: Given a csv file from this location Airports-http://www.ourairports.com/data/airports.csv; download.file(Airports,basename(Airports)) airports -read.csv(airports.csv,encoding=UTF-8) airports[1,] id ident type name latitude_deg longitude_deg elevation_ft continent iso_country iso_region municipality scheduled_service 1 6523 00A heliport Total Rf Heliport *40.0708 -74.9336 * 11 NA US US-PA Bensalemno gps_code iata_code local_code home_link wikipedia_link keywords 1 00A 00A And the precision is lost which we can show by using readLines: fred-readLines(airports.csv) fred[2] [1] 6523,\00A\,\heliport\,\Total Rf Heliport\,* 40.07080078125,-74.9336013793945* ,11,\NA\,\US\,\US-PA\,\Bensalem\,\no\,\00A\,,\00A\,,, I tried various approaches, using colClasses, switching to read.tables, specifying dec=. I tested read.csv and it does preserve precision on my test case, but not on this data. Ideas? [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joshua Wiley Ph.D. 
Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/
[R] read.table issue
Dear R-Group,

I am getting the error message "incomplete final line found by readTableHeader" in the code below. It seems to me that the error is caused by the quote character in the text data. Is there an easy way to handle this, or should I substitute the character?

tempTxt <- "100589;Canara Robeco Expo-Income Plan;18.92;18.92;19.35;02-Apr-2007"
read.table(textConnection(tempTxt), sep=';')
      V1                             V2    V3    V4    V5          V6
1 100589 Canara Robeco Expo-Income Plan 18.92 18.92 19.35 02-Apr-2007

tempTxt <- "103272;Canara Robeco Fortune '94;30.07;30.07;30.75;02-Apr-2007"
read.table(textConnection(tempTxt), sep=';')
Error in read.table(textConnection(tempTxt), sep = ";") : incomplete final line found by readTableHeader on 'tempTxt'

Thanks,
Santosh
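The guess about the quote character is consistent with read.table()'s defaults, which treat both " and ' as quoting characters, so the apostrophe in '94 opens a quoted string that never closes. One possible fix, sketched here with the text argument instead of textConnection() (my substitution), is to switch quoting off:

```r
tempTxt <- "103272;Canara Robeco Fortune '94;30.07;30.07;30.75;02-Apr-2007"

# quote = "" tells read.table() to treat ' and " as ordinary characters.
res <- read.table(text = tempTxt, sep = ";", quote = "",
                  stringsAsFactors = FALSE)
print(res)
```

If some fields are genuinely wrapped in double quotes, quote = "\"" would keep those working while still ignoring apostrophes.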
[R] Help needed for getYahooData in TTR package writing the Yahoo data to excel
Dear all, I'm totally new to R. Recently I've been trying to use getYahooData in TTR package in order to download stock index daily open/high/low/close. The downloaded data is in the format of Open High Low Close Volume 2000-01-04 18937.45 19187.61 18937.45 19002.86 0 2000-01-05 19003.51 19003.51 18221.82 18542.55 0 2000-01-06 18574.01 18582.74 18168.27 18168.27 0 2000-01-07 18194.05 18285.73 18068.10 18193.41 0 2000-01-11 18246.10 18887.56 18246.10 18850.92 0 2000-01-12 18780.17 18811.87 18626.92 18677.42 0 2000-01-13 18667.18 18845.03 18667.18 18833.29 0 2000-01-14 18882.99 19058.02 18733.83 18956.55 0 2000-01-17 19025.62 19442.58 19025.62 19437.23 0 2000-01-18 19412.47 19412.47 19145.17 19196.57 0 However, when I attempted to write the data to excel using write.table, dates in the first colume become 1,2,3,4 in the excel file. Same problem happened if write.csv was used. If you run these two lines of code you'll get what I meant.. before running the code, package TTR needs to be loaded. N225 - getYahooData(^N225, 2101, ) write.table(N225,Nikkei.xls,sep='\t', row.name = TRUE , col.name = NA) Appreciate your kind assistance! Thanks a lot in advance. -- View this message in context: http://r.789695.n4.nabble.com/Help-needed-for-getYahooData-in-TTR-package-writing-the-Yahoo-data-to-excel-tp2970017p2970017.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help needed for getYahooData in TTR package writing the Yahoo data to excel
On Oct 9, 2010, at 10:54 PM, missvanilla wrote:

> Dear all,
>
> I'm totally new to R. Recently I've been trying to use getYahooData in the TTR package to download daily open/high/low/close data for a stock index. The downloaded data is in this format:
>
>                Open     High      Low    Close Volume
> 2000-01-04 18937.45 19187.61 18937.45 19002.86      0
> 2000-01-05 19003.51 19003.51 18221.82 18542.55      0
> 2000-01-06 18574.01 18582.74 18168.27 18168.27      0
> 2000-01-07 18194.05 18285.73 18068.10 18193.41      0
> 2000-01-11 18246.10 18887.56 18246.10 18850.92      0
> 2000-01-12 18780.17 18811.87 18626.92 18677.42      0
> 2000-01-13 18667.18 18845.03 18667.18 18833.29      0
> 2000-01-14 18882.99 19058.02 18733.83 18956.55      0
> 2000-01-17 19025.62 19442.58 19025.62 19437.23      0
> 2000-01-18 19412.47 19412.47 19145.17 19196.57      0
>
> However, when I attempted to write the data to Excel using write.table, the dates in the first column became 1, 2, 3, 4 in the Excel file. The same problem happened when write.csv was used. If you run these two lines of code you'll see what I mean. Before running the code, the TTR package needs to be loaded.
>
> N225 <- getYahooData("^N225", 2101, )
> write.table(N225, "Nikkei.xls", sep='\t', row.name = TRUE, col.name = NA)

There is a well-described problem with write.table files going into Excel. There is no leading item or tab on the first row. You need to insert an extra cell and move the header over one position. Then you won't be misinterpreting your row.names as dates.

--
David

> I appreciate your kind assistance! Thanks a lot in advance.
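One way to avoid depending on row names at all is to write the dates as an ordinary column. The following is only a minimal sketch in base R: `dates`, `prices`, `out` and `outfile` are made-up names, the two price columns are copied from the sample shown in the question, and it assumes the dates are available as a separate vector. How to pull them out of the object returned by getYahooData depends on that object's class, which the thread does not state.

```r
# Hypothetical stand-in for the downloaded series: a numeric matrix with
# no row names, plus the dates held in a separate vector.
dates  <- as.Date(c("2000-01-04", "2000-01-05", "2000-01-06"))
prices <- matrix(c(18937.45, 19003.51, 18574.01,
                   19187.61, 19003.51, 18582.74),
                 nrow = 3,
                 dimnames = list(NULL, c("Open", "High")))

# Make the dates a regular column so they no longer live in the row names.
out <- data.frame(Date = format(dates, "%Y-%m-%d"), prices)

# Tab-separated text; with row.names = FALSE the header carries exactly
# one label per column, so nothing shifts when the file is opened.
outfile <- tempfile(fileext = ".txt")
write.table(out, outfile, sep = "\t", row.names = FALSE, quote = FALSE)

readLines(outfile)
```

The file written here is plain tab-separated text, so a .txt or .tsv extension describes it more accurately than .xls, although Excel can open either.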
[R] read.xls??
Greetings all,

I am having a little trouble finding the 'right' package that will read in .xls Excel spreadsheets. My Ubuntu base does not seem to have the ability to read them. Any suggestions?

Cheers,
M
[R] How to add a new column to a matrix?
Hi - I am a beginner to the R language. I have written the following matrix:

Z.mat=matrix(c(2,2,2,1,1,1,3,2,1,6,5,4,9,1,1,2,3,2), nrow=6)

I would like to add a 4th column consisting of: 6, 9, 8, 15, 16, 17

I would also like to name each column a, b, c, d as well.

Thanks!
Re: [R] read.xls??
On Sat, Oct 9, 2010 at 11:56 PM, Matt Curcio matt.curcio...@gmail.com wrote:
> Greeting all,
> I am having a little trouble finding the 'right' package that will read in .xls Excel spreadsheets. My Ubuntu base does not seem to have the ability to read them.

For various alternatives see:
http://rwiki.sciviews.org/doku.php?id=tips:data-io:ms_windows&s=excel

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
Re: [R] How to add a new column to a matrix?
Hi,

This should do it. I tried to comment to explain things.

Z.mat <- matrix(c(2,2,2,1,1,1,3,2,1,6,5,4,9,1,1,2,3,2), nrow=6)

# column bind data together
Z.mat <- cbind(Z.mat, c(6,9,8,15,16,17))

# add names to the 2 dimensions of Z.mat
# the first element of the list is the row names, left as empty
# the second element is the column names
# 'letters' is a built in vector of the lower case letters of
# the Latin alphabet
dimnames(Z.mat) <- list(NULL, letters[1:4])

# Another way would be to use
colnames(Z.mat) <- letters[1:4]

# For documentation see especially
?cbind
?dimnames

Hope that helps,

Josh

On Sat, Oct 9, 2010 at 8:16 PM, Lakshmi Kastury skast...@students.poly.edu wrote:
> Hi - I am a beginner to the R language. I have written the following matrix:
>
> Z.mat=matrix(c(2,2,2,1,1,1,3,2,1,6,5,4,9,1,1,2,3,2), nrow=6)
>
> I would like to add a 4th column consisting of: 6, 9, 8, 15, 16, 17
>
> I would also like to name each column a, b, c, d as well.
>
> Thanks!

--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/
[R] Mapping the coordinates!
Hello,

I have a series of coordinates (latitudes and longitudes), each associated with one or several codes (from 1 to 28). I used the function points(latitude, longitude) to transfer them to a pre-prepared map.

1- How can I automatically add the codes (1-28) to the map as well?

2- Moreover, several codes often share identical coordinates. Which function avoids overlapping of the codes on the map?

3- I want to draw a closed line around some geographical areas to define the habitats.

Your help in any way (pointers to manuals, code, etc.) is appreciated.

All the best,
Mehdi
Re: [R] How to add a new column to a matrix?
On Oct 9, 2010, at 11:16 PM, Lakshmi Kastury wrote:

> Hi - I am a beginner to the R language. I have written the following matrix:
>
> Z.mat=matrix(c(2,2,2,1,1,1,3,2,1,6,5,4,9,1,1,2,3,2), nrow=6)
>
> I would like to add a 4th column consisting of: 6, 9, 8, 15, 16, 17

?cbind

> I would also like to name each column a, b, c, d as well.

The help page for matrix seems to be perfectly clear on this point.

?matrix

--
David

> Thanks!
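For readers following the ?matrix pointer, here is a minimal sketch of the route that help page describes: supplying the column names through the dimnames argument when the matrix is built. `Z.new` is a made-up name, and rebuilding the matrix from its values is just one way to use that argument, not necessarily what the reply had in mind.

```r
Z.mat <- matrix(c(2,2,2,1,1,1,3,2,1,6,5,4,9,1,1,2,3,2), nrow = 6)

# matrix() fills column by column, so appending six more values to the
# existing ones yields a fourth column; dimnames takes a list of
# (row names, column names), with NULL meaning "no row names".
Z.new <- matrix(c(Z.mat, 6, 9, 8, 15, 16, 17),
                nrow = 6,
                dimnames = list(NULL, c("a", "b", "c", "d")))

Z.new[, "d"]   # the new column, now addressable by name
```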
Re: [R] read.table issue
The problem is that you have an unbalanced quote (') in your input. You need to specify quote = '' in read.table:

> tempTxt <- "103272;Canara Robeco Fortune '94;30.07;30.07;30.75;02-Apr-2007
+ "
> read.table(textConnection(tempTxt), sep=';', quote = '')
      V1                        V2    V3    V4    V5          V6
1 103272 Canara Robeco Fortune '94 30.07 30.07 30.75 02-Apr-2007

The unbalanced quote is the ' in '94 in the string.

On Sat, Oct 9, 2010 at 10:05 PM, Santosh Srinivas santosh.srini...@gmail.com wrote:
> Dear R-Group,
>
> I am getting this error message incomplete final line found by readTableHeader in the code below. It seems to me that the error message is because of quote in the text data. Is there any easy way to handle this? Or should I do a substitute.
>
> > tempTxt <- "100589;Canara Robeco Expo-Income Plan;18.92;18.92;19.35;02-Apr-2007
> + "
> > read.table(textConnection(tempTxt), sep=';')
>       V1                             V2    V3    V4    V5          V6
> 1 100589 Canara Robeco Expo-Income Plan 18.92 18.92 19.35 02-Apr-2007
> > tempTxt <- "103272;Canara Robeco Fortune '94;30.07;30.07;30.75;02-Apr-2007
> + "
> > read.table(textConnection(tempTxt), sep=';')
> Error in read.table(textConnection(tempTxt), sep = ";") :
>   incomplete final line found by readTableHeader on 'tempTxt'
>
> Thanks,
> Santosh

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
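The same fix can be tried without a text connection. This is a minimal sketch, assuming an R version in which read.table() accepts a text= argument; the names `line` and `fund` are made up for illustration.

```r
# One record from the thread; the apostrophe in '94 is data, not a quote.
line <- "103272;Canara Robeco Fortune '94;30.07;30.07;30.75;02-Apr-2007"

# quote = "" switches off quote processing, so ' is read literally.
fund <- read.table(text = line, sep = ";", quote = "",
                   stringsAsFactors = FALSE)

fund$V2      # the fund name, apostrophe intact
ncol(fund)   # six fields, as in the input
```

Turning quoting off applies to the whole input, so this only suits data in which no field relies on quotes to protect an embedded separator.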
Re: [R] Mapping the coordinates!
Mehdi Zarrei gagzar...@yahoo.com [Sun, Oct 10, 2010 at 06:11:23AM CEST]:
> Hello,
>
> I have a series of coordinates (latitudes and longitudes) each one/several associated to a code (from 1 to 28). I used function points (latitude, longitudes) to transfer them to a per-prepared map.
>
> 1- I wonder how I might be able to automatically add codes (1-28) to the map too?

Type ?text at the R prompt.

> 2-Moreover, mostly there are a few codes from the identical coordinates. What is the function to avoid overlapping of codes on the map?

jitter() adds some noise; I don't know if this is sufficient for you.

> 3- I want to draw closed line around some geographical areas to define the habitats.

If you search for "convex hull" on rseek.org, you may find something relevant for you. I did this and the third result was
http://addictedtor.free.fr/graphiques/RGraphGallery.php?graph=61

--
Johannes Hüsing
mailto:johan...@huesing.name
http://derwisch.wikidot.com

There is something fascinating about science. One gets such wholesale returns of conjecture from such a trifling investment of fact.
(Mark Twain, "Life on the Mississippi")
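The three suggestions above can be combined in a few lines of base graphics. This is only a sketch with invented data: the vectors `lon`, `lat` and `code` and the jitter amount are assumptions, and a plain scatter plot stands in for the pre-prepared map. chull() is used for the convex hull as one possible way to get a closed outline; it is not necessarily what the linked gallery example does.

```r
set.seed(1)  # jitter() is random; fix the seed so the plot is repeatable

# Made-up coordinates; codes 1 and 2 share the same location.
lon  <- c(50.1, 50.1, 51.3, 52.0, 51.2, 50.6)
lat  <- c(35.2, 35.2, 36.0, 35.1, 34.4, 35.6)
code <- 1:6

plot(lon, lat, pch = 16, xlab = "Longitude", ylab = "Latitude")

# 1 and 2: text() writes each code near its point; jittering the label
# positions keeps codes at identical coordinates from overprinting.
text(jitter(lon, amount = 0.05), jitter(lat, amount = 0.05),
     labels = code, pos = 3)

# 3: chull() returns the indices of the points on the convex hull, and
# polygon() joins them into a closed outline.
hull <- chull(lon, lat)
polygon(lon[hull], lat[hull], border = "red")
```

A convex hull always encloses every point passed to it, so outlining one habitat means calling chull() on just the points that belong to that habitat.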
Re: [R] Loss of precision in read.csv.
On Sat, Oct 9, 2010 at 2:38 PM, Joshua Wiley jwiley.ps...@gmail.com wrote:
> Hi Steven,
>
> As near as I can tell, no precision is lost. R is just being courteous and not excessively filling our consoles. Try:
>
> print(airports[1, "latitude_deg"], digits = 22)
>
> which is the most digits R will print (although internally it can store more I believe).

Dr. Heiberger was kind enough to point out to me that the maximum is 53 binary digits, as stated in the R FAQ 7.31:
http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f

Slides by the same author from the recent UseR! 2010 conference also provide further explanation:
http://user2010.org/slides/Heiberger.pdf

One library that allows further precision is Rmpfr, based on:
http://www.mpfr.org/

To give a small example, borrowing the sprintf() display from Dr. Heiberger's slides:

> library(Rmpfr)
> sprintf("%+17.17f", 2/3)
[1] "+0.66666666666666663"
> mpfr(2, 260)/3
1 'mpfr' number of precision 260 bits
[1] 0.6685

My sincerest apologies for the previous misinformation.

Josh

<snip>
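The difference between what R displays and what it stores can be checked in base R, without Rmpfr. This is a minimal sketch; `x` and `shown` are made-up names, and it assumes the default setting of options(digits = 7).

```r
x <- 2/3

print(x)                 # default display: 7 significant digits
print(x, digits = 17)    # more of the stored value
sprintf("%.17f", x)      # the same idea via C-style formatting

# A double carries 53 binary digits in its significand.
.Machine$double.digits

# The short display form is not the stored value: parsing the 7-digit
# text gives a different number from the one held in memory.
shown <- as.numeric(format(x))
shown == x               # FALSE
```

The console is rounding only for display; the stored double is untouched, which is why asking for more digits reveals them.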