Re: [R] [R-pkgs] Release of ess 0.0.1
> * Jorge Cimentada <pvzragn...@tznvy.pbz> [2017-11-09 00:02:53 +0100]:
>
> I'm happy to announce the release of ess 0.0.1 a package designed to
> download data from the European Social Survey

Given the existence of ESS (Emacs Speaks Statistics -
https://ess.r-project.org/) the package name "ess" seems unfortunate.

--
Sam Steingold (http://sds.podval.org/) on darwin Ns 10.3.1504
http://steingoldpsychology.com http://www.childpsy.net http://iris.org.il http://mideasttruth.com http://thereligionofpeace.com https://jihadwatch.org
MS Windows: error: the operation completed successfully.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] help with binom.power
> * Bert Gunter othagre.4...@tznvy.pbz [2015-08-17 10:27:58 -0700]:
>
> qbinom(.025,1000,.001,lower=FALSE)

I don't think this is what I need.
I am looking for an inverse of binom.confint.
Sorry that my question was not clear.

--
Sam Steingold (http://sds.podval.org/) on darwin Ns 10.3.1348
http://www.childpsy.net/ http://islamexposedonline.com http://jihadwatch.org http://iris.org.il http://dhimmi.org http://americancensorship.org
Genius is immortal, but morons live longer.
[R] strsplit with a vector split argument
Hi,

I find this behavior unexpected:

--8---cut here---start-8---
> strsplit(c("a,b;c","d;e,f"),c(",",";"))
[[1]]
[1] "a"   "b;c"

[[2]]
[1] "d"   "e,f"
--8---cut here---end---8---

I thought that it should be identical to this:

--8---cut here---start-8---
> strsplit(c("a,b;c","d;e,f"),"[,;]")
[[1]]
[1] "a" "b" "c"

[[2]]
[1] "d" "e" "f"
--8---cut here---end---8---

Is this a bug or did I misunderstand the docs?
Thanks!

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 13.04 (raring) X 11.0.11303000
http://www.childpsy.net/ http://www.memritv.org http://truepeace.org http://camera.org http://openvotingconsortium.org http://palestinefacts.org
Experience comes with debts.
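A minimal sketch of the two calls, assuming the inputs are the strings "a,b;c" and "d;e,f": strsplit() recycles a vector `split` along `x`, so each string gets its own delimiter, while a single character class applies both delimiters to every string. The names by.element and by.class are illustrative only.

```r
x <- c("a,b;c", "d;e,f")

# split is recycled along x: "," is used for x[1] and ";" for x[2]
by.element <- strsplit(x, c(",", ";"))

# one regular expression (a character class) splits every element on either delimiter
by.class <- strsplit(x, "[,;]")

print(by.element)
print(by.class)
```

So the first result is documented behavior rather than a bug; the character class is the form to use when every delimiter should apply to every string.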
[R] promise already under evaluation
Hi,

I asked this question on SO but got no answers:
http://stackoverflow.com/questions/17310825/r-promise-already-under-evaluation

I understand that you are probably sick and tired of answering the same question again, but I am still getting the error discussed in several other questions:

  promise already under evaluation: recursive default argument reference or earlier problems?

even though I did follow the cumbersome advice of prepending ".":

--8---cut here---start-8---
show.large.objects.threshold <- 10
show.large.objects.exclude <- c("closure")
show.large.objects <- function (.envir = sys.frame(),
                                threshold = show.large.objects.threshold,
                                exclude = show.large.objects.exclude) {
  for (n in print(ls(.envir, all.names = TRUE)))
    tryCatch({
      o <- get(n,envir = .envir)
      s <- object.size(o)
      if (s > threshold && !(typeof(o) %in% exclude)) {
        cat(n,": ")
        print(s,units="auto")
      }
    }, error = function(e) { cat("n=",n,"\n"); print(e) })
}
show.large.objects.stack <- function (threshold = show.large.objects.threshold,
                                      skip.levels = 1,# do not examine the last level - this function
                                      exclude = show.large.objects.exclude) {
  for (level in 1:(sys.nframe()-skip.levels)) {
    cat("*** show.large.objects.stack(",level,") ")
    print(sys.call(level))
    show.large.objects(.envir = sys.frame(level))
  }
}
--8---cut here---end---8---

but I still get errors:

--8---cut here---start-8---
> f <- function () { c <- 1:1e7; d <- 1:1e6; print(system.time(show.large.objects.stack())) }
> f()
*** show.large.objects.stack( 1 ) f()
[1] "c" "d"
c : 38.1 Mb
d : 3.8 Mb
*** show.large.objects.stack( 2 ) print(system.time(show.large.objects.stack()))
[1] "..." "x"
n= ...
simpleError in get(n, envir = .envir): argument "..." is missing, with no default
n= x
simpleError in get(n, envir = .envir): promise already under evaluation: recursive default argument reference or earlier problems?
*** show.large.objects.stack( 3 ) system.time(show.large.objects.stack())
[1] "expr"    "gcFirst" "ppt"     "time"
n= expr
simpleError in get(n, envir = .envir): promise already under evaluation: recursive default argument reference or earlier problems?
      user     system    elapsed
0 (0.00ms) 0 (0.00ms) 0.002 (2.00ms)
--8---cut here---end---8---

So, what am I still doing wrong?

Do I really need the "." in .envir?
Why do I get the [[argument "..." is missing, with no default]] error?
Why do I get the [[promise already under evaluation]] error?
What is the right way to pass threshold and exclude from show.large.objects.stack to show.large.objects?

Thanks!

PS. I would prefer an answer on SO, but please feel free to reply using any venue you like and I will copy your explanation to the other venues.

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 13.04 (raring) X 11.0.11303000
http://www.childpsy.net/ http://iris.org.il http://mideasttruth.com http://honestreporting.com http://openvotingconsortium.org
Linux - find out what you've been missing while you've been rebooting Windows.
Re: [R] promise already under evaluation
> * Sam Steingold f...@tah.bet [2013-07-03 11:33:47 -0400]:
>
> Hi, I asked this question on SO but got no answers:
> http://stackoverflow.com/questions/17310825/r-promise-already-under-evaluation

Backlin explained on SO that the errors are to be expected: "..." is a formal argument which was not supplied, and "expr" and "x" were actually being evaluated at the time of the get() call.
The bottom line is that I must catch and ignore errors.

The remaining problem is: how do I pass the same arguments down? e.g.,

--8---cut here---start-8---
f <- function (... verbose=FALSE ...) { ... }
g <- function (... verbose=FALSE ...) { ... f(... verbose=verbose ...) ... }
--8---cut here---end---8---

results in "promise already under evaluation" (and, yes, I do understand why).

Is there anything better than

--8---cut here---start-8---
f <- function ( ... f.verbose=FALSE ... ) { ... }
g <- function ( ... g.verbose=FALSE ... ) { ... f(... f.verbose=g.verbose ...) ... }
--8---cut here---end---8---

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 13.04 (raring) X 11.0.11303000
http://www.childpsy.net/ http://www.memritv.org http://mideasttruth.com http://honestreporting.com http://think-israel.org http://jihadwatch.org
Incorrect time synchronization.
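A minimal sketch, not taken from the thread, of where base R draws the line. The functions f, g and h are hypothetical. Passing an argument down under the same name from the body of the caller works; what triggers the error is a default that refers to its own name, because the default is evaluated inside the function, where that name is the very promise being forced.

```r
f <- function(verbose = FALSE) verbose

# Same name passed down explicitly in the body: the right-hand `verbose`
# is g's own argument, so nothing is self-referential.
g <- function(verbose = FALSE) f(verbose = verbose)

# Self-referential default: forcing `verbose` evaluates `verbose` again.
h <- function(verbose = verbose) verbose

ok  <- g(TRUE)
msg <- tryCatch(h(), error = function(e) conditionMessage(e))
print(ok)
print(msg)
```

Under that assumption, renaming arguments (f.verbose, g.verbose) is only needed when the same-named value has to appear in a default rather than in an explicit call.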
[R] cedta decided 'igraph' wasn't data.table aware
Hi,

what does this mean?

--8---cut here---start-8---
> graph <- graph.data.frame(merged[!v,], vertices=ve, directed=FALSE)
cedta decided 'igraph' wasn't data.table aware
cedta decided 'igraph' wasn't data.table aware
cedta decided 'igraph' wasn't data.table aware
cedta decided 'igraph' wasn't data.table aware
cedta decided 'igraph' wasn't data.table aware
--8---cut here---end---8---

`merged' and `ve' are `data.table' objects, and thus `data.frame' objects too.
The igraph function graph.data.frame accepts data.frame as the first argument.
The igraph maintainers say that it is not coming from igraph.

thanks.

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.10 (quantal) X 11.0.1130
http://www.childpsy.net/ http://www.PetitionOnline.com/tap12009/ http://memri.org http://thereligionofpeace.com http://jihadwatch.org
Growing Old is Inevitable; Growing Up is Optional.
[R] str on large data.frame is slow on factors with many levels
str() takes 2+ minutes to print

--8---cut here---start-8---
'data.frame': 9445743 obs. of 25 variables:
 $ share.id: Factor w/ 1641168 levels "387059b61ffef5cf",..: 7 118 118 209 242 242 254 254 263 291 ...
 ...
--8---cut here---end---8---

pausing for tens of seconds on each factor variable that has a lot of levels.
Why?

(R version 2.15.3 (2013-03-01) -- "Security Blanket"
Platform: x86_64-pc-linux-gnu (64-bit))

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.10 (quantal) X 11.0.1130
http://www.childpsy.net/ http://pmw.org.il http://palestinefacts.org http://mideasttruth.com http://americancensorship.org http://camera.org
Garbage In, Gospel Out
Re: [R] !0 + !0 == !0 - !0
> * Bert Gunter thagre.ore...@trar.pbz [2013-03-17 20:30:56 -0700]:
>
> I also think it fair to say that all (??) languages have these sorts
> of malapropisms due to operator precedence.

Except for those languages which do _not_ have operator precedence.
Like, e.g., Lisp.

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://openvotingconsortium.org http://jihadwatch.org http://palestinefacts.org http://mideasttruth.com http://camera.org
DRM access management == prison freedom management.
Re: [R] select rows with identical columns from a data frame
> * Bert Gunter thagre.ore...@trar.pbz [2013-01-19 22:26:46 -0800]:
>
> But David W. and Bill Dunlap gave you solutions that also work and are
> much faster, no?!

Yes, indeed, and I am now using David's solution as it is fast (enough), simple and concise.

Thanks a lot to David, Bill, Rui, and arun for their answers (to this question, my many previous questions, and, I hope, my future questions in advance)!

> On Sat, Jan 19, 2013 at 9:41 PM, Sam Steingold s...@gnu.org wrote:
>>> * Rui Barradas ehvconeen...@fncb.cg [2013-01-18 21:02:20 +]:
>>>
>>> Try the following.
>>>
>>> complete.cases(f) & apply(f, 1, function(x) all(x == x[1]))
>>
>> thanks, this works, but is horribly slow (dim(f) is 766,950x2)

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://americancensorship.org http://palestinefacts.org http://thereligionofpeace.com http://camera.org http://think-israel.org
Lisp is a way of life. C is a way of death.
Re: [R] select rows with identical columns from a data frame
> * Rui Barradas ehvconeen...@fncb.cg [2013-01-18 21:02:20 +]:
>
> Try the following.
>
> complete.cases(f) & apply(f, 1, function(x) all(x == x[1]))

thanks, this works, but is horribly slow (dim(f) is 766,950x2)

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://truepeace.org http://palestinefacts.org http://thereligionofpeace.com http://honestreporting.com http://ffii.org
usually: can't pay == don't buy. software: can't buy == don't pay
[R] select rows with identical columns from a data frame
I have a data frame with several columns.
I want to select the rows with no NAs (as with complete.cases) and all columns identical.
E.g., for

--8---cut here---start-8---
> f <- data.frame(a=c(1,NA,NA,4),b=c(1,NA,3,40),c=c(1,NA,5,40))
> f
   a  b  c
1  1  1  1
2 NA NA NA
3 NA  3  5
4  4 40 40
--8---cut here---end---8---

I want the vector TRUE,FALSE,FALSE,FALSE selecting just the first row because there all 3 columns are the same and none is NA.

thanks!

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://memri.org http://mideasttruth.com http://honestreporting.com http://pmw.org.il http://iris.org.il
All extremists should be taken out and shot.
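A minimal column-wise sketch for this question, offered as one possible approach and not as the solution that was eventually adopted in the thread. It compares every column with the first one and combines the results with `&`, so it works on whole columns instead of calling a function per row. The name keep is illustrative.

```r
f <- data.frame(a = c(1, NA, NA, 4), b = c(1, NA, 3, 40), c = c(1, NA, 5, 40))

# For each column, a logical vector "equals the first column" (NA where either is NA)
same.as.first <- lapply(f, function(col) col == f[[1]])

# Rows where all comparisons hold; complete.cases() turns the remaining NAs into FALSE
keep <- complete.cases(f) & Reduce("&", same.as.first)
print(keep)
```

Because FALSE & NA is FALSE in R, rows containing an NA drop out without a separate subsetting step.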
Re: [R] select rows with identical columns from a data frame
I can do

  Reduce("==",f[complete.cases(f),])

but that creates an intermediate data frame which I would love to avoid (to save memory).

> * Sam Steingold f...@tah.bet [2013-01-18 15:53:21 -0500]:
>
> I have a data frame with several columns.
> I want to select the rows with no NAs (as with complete.cases)
> and all columns identical.
> E.g., for
>
> > f <- data.frame(a=c(1,NA,NA,4),b=c(1,NA,3,40),c=c(1,NA,5,40))
> > f
>    a  b  c
> 1  1  1  1
> 2 NA NA NA
> 3 NA  3  5
> 4  4 40 40
>
> I want the vector TRUE,FALSE,FALSE,FALSE selecting just the first
> row because there all 3 columns are the same and none is NA.
>
> thanks!

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://truepeace.org http://iris.org.il http://www.PetitionOnline.com/tap12009/ http://ffii.org http://jihadwatch.org
War doesn't determine who's right, just who's left.
[R] non-consing count
Hi,

to count vector elements with some property, the standard idiom seems to be length(which):

--8---cut here---start-8---
x <- c(1,1,0,0,0)
count.0 <- length(which(x == 0))
--8---cut here---end---8---

however, this approach allocates and discards 2 vectors: a logical vector of length=length(x) and an integer vector in which.
is there a cheaper alternative?

Thanks!

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://iris.org.il http://honestreporting.com http://jihadwatch.org http://pmw.org.il http://www.PetitionOnline.com/tap12009/
War doesn't determine who's right, just who's left.
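A small sketch of one cheaper idiom, assuming plain base R: sum() over the comparison counts the TRUE values directly. It still allocates the logical vector, but it skips the integer index vector that which() builds.

```r
x <- c(1, 1, 0, 0, 0)

# TRUE counts as 1 and FALSE as 0, so the sum is the number of matches
count.0 <- sum(x == 0)
print(count.0)
```

If x may contain NA, sum(x == 0, na.rm = TRUE) gives the same count that length(which(x == 0)) would.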
[R] vectorization modifying globals in functions
I have the following code:

--8---cut here---start-8---
d <- rep(10,10)
for (i in 1:100) {
  a <- sample.int(length(d), size = 2)
  if (d[a[1]] >= 1) {
    d[a[1]] <- d[a[1]] - 1
    d[a[2]] <- d[a[2]] + 1
  }
}
--8---cut here---end---8---

it does what I want, i.e., modifies vector d 100 times.

Now, if I want to repeat this 1e6 times instead of 1e2 times, I want to vectorize it for speed, so I do this:

--8---cut here---start-8---
update <- function (i) {
  a <- sample.int(n.agents, size = 2)
  if (d[a[1]] >= delta) {
    d[a[1]] <- d[a[1]] - 1
    d[a[2]] <- d[a[2]] + 1
  }
  entropy(d, unit="log2")
}
system.time(entropy.history <- sapply(1:1e6,update))
--8---cut here---end---8---

however, the global d is not modified; apparently update modifies the local copy.

so,
1. is there a way for a function to modify a global variable?
2. how would you vectorize this loop?

thanks!

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://honestreporting.com http://pmw.org.il http://www.PetitionOnline.com/tap12009/
A number problem solved with floats turns into 1.9998 problems.
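On question 1, a minimal sketch assuming base R only: the superassignment operator `<<-` assigns to the variable found in an enclosing environment instead of creating a local copy. The function transfer() and its arguments are hypothetical stand-ins for the update() step above.

```r
d <- rep(10, 10)

# Move one unit from d[from] to d[to], modifying the enclosing d in place
transfer <- function(from, to) {
  if (d[from] >= 1) {
    d[from] <<- d[from] - 1
    d[to]   <<- d[to] + 1
  }
  invisible(NULL)
}

transfer(1, 2)
print(d)
```

This changes which variable is updated, not the speed: sapply() over such a function is still a loop in which every step depends on the previous state of d.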
[R] lattice::xyplot file output
Hi,

When I was using the regular plot() function, I added this:

--8---cut here---start-8---
if (!is.null(file)) {
  do.call(tools::file_ext(file),list(file = file))
  on.exit(dev.off())
  cat("writing",file,"\n")
}
--8---cut here---end---8---

to the beginning of each of my functions which plotted anything.

Now that I am using lattice::xyplot to plot multiple lines, the above code does NOT result in the plot being written to a file. Why?

I tried passing file=file to xyplot but that appears to be ignored too.

so, how do I tell lattice::xyplot to write charts in png files?

thanks!

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://honestreporting.com http://jihadwatch.org http://think-israel.org http://mideasttruth.com
cogito cogito ergo cogito sum
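A minimal sketch of the usual explanation (R FAQ 7.22): lattice functions return a "trellis" object and only draw when that object is printed, which happens automatically at the prompt but not inside a function. The wrapper write.chart() and the toy data are hypothetical; pdf() is used here only so the sketch runs without extra system libraries, and png() or the file-extension dispatch from the message above can take its place.

```r
library(lattice)

# Hypothetical wrapper: open a file device if asked, then draw explicitly
write.chart <- function(file = NULL) {
  if (!is.null(file)) {
    pdf(file)
    on.exit(dev.off())
  }
  p <- xyplot(y ~ x, data = data.frame(x = 1:10, y = (1:10)^2), type = "l")
  print(p)   # without print() nothing is drawn when called inside a function
  invisible(file)
}

out <- write.chart(tempfile(fileext = ".pdf"))
```

Base plot() draws as a side effect, which is why the same device-opening code worked before the switch to lattice.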
[R] axes labeling
Is it possible to control formatting of the numbers which go along the axes in plots?
e.g.

  plot(x=1:1000000,y=1:1000000)

will label the X axis as 0e+00, 2e+05 &c.
I want that to read 0, 200k, 400k &c.
I know of the function axis(), but it offers far too much control for this simple task.

thanks.

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://www.memritv.org http://jihadwatch.org http://pmw.org.il http://americancensorship.org http://think-israel.org
Why do we want intelligent terminals when there are so many stupid users?
Re: [R] axes labeling
> * David L Carlson qpney...@gnzh.rqh [2012-12-20 13:58:00 -0600]:
>
> It is possible, but only by using axis() since you can specify axis
> breaks in a plot command, but not the labels. You can ignore most of
> the axis() options so the commands are pretty simple:
>
> plot(x=c(1, 1000000), y=c(1, 1000000), xlab="x", ylab="y", xaxt="n", yaxt="n", las=2)
> pos <- c(0, 200000, 400000, 600000, 800000, 1000000)
> lbl <- c("0", "200k", "400k", "600k", "800k", "1000k")
> axis(1, pos, lbl)
> axis(2, pos, lbl)

That's what I meant when I said "too much control".
I am happy with the way R selects positions.
All I want is a say in the way R formats those positions.

Think in terms of 1000000 being a variable.
To use axis, I will need to write a map from variable range to axis tick positions first, and then sapply my formatting to the positions.

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://think-israel.org http://iris.org.il http://mideasttruth.com http://www.memritv.org http://memri.org
All extremists should be taken out and shot.
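A minimal sketch of one way to keep R's own tick positions and change only the labels, assuming base graphics: axTicks() returns the positions R would have used, so no map from data range to positions has to be written by hand. The formatter fmt.k() is a hypothetical helper, and the 0 to 1e6 range is an assumption matching the 200k, 400k labels discussed above.

```r
# Hypothetical formatter: 0 stays "0", everything else is shown in thousands
fmt.k <- function(at) ifelse(at == 0, "0", paste0(at / 1000, "k"))

pdf(NULL)                          # throwaway device so the sketch runs anywhere
x <- seq(0, 1e6, length.out = 101)
plot(x, x, xaxt = "n")             # suppress the default x axis
at <- axTicks(1)                   # tick positions R chose for the x axis
axis(1, at = at, labels = fmt.k(at))
invisible(dev.off())

print(fmt.k(c(0, 2e5, 4e5)))
```

The range can be any variable; only the formatter needs to know how the labels should read.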
[R] sitools: bug: f2si(0)=
Jonas, I think f2si(0) should be 0, not as it is now. Thanks.

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://ffii.org http://mideasttruth.com http://thereligionofpeace.com http://iris.org.il http://truepeace.org
Type louder, please.
Re: [R] the value of the last expression
> * Richard M. Heiberger e...@grzcyr.rqh [2012-02-09 21:48:50 -0500]:
>
> .Last.value

Thanks; it worked for a while, but not anymore:
http://stat.ethz.ch/R-manual/R-patched/library/base/html/Last.value.html

--8---cut here---start-8---
> gamma(1:15)
 [1] 1 1 2 6 24 120
 [7] 720 5040 40320 362880 3628800 39916800
[13] 479001600 6227020800 87178291200
> z <- .Last.value
> z
NULL
--8---cut here---end---8---

could my .Rprofile be at fault?

--8---cut here---start-8---
## breaks ess
## options(error = utils::recover)
options(max.print = 100, repos = c(CRAN = "http://lib.stat.cmu.edu/R/CRAN/"))
library(compiler)
compiler::enableJIT(3)
compiler::compilePKGS(1)
--8---cut here---end---8---

> On Thu, Feb 9, 2012 at 9:44 PM, Sam Steingold s...@gnu.org wrote:
>> Is there an analogue of common lisp "*" variable which contains the
>> value of the last expression?
>> E.g., in lisp:
>>
>> (+ 1 2)
>> 3
>> *
>> 3
>>
>> I wish I could recover the value of the last expression without
>> re-evaluating it.

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://mideasttruth.com http://thereligionofpeace.com http://www.memritv.org http://iris.org.il http://americancensorship.org
Diplomacy is the art of saying nice doggy until you can find a nice rock.
[R] sum portions of a vector
How do I sum portions of a vector into another vector?
E.g., for

--8---cut here---start-8---
vec <- 1:10
breaks <- c(3,8,10)
--8---cut here---end---8---

I want to get a vector of length 3 with content

--8---cut here---start-8---
6 = 1+2+3
30 = 4+5+6+7+8
19 = 9+10
--8---cut here---end---8---

Obviously, I could write a loop, but I would rather have a vectorized version.

Thanks!

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://palestinefacts.org http://ffii.org http://jihadwatch.org http://www.PetitionOnline.com/tap12009/
One can find Holy Grail or Higgs boson, but not the second sock.
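A minimal vectorized sketch for this question, assuming breaks holds increasing end positions of the segments: the running total at each break minus the running total at the previous break is the sum of that segment. The name sums is illustrative.

```r
vec <- 1:10
breaks <- c(3, 8, 10)

# cumsum(vec)[breaks] is the total up to each break; diff() with a leading 0
# turns those totals into per-segment sums
sums <- diff(c(0, cumsum(vec)[breaks]))
print(sums)
```

For very long or floating-point vectors the subtraction of large running totals can lose precision, in which case a grouped sum over segment labels may be preferable.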
Re: [R] the value of the last expression
> * arun fznegcvax...@lnubb.pbz [2012-12-10 11:22:03 -0800]:
>
> It is working for me.

I do not claim to have found a bug.
I am merely pleading for help figuring out what could have gone wrong.

.Last.value works when I first start R under Emacs/ESS.
Then it stops working.
I can't figure out when or why...

--8---cut here---start-8---
> sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
[1] C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  compiler  methods
[8] base

loaded via a namespace (and not attached):
[1] tools_2.15.2
--8---cut here---end---8---

> sessionInfo()
> R version 2.15.0 (2012-03-30)
> Platform: x86_64-pc-linux-gnu (64-bit)
>
> locale:
>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>  [7] LC_PAPER=C                 LC_NAME=C
>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] matrixStats_0.6.2 stringr_0.6       reshape_0.8.4     plyr_1.7.1
>
> loaded via a namespace (and not attached):
> [1] R.methodsS3_1.4.2 tools_2.15.0
>
> A.K.
>
> - Original Message -
> From: Sam Steingold s...@gnu.org
> To: r-help@r-project.org; Richard M. Heiberger r...@temple.edu
> Cc:
> Sent: Monday, December 10, 2012 2:13 PM
> Subject: Re: [R] the value of the last expression
>
> * Richard M. Heiberger e...@grzcyr.rqh [2012-02-09 21:48:50 -0500]:
>
> .Last.value
>
> Thanks; it worked for a while, but not anymore:
> http://stat.ethz.ch/R-manual/R-patched/library/base/html/Last.value.html
>
> gamma(1:15)
>  [1] 1 1 2 6 24 120
>  [7] 720 5040 40320 362880 3628800 39916800
> [13] 479001600 6227020800 87178291200
> z <- .Last.value
> z
> NULL
>
> could my .Rprofile be at fault?
>
> ## breaks ess
> ## options(error = utils::recover)
> options(max.print = 100, repos = c(CRAN = "http://lib.stat.cmu.edu/R/CRAN/"))
> library(compiler)
> compiler::enableJIT(3)
> compiler::compilePKGS(1)
>
> On Thu, Feb 9, 2012 at 9:44 PM, Sam Steingold s...@gnu.org wrote:
> Is there an analogue of common lisp "*" variable which contains the
> value of the last expression?
> E.g., in lisp:
>
> (+ 1 2)
> 3
> *
> 3
>
> I wish I could recover the value of the last expression without
> re-evaluating it.

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://americancensorship.org http://pmw.org.il http://www.memritv.org http://iris.org.il http://jihadwatch.org http://ffii.org
If it has syntax, it isn't user friendly.
[R] list to matrix?
How do I convert a list to a matrix?

--8---cut here---start-8---
list(c(50000, 101), c(1e+05, 46), c(150000, 31), c(2e+05, 17), c(250000, 19),
     c(3e+05, 11), c(350000, 12), c(4e+05, 25), c(450000, 19), c(5e+05, 16))
as.matrix(a)
      [,1]
 [1,] Numeric,2
 [2,] Numeric,2
 [3,] Numeric,2
 [4,] Numeric,2
 [5,] Numeric,2
 [6,] Numeric,2
 [7,] Numeric,2
 [8,] Numeric,2
 [9,] Numeric,2
--8---cut here---end---8---

thanks!

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://palestinefacts.org http://dhimmi.com http://jihadwatch.org http://www.PetitionOnline.com/tap12009/ http://memri.org
Rhinoceros has poor vision, but, due to his size, it's not his problem.
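A minimal sketch, assuming every list element is a numeric vector of the same length: do.call(rbind, ...) stacks the elements as rows, whereas as.matrix() on a list keeps each element as a single list cell (the "Numeric,2" entries above). Only the first three elements are used here to keep the example short.

```r
a <- list(c(50000, 101), c(1e5, 46), c(150000, 31))

# Each list element becomes one row of a numeric matrix
m <- do.call(rbind, a)
print(m)
```

matrix(unlist(a), ncol = 2, byrow = TRUE) gives the same result under the same equal-length assumption.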
Re: [R] aggregate() runs out of memory
* Steve Lianoglou znvyvatyvfg.ubarl...@tznvy.pbz [2012-11-26 19:47:25 -0500]:

On Monday, November 26, 2012, Sam Steingold wrote:
[snip]

there is precisely one country for each id.
i.e., unique(country) is the same as country[1].
thanks a lot for the suggestion!

R> result <- f[, list(min=min(delay), max=max(delay), count=.N,country=country[1L]), by=share.id]

And is it performant?

acceptable.

It just occurred to me that this is even better:

R> setkeyv(f, c("share.id", "delay"))
R> result <- f[, list(min=delay[1L], max=delay[.N], count=.N, country=country[1L]), by=share.id]

this assumes that delays are sorted (like in my example) which, in reality, they are not.

thanks for your help!

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://honestreporting.com http://americancensorship.org http://memri.org http://www.memritv.org
Illiterate? Write today, for free help!
Re: [R] aggregate() runs out of memory
> * Steve Lianoglou znvyvatyvfg.ubarl...@tznvy.pbz [2012-11-27 12:53:23 -0500]:
>
> On Tue, Nov 27, 2012 at 11:29 AM, Sam Steingold s...@gnu.org wrote:
>>> * Steve Lianoglou znvyvatyvfg.ubarl...@tznvy.pbz [2012-11-26 19:47:25 -0500]:
>>> [snip]
>>> It just occurred to me that this is even better:
>>>
>>> R> setkeyv(f, c("share.id", "delay"))
>>> R> result <- f[, list(min=delay[1L], max=delay[.N], count=.N, country=country[1L]), by=share.id]
>>
>> this assumes that delays are sorted (like in my example)
>> which, in reality, they are not.
>
> When you include "delay" in the call to `setkeyv` as I did above, it
> sorts low to high w/in each share.id group.

Ah, but then I would have to _sort_ (~n*log(n)) by delay within each ID group, while all I care about is min/max (~n).

thanks again!

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://think-israel.org http://truepeace.org http://thereligionofpeace.com http://mideasttruth.com http://www.memritv.org
If You Want Breakfast In Bed, Sleep In the Kitchen.
Re: [R] printing difftime summary
this overcomes the summary generation, but not printing:

--8---cut here---start-8---
summary.difftime <- function (v, ...) {
  s <- summary(as.numeric(v), ...)
  r <- as.data.frame(sapply(s,difftime2string),stringsAsFactors=FALSE)
  names(r) <- c("string")
  r[[units(v)]] <- s
  class(r) <- c("data.frame","summary.difftime")
  r
}
print.summary.difftime <- function (sd) print.data.frame(sd)
--8---cut here---end---8---

summary(infl), where infl$delay is a difftime vector, prints

...
     delay
 string:c("492.00 ms", "18.08 min", "1.77 hrs", "8.20 hrs", "8.13 hrs", "6.98 days")
 secs  :c( 0.5, 1085.1, 6370.2, 29534.4, 29254.0, 602949.7)

instead of something like

...
     delay
 Min.:492 ms
 1st Qu.: 18.08 min
 &c

so, how do I arrange for a proper printing of difftime summary as a part of the data frame summary?

> * David Winsemius qjvafrz...@pbzpnfg.arg [2012-11-25 00:50:51 -0800]:
>
> On Nov 24, 2012, at 7:48 PM, Sam Steingold wrote:
>
>>> * David Winsemius qjvafrz...@pbzpnfg.arg [2012-11-23 13:14:17 -0800]:
>>>
>>> See http://cran.r-project.org/doc/FAQ/R-FAQ.html#How-should-I-write-summary-methods_003f
>>
>> --8---cut here---start-8---
>> summary.difftime <- function (v) {
>>   s <- summary(as.numeric(v))
>>   r <- as.data.frame(sapply(s,difftime2string),stringsAsFactors=FALSE)
>>   names(r) <- c("string")
>>   r[[units(v)]] <- s
>>   class(r) <- c("data.frame","summary.difftime")
>>   r
>> }
>> print.summary.difftime <- function (sd) print.data.frame(sd)
>> --8---cut here---end---8---
>>
>> it appears to work for a single vector:
>>
>> --8---cut here---start-8---
>> > r1 <- summary(infl$delay)
>> > r1
>>            string     secs
>> Min.    492.00 ms      0.5
>> 1st Qu. 18.08 min   1085.0
>> Median   1.77 hrs   6370.0
>> Mean     8.20 hrs  29530.0
>> 3rd Qu.  8.12 hrs  29250.0
>> Max.    6.98 days 602900.0
>> > str(r1)
>> Classes 'summary.difftime' and 'data.frame': 6 obs. of 2 variables:
>>  $ string: chr "492.00 ms" "18.08 min" "1.77 hrs" "8.20 hrs" ...
>>  $ secs  :Classes 'summaryDefault', 'table' num [1:6] 4.92e-01 1.08e+03 6.37e+03 2.95e+04 2.92e+04 ...
>> --8---cut here---end---8---
>>
>> but not as a part of data frame:
>>
>> --8---cut here---start-8---
>> > a <- summary(infl)
>> Error in summary.difftime(X[[22L]], ...) :
>>   unused argument(s) (maxsum = 7, digits = 12)
>> --8---cut here---end---8---
>>
>> I guess I should somehow accept a list of options in
>> summary.difftime() and pass them on to the inner call to summary()
>> (or should it be explicitly summary.numeric()?)
>
> In the usual way. If you know that the function will be called with
> arguments from the summary.data.frame function then you should allow
> the argument list to accept them. You can ignore them or provide
> provisions for them. You just can't define your function to have only
> one argument if you expect (as you should since you passed summary a
> dataframe object) that it might be called within summary.data.frame.
>
> This is the argument list for summary.data.frame:
>
> > summary.data.frame
> function (object, maxsum = 7, digits = max(3, getOption("digits") - 3), ...)
>
>> how do I do that?
>
> summary.difftime <- function (v, ... ) {
>
> There are many asked and answered questions on rhelp about how to deal
> with the dots arguments.

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://www.memritv.org http://memri.org http://honestreporting.com http://dhimmi.com http://openvotingconsortium.org
People with a good taste are especially appreciated by cannibals.
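A minimal sketch of the dots mechanism discussed above, using a hypothetical class "delay" in place of difftime so that it does not interfere with any method R itself may define. A method that declares `...` accepts the maxsum= and digits= arguments that summary.data.frame() sends to every column, and can pass them on or ignore them.

```r
# Hypothetical S3 method: `...` absorbs arguments this method does not name
summary.delay <- function(object, ...) {
  # forwarded to summary.default(), which uses digits= and ignores maxsum=
  s <- summary(as.numeric(unclass(object)), ...)
  structure(list(stats = s), class = "summary.delay")
}

x <- structure(c(0.5, 1085, 6370), class = "delay")

# The same extra arguments that caused "unused argument(s)" above
res <- summary(x, maxsum = 7, digits = 12)
print(res$stats[["Median"]])
```

This only covers accepting the arguments; how the result is laid out inside the summary of a whole data frame is decided by summary.data.frame() and its print method, which is the part the message above is still asking about.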
Re: [R] aggregate() runs out of memory
Hi,

> * Steve Lianoglou <znvyvatyvfg.ubarl...@tznvy.pbz> [2012-11-19 13:30:03 -0800]:
>
> For instance, if you want the min and max of `delay` within each group
> defined by `share.id`, and let's assume `infl` is a data.frame, you
> can do something like so:
>
> R> infl <- as.data.table(infl)
> R> setkey(infl, share.id)
> R> result <- infl[, list(min=min(delay), max=max(delay)), by=share.id]

perfect, thanks.

alas, the resulting table does not contain the share.id column.
do I need to add something like "id=unique(share.id)" to the list?

also, if there is a field in the original table infl which only depends
on share.id, how do I add this unique value to the summary?
it appears that "count=unique(country)" in list() does what I need, but
it slows down the process.

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://openvotingconsortium.org http://jihadwatch.org http://thereligionofpeace.com http://palestinefacts.org http://dhimmi.com
Why use Windows, when there are Doors?
Re: [R] aggregate() runs out of memory
hi Steve,

> * Steve Lianoglou <znvyvatyvfg.ubarl...@tznvy.pbz> [2012-11-26 16:08:59 -0500]:
>
> On Mon, Nov 26, 2012 at 3:13 PM, Sam Steingold <s...@gnu.org> wrote:
>> [...]
>> it appears that "count=unique(country)" in list() does what I need, but
>> it slows down the process.
>
> Hmm ... I think it should be there, but I'm having a hard time
> remembering what you want.
>
> Could you please copy paste the output of `(head(infl, 20))` as
> well as an approximation of what the result is that you want.

this prints all the levels for all the factor columns and takes
megabytes.

--8<---------------cut here---------------start------------->8---
> f <- data.frame(id=rep(1:3,4),country=rep(6:8,4),delay=1:12)
> f
   id country delay
1   1       6     1
2   2       7     2
3   3       8     3
4   1       6     4
5   2       7     5
6   3       8     6
7   1       6     7
8   2       7     8
9   3       8     9
10  1       6    10
11  2       7    11
12  3       8    12
> f <- as.data.table(f)
> setkey(f,id)
> delays <- f[,list(min=min(delay),max=max(delay),count=.N,country=unique(country)),by="id"]
> delays
   id min max count country
1:  1   1  10     4       6
2:  2   2  11     4       7
3:  3   3  12     4       8
--8<---------------cut here---------------end--------------->8---

this is still too slow, apparently because of unique.
how do I speed it up?

Thanks.

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://iris.org.il http://ffii.org http://pmw.org.il http://mideasttruth.com
Programming is like sex: one mistake and you have to support it for a lifetime.
Re: [R] printing difftime summary
> * David Winsemius <qjvafrz...@pbzpnfg.arg> [2012-11-26 08:46:35 -0800]:
>
> On Nov 26, 2012, at 7:14 AM, Sam Steingold wrote:
>
>> summary(infl), where infl$delay is a difftime vector, prints
>>
>>      ...  delay
>>  string:c("492.00 ms", "18.08 min", "1.77 hrs", "8.20 hrs", "8.13 hrs", "6.98 days")
>>  secs  :c(      0.5,    1085.1,    6370.2,   29534.4,   29254.0,  602949.7)
>>
>> instead of something like
>>
>>      ...  delay
>>  Min.   :492 ms
>>  1st Qu.: 18.08 min
>>  &c
>>
>> so, how do I arrange for a proper printing of difftime summary as a part
>> of the data frame summary?
>
> If you like a particular format from an existing print method then why
> not look it up and copy the code?
>
> methods(print)

the problem is that I cannot figure out which function prints this:

     ...  delay
 string:c("492.00 ms", "18.08 min", "1.77 hrs", "8.20 hrs", "8.13 hrs", "6.98 days")
 secs  :c(      0.5,    1085.1,    6370.2,   29534.4,   29254.0,  602949.7)

I added cat()s to print.summary.difftime and I do not see them, so it
appears that I have no direct control over how a summary.difftime is
printed as a part of a summary of a data.frame.

--8<---------------cut here---------------start------------->8---
summary.difftime <- function (v, ...) {
  s <- summary(as.numeric(v), ...)
  r <- as.data.frame(sapply(s,difftime2string),stringsAsFactors=FALSE)
  names(r) <- c("string")
  r[[units(v)]] <- s
  class(r) <- c("summary.difftime","data.frame")
  invisible(r)
}
print.summary.difftime <- function (sd, ...) {
  cat("[[[print.summary.difftime]]]\n")
  print(list(...))
  print.data.frame(sd, ...)
}
--8<---------------cut here---------------end--------------->8---

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://palestinefacts.org http://think-israel.org http://www.memritv.org http://openvotingconsortium.org http://mideasttruth.com
The force of gravity doubles when acting on a body on a couch.
Re: [R] aggregate() runs out of memory
Hi,

> * Steve Lianoglou <znvyvatyvfg.ubarl...@tznvy.pbz> [2012-11-26 17:32:21 -0500]:
>
>> [...]
>> > delays <- f[,list(min=min(delay),max=max(delay),count=.N,country=unique(country)),by="id"]
>> > delays
>>    id min max count country
>> 1:  1   1  10     4       6
>> 2:  2   2  11     4       7
>> 3:  3   3  12     4       8
>>
>> this is still too slow, apparently because of unique.
>> how do I speed it up?
>
> I think I'm missing something.
>
> Your call to `min(delay)` and `max(delay)` will return the minimum and
> maximum delays within the particular "id" you are grouping by. I guess
> there must be several values for "country" within each "id" group -- do
> you really want the same min and max values to be replicated as many
> times as there are unique "country"s?

there is precisely one country for each id.
i.e., unique(country) is the same as country[1].
thanks a lot for the suggestion!

> R> result <- f[, list(min=min(delay), max=max(delay),
>                       count=.N,country=country[1L]), by=share.id]

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://thereligionofpeace.com http://pmw.org.il http://honestreporting.com http://americancensorship.org
Why do you never call me back after I scream that I will never talk to you again?!
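The grouped summary from this exchange, put together as one runnable piece. It is a sketch that assumes the data.table package is installed and reuses the toy table f from the thread; taking country[1L] is only valid under the assumption stated above, that each id has exactly one country.

```r
library(data.table)

# Toy data from the thread: 3 ids, one country per id, 12 delays.
f <- data.table(id = rep(1:3, 4), country = rep(6:8, 4), delay = 1:12)
setkey(f, id)

# One row per id. The grouping column comes back as the first column,
# and country[1L] avoids calling unique() once per group.
delays <- f[, list(min = min(delay), max = max(delay),
                   count = .N, country = country[1L]),
            by = id]
print(delays)
```

If a group could hold more than one country, country[1L] would silently pick the first one, so this shortcut is worth a one-off check on the real data.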
Re: [R] printing difftime summary
Thanks a lot - almost there!

--8<---------------cut here---------------start------------->8---
format.summary.difftime <- function(sd, ...) {
  t <- matrix(sd$string)
  rownames(t) <- rownames(sd)
  print(t)
  format(as.table(t))
}
print.summary.difftime <- function (sd, ...) {
  print(format(sd), quote=FALSE)
  invisible(sd)
}
--8<---------------cut here---------------end--------------->8---

this almost works:

--8<---------------cut here---------------start------------->8---
> summary(delays)
             share.id        min               max
 12cf12372b87cce9: 1   NULL:492.00 ms    NULL:492.00 ms
 12cf36060bdb9581: 1   NULL:3.70 min     NULL:21.80 min
 12d2665c906bb232: 1   NULL:20.32 min    NULL:3.26 hrs
 12d2802f1435b4cd: 1   NULL:5.52 hrs     NULL:13.78 hrs
 12d292988f5f8422: 1   NULL:2.81 hrs     NULL:16.20 hrs
 12d29dd2894e2790: 1   NULL:6.95 days    NULL:6.98 days
--8<---------------cut here---------------end--------------->8---

why do I see NULLs?!

--8<---------------cut here---------------start------------->8---
> t <- matrix(sd$string)
> rownames(t) <- rownames(sd)
> t
        [,1]
Min.    "492.00 ms"
1st Qu. "3.70 min"
Median  "20.32 min"
Mean    "5.52 hrs"
3rd Qu. "2.81 hrs"
Max.    "6.95 days"
> as.table(t)
        A
Min.    492.00 ms
1st Qu. 3.70 min
Median  20.32 min
Mean    5.52 hrs
3rd Qu. 2.81 hrs
Max.    6.95 days
> format(as.table(t))
        A
Min.    "492.00 ms"
1st Qu. "3.70 min "
Median  "20.32 min"
Mean    "5.52 hrs "
3rd Qu. "2.81 hrs "
Max.    "6.95 days"
--8<---------------cut here---------------end--------------->8---

> * William Dunlap <jqha...@gvopb.pbz> [2012-11-26 23:02:48 +0000]:
>
> It looks like summary.data.frame(d) calls format(d[[i]]) for i in seq_len(ncol(d))
> and pastes the results together into a "table" object for printing. Hence, write a
> format.summary.difftime if you want objects of class "summary.difftime" (which
> I assume summary.difftime produces) to be formatted as you wish when a difftime
> object is in a data.frame. Once you've written it, have your print.summary.difftime
> call it too.
>
> E.g., with the following methods
>   summary.difftime <- function(x, ...) {
>      ret <- quantile(x, p=(0:2)/2, na.rm=TRUE)
>      class(ret) <- c("summary.difftime", class(ret))
>      ret
>   }
>   format.summary.difftime <- function(x, ...) c(Min.Med.Max = paste(collapse="...", NextMethod("format")))
>   print.summary.difftime <- function(x, ...){ print(format(x), quote=FALSE) ; invisible(x) }
> I get
>   > d <- data.frame(Num=1:5, Date=as.Date("2012-11-26")+(0:4), Delta=diff(as.Date("2012-11-26")+2^(0:5)))
>   > summary(d)
>         Num         Date                    Delta
>    Min.   :1   Min.   :2012-11-26   Min.Med.Max: 1 days... 4 days...16 days
>    1st Qu.:2   1st Qu.:2012-11-27
>    Median :3   Median :2012-11-28
>    Mean   :3   Mean   :2012-11-28
>    3rd Qu.:4   3rd Qu.:2012-11-29
>    Max.   :5   Max.   :2012-11-30
>   > summary(d$Delta)
>                   Min.Med.Max
>   1 days... 4 days...16 days
>
> My summary.difftime inherits from difftime so the format method is not really
> needed, as format.difftime does a reasonable job (except that it does not copy the
> input names to its output). I put it in to show how it gets called.
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
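The point about names can be tried in isolation. The sketch below uses made-up toy values and assumes nothing beyond base R; it re-attaches names after formatting a difftime vector, which is the workaround discussed in this thread. Whether plain format() keeps the names may depend on the R version, so the example does not rely on that either way.

```r
# Toy data; the names mimic the labels summary() puts on its result.
d <- as.difftime(c(Min. = 1, Median = 4, Max. = 16), units = "days")

# Format first, then restore the labels explicitly.
out <- structure(format(d), names = names(d))
print(out, quote = FALSE)
```

summary.data.frame builds its "label:value" cells from the names of each formatted column summary, which would explain why missing names surface as NULL in the printed table.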
Re: [R] printing difftime summary
Looks like

format.summary.difftime <- function(sd, ...)
  structure(sd$string, names=rownames(sd))

does the job.
any reason not to use it?

On Mon, Nov 26, 2012 at 7:36 PM, William Dunlap <wdun...@tibco.com> wrote:
>> why do I see NULLs?!
>
> because
>  > ... format.difftime does a reasonable job (except that it does not copy the
>  > input names to its output).
>
> Replace your call of the form
>    format(difftimeObject)
> with
>    structure(format(difftimeObject), names=names(difftimeObject))
> to work around this.
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
Re: [R] printing difftime summary
> * David Winsemius <qjvafrz...@pbzpnfg.arg> [2012-11-23 13:14:17 -0800]:
>
> See http://cran.r-project.org/doc/FAQ/R-FAQ.html#How-should-I-write-summary-methods_003f

--8<---------------cut here---------------start------------->8---
summary.difftime <- function (v) {
  s <- summary(as.numeric(v))
  r <- as.data.frame(sapply(s,difftime2string),stringsAsFactors=FALSE)
  names(r) <- c("string")
  r[[units(v)]] <- s
  class(r) <- c("data.frame","summary.difftime")
  r
}
print.summary.difftime <- function (sd) print.data.frame(sd)
--8<---------------cut here---------------end--------------->8---

it appears to work for a single vector:

--8<---------------cut here---------------start------------->8---
> r1 <- summary(infl$delay)
> r1
           string     secs
Min.    492.00 ms      0.5
1st Qu. 18.08 min   1085.0
Median   1.77 hrs   6370.0
Mean     8.20 hrs  29530.0
3rd Qu.  8.12 hrs  29250.0
Max.    6.98 days 602900.0
> str(r1)
Classes 'summary.difftime' and 'data.frame':  6 obs. of  2 variables:
 $ string: chr  "492.00 ms" "18.08 min" "1.77 hrs" "8.20 hrs" ...
 $ secs  :Classes 'summaryDefault', 'table'  num [1:6] 4.92e-01 1.08e+03 6.37e+03 2.95e+04 2.92e+04 ...
--8<---------------cut here---------------end--------------->8---

but not as a part of data frame:

--8<---------------cut here---------------start------------->8---
> a <- summary(infl)
Error in summary.difftime(X[[22L]], ...) :
  unused argument(s) (maxsum = 7, digits = 12)
--8<---------------cut here---------------end--------------->8---

I guess I should somehow accept a list of options in
summary.difftime() and pass them on to the inner call to summary()
(or should it be explicitly summary.numeric()?)
how do I do that?

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://camera.org http://jihadwatch.org http://americancensorship.org http://truepeace.org http://memri.org
Why do you never call me back after I scream that I will never talk to you again?!
Re: [R] printing difftime summary
> * R. Michael Weylandt <zvpunry.jrlyn...@tznvy.pbz> [2012-11-23 09:13:36 +0000]:
>
>> 2. because difftime.summary returns a data.frame and not a
>>    Classes 'summaryDefault', 'table'
>>    as I assume summary must return.
>
> See http://cran.r-project.org/doc/FAQ/R-FAQ.html#How-should-I-write-summary-methods_003f

what are the requirements on the class "summary.foo"?
does it have to inherit from some other class?
how do I define a class?

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://dhimmi.com http://honestreporting.com http://thereligionofpeace.com http://iris.org.il http://americancensorship.org
In the race between idiot-proof software and idiots, the idiots are winning.
Re: [R] printing difftime summary
> * R. Michael Weylandt <zvpunry.jrlyn...@tznvy.pbz> [2012-11-22 12:11:55 +0000]:
>
>> I now think that what I want is
>>
>> --8<---------------cut here---------------start------------->8---
>> difftime.summary <- function (v) {
>>   s <- summary(as.numeric(v))
>>   r <- as.data.frame(sapply(s,difftime2string),stringsAsFactors=FALSE)
>>   names(r) <- c("string")
>>   r[[units(v)]] <- s
>>   r
>> }
>
> Any reason not summary.difftime to get S3 dispatch?

I hoped that someone will ask this :-)

1. because its argument has type "vector of difftime", not "difftime"
   (coming from CLOS, I do not expect summary(vector of difftime) to
   dispatch to summary.difftime, but to summary.vector.of.difftime or
   something)

2. because difftime.summary returns a data.frame and not a
   Classes 'summaryDefault', 'table'
   as I assume summary must return.

if these are not valid issues, then I wonder why my function should not
be the system default method.

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://memri.org http://honestreporting.com http://jihadwatch.org http://openvotingconsortium.org http://ffii.org
Sex is like air. It's only a big deal if you can't get any.
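On the first concern, a small sketch may help. In R the class attribute is attached to the vector as a whole, so S3 dispatch on a vector of time differences goes to the difftime method. The generic `describe` below is hypothetical, invented purely for this illustration.

```r
# `describe` is a made-up generic, not part of base R.
describe <- function(x, ...) UseMethod("describe")
describe.default  <- function(x, ...) "something else"
describe.difftime <- function(x, ...)
  sprintf("difftime vector of length %d", length(x))

v <- as.difftime(c(30, 90, 150), units = "secs")

class(v)       # the class belongs to the whole vector, not to its elements
describe(v)    # dispatches to describe.difftime
describe(1:3)  # falls through to describe.default
```

There is no separate "vector of difftime" type to dispatch on, which is why summary.difftime is the natural name for such a method.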
[R] printing difftime summary
Hi,

I have a vector of difftime objects and I want to see its summary.
Alas:

--8<---------------cut here---------------start------------->8---
> summary(infl$delay)
  Length    Class     Mode
 9008386 difftime  numeric
--8<---------------cut here---------------end--------------->8---

this is almost completely useless.

I can use as.numeric:

--8<---------------cut here---------------start------------->8---
> s <- summary(as.numeric(infl$delay))
> dput(s)
structure(c(0.5, 1027, 5969, 29870, 28970, 603100), .Names = c("Min.",
"1st Qu.", "Median", "Mean", "3rd Qu.", "Max."), class = c("summaryDefault",
"table"))
> s
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
     0.5   1027.0   5969.0  29870.0  28970.0 603100.0
--8<---------------cut here---------------end--------------->8---

but the printed representation is very unreadable: the fact that
603100.0 is almost exactly 7 days is not obvious.

Okay, maybe as.difftime will help?

--8<---------------cut here---------------start------------->8---
> as.difftime(s,units="secs")
Time differences in secs
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
     0.5   1027.0   5969.0  29870.0  28970.0 603100.0
> as.difftime(s/3600,units="hours")
Time differences in hours
        Min.      1st Qu.       Median         Mean      3rd Qu.         Max.
    1.39e-04 2.852778e-01 1.658056e+00 8.297222e+00 8.047222e+00 1.675278e+02
--8<---------------cut here---------------end--------------->8---

nope; still unreadable.

What I really want to see _printed_ is something like this:

--8<---------------cut here---------------start------------->8---
> sapply(s,difftime2string)
       Min.     1st Qu.      Median        Mean     3rd Qu.        Max.
"500.00 ms" "17.12 min" "99.48 min"  "8.30 hrs"  "8.05 hrs" "6.98 days"
--8<---------------cut here---------------end--------------->8---

except that the quotes are not needed in the printed output.

Here I wrote:

--8<---------------cut here---------------start------------->8---
difftime2string <- function (x) {
  if (x < 1) return(sprintf("%.2f ms",x*1000))
  if (x < 100) return(sprintf("%.2f sec",x))
  if (x < 6000) return(sprintf("%.2f min",x/60))
  if (x < 108000) return(sprintf("%.2f hrs",x/3600))
  if (x < 400*24*3600) return(sprintf("%.2f days",x/(24*3600)))
  sprintf("%.2f years",x/(365.25*24*3600))
}
--8<---------------cut here---------------end--------------->8---

So, what is "The Right R Way" to print a summary of difftime objects?

Thanks!
--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://openvotingconsortium.org http://memri.org http://camera.org http://mideasttruth.com http://pmw.org.il
MS Windows: error: the operation completed successfully.
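The scalar helper above has to be applied with sapply(). A vectorized variant is sketched below; it is an illustration rather than the poster's code, the name fmt_seconds is invented here, and it assumes the input is a plain numeric vector of seconds. The cut-offs are the same ones used in difftime2string.

```r
# fmt_seconds is a hypothetical name; x is a numeric vector of seconds.
fmt_seconds <- function(x) {
  stopifnot(is.numeric(x))
  unit  <- c("ms", "sec", "min", "hrs", "days", "years")
  scale <- c(1e-3, 1, 60, 3600, 24 * 3600, 365.25 * 24 * 3600)
  # Upper bounds of the first five units, as in the scalar version.
  upper <- c(1, 100, 6000, 108000, 400 * 24 * 3600)
  # findInterval() counts how many bounds are <= x, which picks the unit.
  i <- findInterval(x, upper) + 1L
  sprintf("%.2f %s", x / scale[i], unit[i])
}

s <- c(Min. = 0.5, "1st Qu." = 1027, Max. = 603100)
# Keep the labels and print without quotes.
print(structure(fmt_seconds(s), names = names(s)), quote = FALSE)
```

print(..., quote = FALSE), or noquote(), takes care of the unwanted quotes in the printed output.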
Re: [R] printing difftime summary
Hi,

> * arun <fznegcvax...@lnubb.pbz> [2012-11-21 14:04:36 -0800]:
>
> Are you looking for some other function (difftime2string) or just
> remove the quotes from the printed output?

I am wondering what others do when they want to see a summary of difftime.

> If it is the latter, then this should do it.
> res<-do.call(data.frame,lapply(s,difftime2string))
> names(res)<-names(s)
> res
> #       Min.   1st Qu.    Median     Mean  3rd Qu.      Max.
> #1 500.00 ms 17.12 min 99.48 min 8.30 hrs 8.05 hrs 6.98 days

cool, thanks.

I now think that what I want is

--8<---------------cut here---------------start------------->8---
difftime.summary <- function (v) {
  s <- summary(as.numeric(v))
  r <- as.data.frame(sapply(s,difftime2string),stringsAsFactors=FALSE)
  names(r) <- c("string")
  r[[units(v)]] <- s
  r
}
> difftime.summary(infl$delay)
           string     secs
Min.    500.00 ms      0.5
1st Qu. 17.12 min   1027.0
Median  99.48 min   5969.0
Mean     8.30 hrs  29870.0
3rd Qu.  8.05 hrs  28970.0
Max.    6.98 days 603100.0
--8<---------------cut here---------------end--------------->8---

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://ffii.org http://jihadwatch.org http://memri.org http://www.memritv.org http://camera.org http://mideasttruth.com
A computer scientist is someone who fixes things that aren't broken.
[R] generated list element names
How can I create lists with element names created on the fly?

--8<---------------cut here---------------start------------->8---
> list ("foo" = 10)
$foo
[1] 10

> list (foo = 10)
$foo
[1] 10

> list (paste("f","oo",sep="") = 10)
Error: unexpected '=' in "list (paste("f","oo",sep="") ="
--8<---------------cut here---------------end--------------->8---

I understand that tags in list() are not evaluated, but is there a more
elegant way than

--8<---------------cut here---------------start------------->8---
> z <- list(10)
> names(z) <- paste("f","oo",sep="")
> z
$foo
[1] 10
--8<---------------cut here---------------end--------------->8---

thanks!

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://www.memritv.org http://thereligionofpeace.com http://truepeace.org
Unix roulette: `dd if=/dev/urandom of=/dev/kmem bs=1 count=1 seek=$RANDOM`
Re: [R] generated list element names
> * jim holtman <wubyg...@tznvy.pbz> [2012-11-19 13:14:05 -0500]:
>
> How about this (if you don't like writing two lines, encapsulate it in
> a function):
>
>> x <- list(10)
>> names(x) <- paste('f', 'oo', sep = '')
>> str(x)
> List of 1
>  $ foo: num 10

I am sorry, how is this different from my second snippet (except that
you use x and I use z and you use single quotes in paste and I use
double quotes)?

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://memri.org http://truepeace.org http://ffii.org http://think-israel.org http://jihadwatch.org http://palestinefacts.org
The only time you have too much fuel is when you're on fire.
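For the question of naming list elements on the fly, a few one-expression alternatives exist in base R. The sketch below only uses setNames() (from the stats package, attached by default), structure() and `[[<-`; the variable names are arbitrary.

```r
nm <- paste("f", "oo", sep = "")   # the name is computed at run time

z1 <- setNames(list(10), nm)            # one expression, no temporary
z2 <- structure(list(10), names = nm)   # same idea via structure()
z3 <- list(); z3[[nm]] <- 10            # assignment by computed name

str(z1)
```

All three build the same object as the two-line names(z) <- ... version; which one reads best is a matter of taste.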
Re: [R] aggregate() runs out of memory
Thanks Steve,
what is the analogue of .N for min and max?
i.e., what is the data.table's version of
aggregate(infl$delay,by=list(infl$share.id),FUN=min)
aggregate(infl$delay,by=list(infl$share.id),FUN=max)
thanks!
Sam.

On Fri, Sep 14, 2012 at 3:40 PM, Steve Lianoglou
<mailinglist.honey...@gmail.com> wrote:
> Hi,
>
> On Fri, Sep 14, 2012 at 3:26 PM, Sam Steingold <s...@gnu.org> wrote:
>> I have a large data.frame Z (2,424,185,944 bytes, 10,256,441 rows, 17 columns).
>> I want to get the result of
>> table(aggregate(Z$V1, FUN = length, by = list(id=Z$V2))$x)
>> alas, aggregate has been running for ~30 minute, RSS is 14G, VIRT is
>> 24.3G, and no end in sight.
>> both V1 and V2 are characters (not factors).
>> Is there anything I could do to speed this up?
>> Thanks.
>
> You might find you'll get a lot of mileage out of data.table when
> working with such large data.frames ...
>
> To get something close to what you're after, you can try:
>
> R> library(data.table)
> R> Z <- as.data.table(Z)
> R> setkeyv(Z, 'V2')
> R> agg <- Z[, list(count=.N), by='V2']
>
> From here you might
>
> R> tab1 <- table(agg$count)
>
> I think that'll get you where you want to be ... I'm ashamed to say
> that I haven't really done much w/ aggregate since I mostly have used
> plyr and data.table like stuff, so I might be missing your end goal --
> providing a reproducible example with a small data.frame from you can
> help here (for me at least).
>
> HTH,
> -steve
>
> --
> Steve Lianoglou
> Graduate Student: Computational Systems Biology
>  | Memorial Sloan-Kettering Cancer Center
>  | Weill Medical College of Cornell University
> Contact Info: http://cbio.mskcc.org/~lianos/contact

--
Sam Steingold http://sds.podval.org http://www.childpsy.net/
Re: [R] LiblineaR: accept sparse matrices
Hi,

> * Thibault Helleputte <guvonhyg.uryyrch...@qanylgvpf.pbz> [2012-11-09 09:22:11 +0100]:
>
> The next release of LiblineaR should offer the possibility of using
> sparse matrices. However, the next release date is not fixed yet...

thanks.

> On Thu, Nov 8, 2012 at 10:07 PM, Sam Steingold <s...@gnu.org> wrote:
>> It would also be nice if there were functions to read/write files in the
>> native liblinear file format; I am sure the original liblinear library
>> provides at least the input code.

How about i/o?

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://palestinefacts.org http://dhimmi.com http://think-israel.org http://www.memritv.org http://openvotingconsortium.org
Money does not bother me at all. In fact, it calms me down.
[R] as.data.frame(do.call(rbind,lapply)) produces something weird
The following code:

--8<---------------cut here---------------start------------->8---
> myfun <- function (x) list(x=x,y=x*x)
> z <- as.data.frame(do.call(rbind,lapply(1:3,function(x) c(a=paste("a",x,sep=""),as.list(unlist(list(b=myfun(x),c=myfun(x*x*x))))))))
> z
   a b.x b.y c.x c.y
1 a1   1   1   1   1
2 a2   2   4   8  64
3 a3   3   9  27 729
--8<---------------cut here---------------end--------------->8---

the appearance of z is good, but str() and summary betray some weirdness:

--8<---------------cut here---------------start------------->8---
> str(z)
'data.frame':   3 obs. of  5 variables:
 $ a  :List of 3
  ..$ : chr "a1"
  ..$ : chr "a2"
  ..$ : chr "a3"
 $ b.x:List of 3
  ..$ : int 1
  ..$ : int 2
  ..$ : int 3
 $ b.y:List of 3
  ..$ : int 1
  ..$ : int 4
  ..$ : int 9
 $ c.x:List of 3
  ..$ : int 1
  ..$ : int 8
  ..$ : int 27
 $ c.y:List of 3
  ..$ : int 1
  ..$ : int 64
  ..$ : int 729
--8<---------------cut here---------------end--------------->8---

how do I ensure that the columns of z are vectors, as in

--8<---------------cut here---------------start------------->8---
> z <- data.frame(a=c("a1","a2","a3"),b.x=c(1,2,3),b.y=c(1,4,9),c.x=c(1,8,27),c.y=c(1,64,729))
> z
   a b.x b.y c.x c.y
1 a1   1   1   1   1
2 a2   2   4   8  64
3 a3   3   9  27 729
> str(z)
'data.frame':   3 obs. of  5 variables:
 $ a  : Factor w/ 3 levels "a1","a2","a3": 1 2 3
 $ b.x: num  1 2 3
 $ b.y: num  1 4 9
 $ c.x: num  1 8 27
 $ c.y: num  1 64 729
--8<---------------cut here---------------end--------------->8---

thanks!

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://jihadwatch.org http://think-israel.org http://www.PetitionOnline.com/tap12009/ http://honestreporting.com
Programming is like sex: one mistake and you have to support it for a lifetime.
Re: [R] as.data.frame(do.call(rbind, lapply)) produces something weird
> * arun <fznegcvax...@lnubb.pbz> [2012-11-09 11:33:43 -0800]:
>
> z2<-within(z1,{b.x<-as.numeric(as.character(b.x));b.y<-as.numeric(as.character(b.y));c.x<-as.numeric(as.character(c.x));c.y<-as.numeric(as.character(c.y))})

1. I don't want to have to list all the column names explicitly

2. I find the num->char->num conversion repugnant and unacceptable.

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://www.PetitionOnline.com/tap12009/ http://truepeace.org http://honestreporting.com http://ffii.org
What was the best thing before sliced bread?
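One way to meet both requirements, sketched here as a suggestion that does not appear in the thread: build a one-row data.frame per element and rbind those, instead of rbind-ing lists into a list-matrix. The columns then come out as ordinary atomic vectors, with no column names spelled out by hand and no round trip through character.

```r
myfun <- function(x) list(x = x, y = x * x)

# One single-row data.frame per input value. The unnamed list argument
# is spread into the columns b.x, b.y, c.x, c.y by data.frame().
rows <- lapply(1:3, function(x)
  data.frame(a = paste("a", x, sep = ""),
             as.list(unlist(list(b = myfun(x), c = myfun(x * x * x)))),
             stringsAsFactors = FALSE))

z <- do.call(rbind, rows)
str(z)
```

For many rows, repeated data.frame construction can be slow; collecting each column as a vector first and calling data.frame() once is a common alternative.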
Re: [R] LiblineaR: accept sparse matrices
> * Ben Bolker <ooby...@tznvy.pbz> [2012-11-07 21:51:07 +]:
>
> Sam Steingold <sds at gnu.org> writes:
>
>> It would be nice if LiblineaR() accepted data in the form of a sparse matrix (it does not accept whatever e1071::read.matrix.csr returns).
>> It would also be nice if there were functions to read/write files in the native liblinear file format; I am sure the original liblinear library provides at least the input code.
>
> You appear to have sent this to the general R-help mailing list rather than to the maintainer (or maybe you Bcc'd the maintainer)?

It was CCed (not BCCed) to Thibault Helleputte <thellepu...@gmail.com>

> Sparse matrices are nice, but once you start using sparse matrices you have to start worrying about the details of which linear algebra operators have been defined for them (e.g. whether the available operators allow pivoting, or work on rank-deficient matrices, or ...) So it's not always as easy as flipping a switch ...

The library in question is merely a thin layer which passes the data to the underlying C++ library.
The original library comes with a command line interface which accepts input files in sparse matrix format _ONLY_.

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://mideasttruth.com http://dhimmi.com http://honestreporting.com http://think-israel.org http://jihadwatch.org
XFM: Exit file manager? [Continue] [Cancel] [Abort]
Re: [R] matrix.csr %*% matrix -- matrix
* Martin Maechler znrpu...@fgng.zngu.rgum.pu [2012-11-07 10:10:51 +0100]: Sam == Sam Steingold s...@gnu.org on Tue, 6 Nov 2012 13:08:30 -0500 writes: Sam The question is even more pressing for me now given that I no longer can Sam convert some csr matrices to the regular ones for scaling. Sam (http://article.gmane.org/gmane.comp.lang.r.general:279305) Sam Any suggestions? (the original csr matrix is too large to be converted Sam to a regular one, but the product is small enough). * Sam Steingold f...@tah.bet [2012-08-27 14:58:47 -0400]: When a sparse matrix is multiplied by a regular one, the result is usually not sparse. However, when matrix.csr is multiplied by a regular matrix in R, a matrix.csr is produced. Is there a way to avoid this? Thanks! Why don't you use the sparse matrix classes from the Matrix package .. which is part of every R distribution ? SparseM has been written as very first package to support sparse matrices, and is to be applauded for that, but it does lack many features nowadays (and also uses less modern algorithm for e.g. the sparse Cholesky decomposition). Thank you very much for your advice. I do not think I use SparseM directly. I use e1071::read.matrix.csr and e1071::write.matrix.csr which use SparseM. I.e., I need to be able to do i/o on files which are palatable to libsvm/liblinear, specifically, read/write files like --8---cut here---start-8--- 1.2 2:3.5 6:5.1 2 4:6.7 8 7:6.6 --8---cut here---end---8--- As you can see from my other messages (e.g., http://article.gmane.org/gmane.comp.lang.r.general:279387), I am not happy with my current setup. I would be delighted to learn that there is an alternative, but so far the only matrix i/o I could find is Matrix::readHB and it does not handle the libsvm/liblinear format. 
--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://openvotingconsortium.org http://truepeace.org http://palestinefacts.org http://camera.org http://www.memritv.org
Heck is a place for people who don't believe in gosh.
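For reference, the file format mentioned above ("label index:value ...", one row per line, only non-zero entries, 1-based ascending indices as far as I know) is simple enough to write by hand. The sketch below is a hypothetical base-R helper, `write_libsvm`; it is not part of e1071, SparseM or Matrix, and it takes an ordinary dense matrix, so it only illustrates the format and does not address the speed or memory concerns of the thread.

```r
# Hypothetical helper: write a dense numeric matrix `x` with labels `y`
# in the "label index:value ..." text format.
write_libsvm <- function(x, y, con) {
  stopifnot(is.matrix(x), length(y) == nrow(x))
  lines <- vapply(seq_len(nrow(x)), function(i) {
    nz <- which(x[i, ] != 0)                 # column indices of non-zero cells
    paste(c(y[i], paste(nz, x[i, nz], sep = ":")), collapse = " ")
  }, character(1))
  writeLines(lines, con)
}

# Tiny made-up example: 2 rows, 3 columns.
x <- matrix(c(0, 3.5, 0,
              0, 0,   6.7), nrow = 2, byrow = TRUE)
y <- c(1, 2)
f <- tempfile()
write_libsvm(x, y, f)
out <- readLines(f)
print(out)
```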
[R] c weirdness
is there a way to avoid c() appending .0 and .1 to seed?
--8<---------------cut here---------------start------------->8---
> c(nons=1, seed=3)
nons seed                ## good!
   1    3
> c(nons=1, seed=tab[1])
   nons  seed.0          ## don't want .0!
      1 2344600
> c(nons=1, seed=tab[2])
  nons seed.1            ## don't want .1!
     1   6843
> tab

      0       1
2344600    6843
--8<---------------cut here---------------end--------------->8---

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://pmw.org.il http://memri.org http://ffii.org http://openvotingconsortium.org
Islam is a religion of Peace. Its adherents will kill anyone who disagrees.
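The suffix appears because tab[1] carries its own name ("0"), and c() combines the argument name with the element name. A minimal sketch of two ways around it, using a small made-up table in place of the real `tab`: extract the bare value with `[[`, or strip the names with unname().

```r
# Made-up stand-in for the table in the message above.
tab <- table(c(rep(0, 5), rep(1, 2)))

# `[[` returns the element without its name, so only "seed" is used.
v1 <- c(nons = 1, seed = tab[[1]])

# unname() drops the names before c() sees them.
v2 <- c(nons = 1, seed = unname(tab[2]))

print(v1)
print(v2)
```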
[R] LiblineaR: accept sparse matrices
Thibault, It would be nice if LiblineaR() accepted data in the form of a sparse matrix (it does not accept whatever e1071::read.matrix.csr returns). It would also be nice if there were functions to read/write files in the native liblinear file format; I am sure the original liblinear library provides at least the input code. Thanks! -- Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000 http://www.childpsy.net/ http://iris.org.il http://pmw.org.il http://ffii.org http://dhimmi.com http://www.PetitionOnline.com/tap12009/ Sex is like air. It's only a big deal if you can't get any. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] no method for coercing this S4 class to a vector
The matrix z is save()d in http://sds.podval.org/data/z. It is a product of a sparse matrix and a non-sparse matrix. I need to scale it and write to a file in the sparse format for libsvm. platform x86_64-pc-linux-gnu arch x86_64 os linux-gnu system x86_64, linux-gnu status major 2 minor 15.2 year 2012 month 10 day26 svn rev61015 language R version.string R version 2.15.2 (2012-10-26) Package:SparseM Version:0.96 Author: Roger Koenker rkoen...@uiuc.edu and Pin Ng pin...@nau.edu Maintainer: Roger Koenker rkoen...@uiuc.edu Depends:R (= 2.4.1), methods, stats, utils Description:Basic linear algebra for sparse matrices License:GPL (= 2) Title: Sparse Linear Algebra URL:http://www.econ.uiuc.edu/~roger/research/sparse/sparse.html Packaged: 2012-03-18 19:39:05 UTC; root Repository: CRAN Date/Publication: 2012-03-18 20:55:08 Built: R 2.15.2; x86_64-pc-linux-gnu; 2012-11-05 17:46:36 UTC; unix * Sam Steingold f...@tah.bet [2012-11-05 12:40:25 -0500]: all of a sudden, after a SparseM upgrade(?) I get this error: str(z) Formal class 'matrix.csr' [package SparseM] with 4 slots ..@ ra : num [1:85372672] -0.4288 0.0397 0.0104 -0.1843 -0.1203 ... ..@ ja : int [1:85372672] 1 2 3 4 5 6 7 8 9 10 ... ..@ ia : int [1:699777] 1 123 245 367 489 611 733 855 977 1099 ... ..@ dimension: int [1:2] 699776 122 z1-as.matrix(z) Error in as.vector(data) : no method for coercing this S4 class to a vector z1-scale(z) Error in as.vector(data) : no method for coercing this S4 class to a vector what has happened? how do I scale the matrix.csr object (to be written to a file)? PS. write.matrix.csr is very slow: it takes user system elapsed 1137.058 510.615 1649.925 to write the matrix z above. thanks! -- Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000 http://www.childpsy.net/ http://openvotingconsortium.org http://www.memritv.org http://iris.org.il http://pmw.org.il He who laughs last thinks slowest. 
Re: [R] matrix.csr %*% matrix -- matrix
The question is even more pressing for me now given that I no longer can convert some csr matrices to the regular ones for scaling. (http://article.gmane.org/gmane.comp.lang.r.general:279305) Any suggestions? (the original csr matrix is too large to be converted to a regular one, but the product is small enough). * Sam Steingold f...@tah.bet [2012-08-27 14:58:47 -0400]: When a sparse matrix is multiplied by a regular one, the result is usually not sparse. However, when matrix.csr is multiplied by a regular matrix in R, a matrix.csr is produced. Is there a way to avoid this? Thanks! -- Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000 http://www.childpsy.net/ http://www.memritv.org http://think-israel.org http://camera.org http://openvotingconsortium.org http://honestreporting.com If you have no enemies, you are probably dead. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] write.matrix.csr data conversion
David, thanks for adding the feature. read.matrix.csr and, especially, write.matrix.csr are extremely slow: usersystem elapsed 8381.988 3810.396 12345.349 for a 2797634 x 224 matrix I have to deal with. The help page http://rss.acs.unt.edu/Rdoc/library/e1071/html/read.matrix.csr.html says David Meyer (based on C/C++-code by Chih-Chung Chang and Chih-Jen Lin) is there any chance that you might consider replacing the R code with the original C/C++? Thanks a lot! * David Meyer qnivq.zr...@jh.np.ng [2012-08-27 22:57:17 +0200]: done, thanks for the suggestion. David On 2012-08-27 21:15, Sam Steingold wrote: * jim holtman wubyg...@tznvy.pbz [2012-08-27 14:55:08 -0400]: Most likely when 'y' is converted to a dataframe (not sure what the function 'write.matrix.csr' does since you did not say where you got it), sorry, library(e1071) '0' and '1' are converted to factors which probably show up as 1 and 2 in the file. sounds reasonable, thanks. David, could you please add an option `fac' to `write.matrix.csr', similar to `read.matrix.csr' which already accepts `fac'? thanks! -- Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000 http://www.childpsy.net/ http://jihadwatch.org http://honestreporting.com http://iris.org.il http://www.memritv.org http://mideasttruth.com The only intuitive interface is the nipple. The rest has to be learned. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] write.matrix.csr data conversion
Dear David, * David Meyer zrl...@grpuavxhz-jvra.ng [2012-11-06 19:49:15 +0100]: there is C-code related to *reading* in such a file, but in the internal libsvm-format, not the matrix.csr format. How is the libsvm-format differ from matrix.csr format? I actually use matrix.csr only because it prints to what libsvm can read. There is certainly a way to speed this up, but I am not likely to do this in the near future. too bad. On 2012-11-06 19:15, Sam Steingold wrote: David, thanks for adding the feature. read.matrix.csr and, especially, write.matrix.csr are extremely slow: usersystem elapsed 8381.988 3810.396 12345.349 for a 2797634 x 224 matrix I have to deal with. The help page http://rss.acs.unt.edu/Rdoc/library/e1071/html/read.matrix.csr.html says David Meyer (based on C/C++-code by Chih-Chung Chang and Chih-Jen Lin) is there any chance that you might consider replacing the R code with the original C/C++? -- Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000 http://www.childpsy.net/ http://truepeace.org http://mideasttruth.com http://openvotingconsortium.org http://memri.org http://pmw.org.il Programming is like sex: one mistake and you have to support it for a lifetime. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] no method for coercing this S4 class to a vector
all of a sudden, after a SparseM upgrade(?) I get this error: str(z) Formal class 'matrix.csr' [package SparseM] with 4 slots ..@ ra : num [1:85372672] -0.4288 0.0397 0.0104 -0.1843 -0.1203 ... ..@ ja : int [1:85372672] 1 2 3 4 5 6 7 8 9 10 ... ..@ ia : int [1:699777] 1 123 245 367 489 611 733 855 977 1099 ... ..@ dimension: int [1:2] 699776 122 z1-as.matrix(z) Error in as.vector(data) : no method for coercing this S4 class to a vector z1-scale(z) Error in as.vector(data) : no method for coercing this S4 class to a vector what has happened? how do I scale the matrix.csr object (to be written to a file)? PS. write.matrix.csr is very slow: it takes user system elapsed 1137.058 510.615 1649.925 to write the matrix z above. thanks! -- Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000 http://www.childpsy.net/ http://thereligionofpeace.com http://iris.org.il http://jihadwatch.org A year spent in artificial intelligence is enough to make one believe in God. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R 2.15.2 is released
Cool. I have some packages installed using install.packages(). Do I need to reinstall them? https://r-forge.r-project.org/tracker/?func=detailatid=294aid=2224group_id=61 Not a bug: This only happens under the circumstance of a Matrix package installation *not* matching your R installation. In other words: One way to fix your problem is to re install the Matrix package in the version of R you are using. So, will the bug reappear now? -- Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000 http://www.childpsy.net/ http://openvotingconsortium.org http://mideasttruth.com http://www.memritv.org Lisp: Serious empowerment. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R 2.15.2 is released
* Bert Gunter thagre.ore...@trar.pbz [2012-11-04 09:48:58 -0800]: ?update.packages It is not obvious to me that this is the answer to my question. Specifically, I have package X version 1.2.3 installed and built against R version 2.15.1. If 1.2.3 is the current latest version of X, then update.packages() will _not_ try to update it, but, apparently, at least for some packages, I do need to rebuild them against the new R version 2.15.2. Thanks. On Sun, Nov 4, 2012 at 7:01 AM, Sam Steingold s...@gnu.org wrote: I have some packages installed using install.packages(). Do I need to reinstall them? https://r-forge.r-project.org/tracker/?func=detailatid=294aid=2224group_id=61 Not a bug: This only happens under the circumstance of a Matrix package installation *not* matching your R installation. In other words: One way to fix your problem is to re install the Matrix package in the version of R you are using. So, will the bug reappear now? -- Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000 http://www.childpsy.net/ http://americancensorship.org http://palestinefacts.org http://www.PetitionOnline.com/tap12009/ http://www.memritv.org http://memri.org If a woman is listening to a you without interrupting, do not wake her up! __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R 2.15.2 is released
> * Marc Schwartz <znep_fpujn...@zr.pbz> [2012-11-04 12:33:20 -0600]:
>
> On Nov 4, 2012, at 12:22 PM, Sam Steingold <s...@gnu.org> wrote:
>
>>> * Bert Gunter <thagre.ore...@trar.pbz> [2012-11-04 09:48:58 -0800]:
>>>
>>> ?update.packages
>>
>> It is not obvious to me that this is the answer to my question.
>
> Take note of the 'checkBuilt' argument, which defaults to FALSE...

Thanks a lot!
So, what I need to do is:

update.packages(checkBuilt=TRUE, ask=FALSE, lib.loc=.libPaths()[grep("^/home/",.libPaths())])

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://americancensorship.org http://pmw.org.il http://iris.org.il http://camera.org http://jihadwatch.org http://dhimmi.com
Kleptomania: the ability to find stuff even before its owner loses it.
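Before running the update it may help to see which installed packages were built under a different R version. The sketch below only inspects the "Built" field reported by installed.packages() and changes nothing; it compares the full version string, which may be stricter than the rule update.packages(checkBuilt=TRUE) applies, so treat the result as a hint. The variable names are made up.

```r
# Matrix with one row per installed package, including the R version
# each package was built under.
ip <- installed.packages()

built   <- package_version(ip[, "Built"])
current <- getRversion()

# Packages whose build version is older than the running R.
stale <- ip[built < current, c("Package", "LibPath", "Built"), drop = FALSE]
print(stale)
```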
Re: [R] how to concatenate factor vectors?
> * Bert Gunter <thagre.ore...@trar.pbz> [2012-10-17 23:21:44 -0700]:
>
> However, Is level 5 in 'a' the same as level 5 in 'b' ?

yes, of course.
why would anyone want _different_ factors with identical string representations?!

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://pmw.org.il http://americancensorship.org http://memri.org http://think-israel.org http://camera.org
Lisp is a language for doing what you've been told is impossible. - Kent Pitman
Re: [R] how to concatenate factor vectors?
hi Jorge,

> * Jorge I Velez <wbetrvinair...@tznvy.pbz> [2012-10-18 16:43:58 +1100]:
>
> a <- factor(5:1,levels=1:9)
> b <- factor(9:1,levels=1:9)
> lev <- sort(unique(f <- c(a, b)))
> f <- factor(f, levels = lev)
> str(f)
> Factor w/ 9 levels "1","2","3","4",..: 5 4 3 2 1 9 8 7 6 5 ...

is sort(unique()) really necessary?
I think lev <- levels(a) should be enough.

However, this does not quite do what I want.
I want a function which will _NOT_ have a non-factor vector as an intermediate value because that would waste a LOT of memory in my case.
I want a function which will check that a and b have identical levels (in Lisp lingo, the levels are EQ, not just EQUALP).
--8<---------------cut here---------------start------------->8---
> a <- factor(letters[sample(1:10,20,replace=TRUE)],levels=letters)
 [1] e e a b c e j d a b h i a e e g j a c e
Levels: a b c d e f g h i j k l m n o p q r s t u v w x y z
> b <- factor(letters[sample(1:10,30,replace=TRUE)],levels=letters)
 [1] d d f c j b d e j j g i g j j g g a j a b e d c b i i a b f
Levels: a b c d e f g h i j k l m n o p q r s t u v w x y z
> c(a,b)
 [1]  5  5  1  2  3  5 10  4  1  2  8  9  1  5  5  7 10  1  3  5  4  4  6  3 10
[26]  2  4  5 10 10  7  9  7 10 10  7  7  1 10  1  2  5  4  3  2  9  9  1  2  6
> factor(letters[c(a,b)],levels=letters)
 [1] e e a b c e j d a b h i a e e g j a c e d d f c j b d e j j g i g j j g g a
[39] j a b e d c b i i a b f
Levels: a b c d e f g h i j k l m n o p q r s t u v w x y z
--8<---------------cut here---------------end--------------->8---
however, this is not a direct way (unlike my unlist(list(...))): there is an intermediate integer vector c(a,b) which is mapped to a character vector via letters, which is converted back to integers (==factors).

IIUC, a factor is an integer vector which knows that the integers refer to levels.
c(a,b) creates such an integer vector.
How do I tell it that it is a factor?
--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://palestinefacts.org http://www.memritv.org http://www.PetitionOnline.com/tap12009/ http://dhimmi.com
usually: can't pay == don't buy. software: can't buy == don't pay
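One way to "tell it that it is a factor" is to attach the levels and the class to the combined integer codes directly. This is only a sketch: `concat_same_levels` is a made-up name, and the code goes through as.integer() explicitly because in recent R versions c() applied to factors already returns a factor, while in the version discussed here it returned the integer codes.

```r
# Hypothetical helper: concatenate two factors that share the same levels
# by combining their integer codes and re-attaching levels and class.
concat_same_levels <- function(x, y) {
  stopifnot(is.factor(x), is.factor(y), identical(levels(x), levels(y)))
  structure(c(as.integer(x), as.integer(y)),
            levels = levels(x), class = "factor")
}

a <- factor(5:1, levels = 1:9)
b <- factor(9:1, levels = 1:9)
f <- concat_same_levels(a, b)
str(f)
```

No character vector is created along the way; the only intermediate value is the combined integer vector, which becomes the result itself.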
Re: [R] how to concatenate factor vectors?
* R. Michael Weylandt zvpunry.jrlyn...@tznvy.pbz [2012-10-18 16:01:37 +0100]: On Thursday, October 18, 2012, Sam Steingold wrote: * Bert Gunter thagre.ore...@trar.pbz [2012-10-17 23:21:44 -0700]: However, Is level 5 in 'a' the same as level 5 in 'b' ? yes, of course. would anyone want to _different_ factors with identical string representations?! Off the cuff, studying education and grades: F could be a grade or a gender. would you ever want to concatenate a vector of grades with a vector of genders? as I said elsewhere, the function which concatenates factors must check that the levels are identical before proceeding. -- Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000 http://www.childpsy.net/ http://pmw.org.il http://camera.org http://openvotingconsortium.org http://truepeace.org http://jihadwatch.org Ernqvat guvf ivbyngrf QZPN. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to concatenate factor vectors?
* Jeff Newmiller wqarj...@qpa.qnivf.pn.hf [2012-10-18 07:53:24 -0700]: If you HAVE defined your factors using explicit levels definitions, you should have no trouble combining them. http://article.gmane.org/gmane.comp.lang.r.general:277719 -- Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000 http://www.childpsy.net/ http://memri.org http://iris.org.il http://pmw.org.il http://think-israel.org http://honestreporting.com http://www.memritv.org A person without flaws probably lacks strengths either. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to concatenate factor vectors?
> * William Dunlap <jqha...@gvopb.pbz> [2012-10-18 15:33:38 +]:
>
> c() has an unfortunate history.

:-)
ISTR reading in the R manual ~15(?) years ago that the language was in a flux and one could not expect code written for the current release to work in the next release.
I was considering R as the graphing back end at that time, so this note turned me off.
Now it turns out that R has a legacy it cannot shake. :-)

> Or, you can decide to write a new concatenation function and stop using c().
> As for EQ vs. EQUALP, don't even think of EQ in R: it doesn't make sense there.
> identical() is a pretty quick way to check that two objects have identical contents.

Good! That's what I was looking for!

concatenate.factors <- function (x, y) {
  stopifnot(identical(levels(x),levels(y)))
  unlist(list(x,y), use.names=FALSE)
}

This seems to do what I need.
I see that identical(levels(concatenate.factors(a,b)),levels(a)) == TRUE
DIUC that concatenate.factors does NOT create an intermediate vector and then re-factor it?

Thank you very much for your insight!

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://jihadwatch.org http://openvotingconsortium.org http://www.memritv.org http://memri.org http://truepeace.org
Live Lisp and prosper.
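A short check of the function above, for anyone who wants to try it. The factor `bad` is made up for the illustration; it shows that the stopifnot() guard rejects factors whose levels differ. (The help page for unlist() says factors are treated specially and the result is a factor; whether an intermediate copy is made internally is not something this sketch can show.)

```r
concatenate.factors <- function(x, y) {
  stopifnot(identical(levels(x), levels(y)))
  unlist(list(x, y), use.names = FALSE)
}

a <- factor(5:1, levels = 1:9)
b <- factor(9:1, levels = 1:9)
ab <- concatenate.factors(a, b)
str(ab)

# A factor with different levels is refused by the guard.
bad <- factor(c("x", "y"))
res <- tryCatch(concatenate.factors(a, bad),
                error = function(e) "levels differ")
print(res)
```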
Re: [R] uniq -c
> * Sam Steingold <f...@tah.bet> [2012-10-16 11:03:27 -0400]:
>
> I need an analogue of uniq -c for a data frame.

Summary of options:

1. William:

isFirstInRun <- function(x) UseMethod("isFirstInRun")
isFirstInRun.default <- function(x) c(TRUE, x[-1] != x[-length(x)])
isFirstInRun.data.frame <- function(x) {
  stopifnot(ncol(x)>0)
  retval <- isFirstInRun(x[[1]])
  for(column in x) {
    retval <- retval | isFirstInRun(column)
  }
  retval
}
row.count.1 <- function (x) {
  i <- which(isFirstInRun(x))
  data.frame(x[i,], count=diff(c(i, 1L+nrow(x))))
}

147 seconds

2. http://orgmode.org/worg/org-contrib/babel/examples/Rpackage.html#sec-6-1

row.count.2 <- function (x) {
  equal.to.previous <- rowSums( x[2:nrow(x),] != x[1:(nrow(x)-1),] )==0
  tf.runs <- rle(equal.to.previous)
  counts <- c(1, unlist(mapply(function(x,y) if (y) x+1 else (rep(1,x)), tf.runs$length, tf.runs$value)))
  counts <- counts[ c( diff( counts ) <= 0, TRUE ) ]
  unique.rows <- which( c(TRUE, !equal.to.previous ) )
  cbind(x[ unique.rows, ,drop=FALSE ], counts)
}

136 seconds

3. Michael: paste/strsplit

row.count.3 <- function (x) {
  pa <- do.call(paste,x)
  rl <- rle(pa)
  sp <- strsplit(as.character(rl$values), " ")
  data.frame(user = sapply(sp,"[",1), country = sapply(sp,"[",2), language = sapply(sp,"[",3), count = rl$length)
}

here I know the columns and rely on the absence of spaces in values.
27 seconds.

Thanks to all who answered.

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://www.PetitionOnline.com/tap12009/ http://thereligionofpeace.com http://ffii.org http://camera.org
A slave dreams not of Freedom, but of owning his own slaves.
[R] how to concatenate factor vectors?
How do I concatenate two vectors of factors?
--8<---------------cut here---------------start------------->8---
> a <- factor(5:1,levels=1:9)
> b <- factor(9:1,levels=1:9)
> str(c(a,b))
 int [1:14] 5 4 3 2 1 9 8 7 6 5 ...
> str(unlist(list(a,b),use.names=FALSE))
 Factor w/ 9 levels "1","2","3","4",..: 5 4 3 2 1 9 8 7 6 5 ...
--8<---------------cut here---------------end--------------->8---
so, unlist(list()) works.
is there a better way or is this how this is supposed to be done?
Thanks!

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://honestreporting.com http://think-israel.org http://thereligionofpeace.com http://mideasttruth.com
(lisp programmers do it better)
[R] uniq -c
, 4475376L, 4475377L, 4475378L, 4475379L, 5500564L, 7871329L, 7871330L, 8670694L), class = data.frame) --8---cut here---end---8--- thanks! -- Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000 http://www.childpsy.net/ http://thereligionofpeace.com http://dhimmi.com http://ffii.org http://truepeace.org http://mideasttruth.com Bus error -- please leave by the rear door. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] uniq -c
> * R. Michael Weylandt <zvpunry.jrlyn...@tznvy.pbz> [2012-10-16 16:19:27 +0100]:
>
> Have you looked at using table() directly? If I understand what you want correctly something like:
>
> table(do.call(paste, x))

I wished to avoid paste (I will have to re-split later, so it will be a performance nightmare).

> Also, if you take a look at the development version of R, changes are being put in place to allow much larger data sets.

xtabs(), although dog slow, would have footed the bill nicely:
--8<---------------cut here---------------start------------->8---
> x <- data.frame(a=1:32,b=1:32,c=1:32,d=1:32,e=1:32)
> system.time(subset(as.data.frame(xtabs( ~. , x )), Freq != 0 ))
   user  system elapsed
 12.788   4.288  17.224
--8<---------------cut here---------------end--------------->8---
you should not need much larger data sets for this.
x is sorted.

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://openvotingconsortium.org http://iris.org.il http://www.memritv.org http://memri.org http://think-israel.org
Just because you're paranoid doesn't mean they AREN'T after you.
Re: [R] uniq -c
* Duncan Murdoch zheqbpu.qha...@tznvy.pbz [2012-10-16 12:47:36 -0400]: On 16/10/2012 12:29 PM, Sam Steingold wrote: x is sorted. sparseby(data=x, INDICES=x, FUN=nrow) this takes forever; apparently, it does not use the fact that x is sorted (even then - it should not take more than a few minutes)... -- Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000 http://www.childpsy.net/ http://openvotingconsortium.org http://www.memritv.org http://think-israel.org http://pmw.org.il http://thereligionofpeace.com Save the whales, feed the hungry, free the mallocs. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] cannot coerce class 'rle' into a data.frame
why?

> rle
Run Length Encoding
  lengths: int [1:1650061] 2 2 8 2 4 5 6 3 26 46 ...
  values : chr [1:1650061] "4bbf9e94cbceb70c BG bg" "4fbbf2c67e0fb867 SK sk" ...
> as.data.frame(rle)
Error in as.data.frame.default(vertices.rle) :
  cannot coerce class '"rle"' into a data.frame

it seems that

rle.df <- data.frame(values=rle$values,length=rle$length)

works and DTRT.

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://iris.org.il http://memri.org http://www.PetitionOnline.com/tap12009/ http://camera.org
char*a="char*a=%c%s%c;main(){printf(a,34,a,34);}";main(){printf(a,34,a,34);}
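The reason, as far as I can tell, is that an "rle" object is a plain list of two equal-length vectors with a class attribute, and there is no as.data.frame method for that class. A small sketch with made-up data showing two equivalent conversions: naming the components explicitly, or dropping the class with unclass() so that the list method applies.

```r
r <- rle(c("a", "a", "b", "c", "c", "c"))

# Explicit construction from the two components.
df1 <- data.frame(values = r$values, lengths = r$lengths,
                  stringsAsFactors = FALSE)

# Dropping the class leaves an ordinary list, which as.data.frame() accepts.
df2 <- as.data.frame(unclass(r), stringsAsFactors = FALSE)

print(df1)
print(df2)
```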
Re: [R] uniq -c
> * Duncan Murdoch <zheqbpu.qha...@tznvy.pbz> [2012-10-16 12:47:36 -0400]:
>
> sparseby(data=x, INDICES=x, FUN=nrow)

Error in `[<-.data.frame`(`*tmp*`, index, , value = list(user = c(2L, :
  missing values are not allowed in subscripted assignments of data frames

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://camera.org http://mideasttruth.com http://palestinefacts.org http://www.memritv.org http://thereligionofpeace.com
Diplomacy is the art of saying nice doggy until you can find a nice rock.
Re: [R] uniq -c
* Duncan Murdoch zheqbpu.qha...@tznvy.pbz [2012-10-16 14:22:51 -0400]: On 16/10/2012 1:46 PM, Sam Steingold wrote: * Duncan Murdoch zheqbpu.qha...@tznvy.pbz [2012-10-16 12:47:36 -0400]: On 16/10/2012 12:29 PM, Sam Steingold wrote: x is sorted. sparseby(data=x, INDICES=x, FUN=nrow) this takes forever; apparently, it does not use the fact that x is sorted (even then - it should not take more than a few minutes)... It was more or less instantaneous on the examples you posted. It would be a bit more honest to say it was fast on the examples, but it was very slow when I ran it on my real data, which consists of 100 cases. sure, I did not mean any insult to your code, sorry. all I was saying was that it was too slow for my purposes because it ignores the fact that the data is sorted. it turned out that paste+sort+rle+strsplit is fast enough. (although there should be a way to avoid paste/strsplit!) Thanks! -- Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000 http://www.childpsy.net/ http://camera.org http://truepeace.org http://jihadwatch.org http://www.PetitionOnline.com/tap12009/ Every day above ground is a good day. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
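For sorted data, one way to avoid paste/strsplit altogether is to compare each column with itself shifted by one row and count the distance between change points. This is a sketch in the spirit of the column-wise approach posted in this thread, not a benchmarked replacement; `row_count_sorted` and the toy data are made up, and the code assumes there are no NA values.

```r
# Hypothetical helper: run-length counts of identical consecutive rows.
row_count_sorted <- function(x) {
  stopifnot(is.data.frame(x), nrow(x) > 0, ncol(x) > 0)
  n <- nrow(x)
  # TRUE where a row differs from the previous row in at least one column.
  changed <- Reduce(`|`, lapply(x, function(col) c(TRUE, col[-1] != col[-n])))
  start <- which(changed)
  data.frame(x[start, , drop = FALSE],
             count = diff(c(start, n + 1L)),
             row.names = NULL)
}

x <- data.frame(user    = c("u1", "u1", "u1", "u2", "u2"),
                country = c("BG", "BG", "SK", "SK", "SK"),
                stringsAsFactors = FALSE)
res <- row_count_sorted(x)
print(res)
```

The work is one vectorized comparison per column, so the cost should grow with rows times columns, and the original column types are preserved in the result.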
[R] what to use for sna/graphs?
What do people use for SNA/graph analysis in R?

So far I have been using igraph (it implements the Louvain community detection algorithm as multilevel.community, which is the killer feature for me).
However, igraph is severely lacking in visualization, which I also need.

graphviz & gephi are alleged to be good at visualization, but, apparently, not so for analysis (specifically, community detection).

Also, it appears that there is no way to directly interface R to gephi (apparently I am supposed to save graphs into csv and read them into gephi separately), and the Rgraphviz package must be installed in a quite unorthodox way
(source("http://bioconductor.org/biocLite.R"); biocLite("Rgraphviz"));
and then it is not clear how to turn an IGRAPH graph object into an Ragraph object which Rgraphviz can handle.

So, what/how do people use/recommend?

Thanks!

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
[R] Rgraphviz: how to read a dot file?
The Rgraphviz package index says nothing about reading dot files
(it has toFile to write them but no fromFile).

How do I create an Ragraph object?
(either by reading a dot file or from a list of edges with weights and vertices with names and other attributes).

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
Re: [R] a merge() problem
* Prof Brian Ripley <evc...@fgngf.bk.np.hx> [2012-10-08 06:37:07 +0100]:
>
> On 08/10/2012 02:57, Peter Ehlers wrote:
>> On 2012-10-07 14:44, Sam Steingold wrote:
>>>> * Peter Ehlers <ruy...@hpnytnel.pn> [2012-10-07 10:03:42 -0700]:
>>>>
>>>> On 2012-10-07 08:34, Sam Steingold wrote:
>>>>> I know it does not look very good - using the same column names to
>>>>> mean different things in different data frames, but here you go:
>>>>> --8<---cut here---start--->8---
>>>>> > x <- data.frame(a=c(1,2,3),b=c(4,5,6))
>>>>> > y <- data.frame(b=c(1,2),a=c("a","b"))
>>>>> > merge(x,y,by.x="a",by.y="b",all.x=TRUE,suffixes=c("","y"))
>>>>>   a b    a
>>>>> 1 1 4    a
>>>>> 2 2 5    b
>>>>> 3 3 6 <NA>
>>>>> Warning message:
>>>>> In merge.data.frame(x, y, by.x = "a", by.y = "b", all.x = TRUE) :
>>>>>   column name 'a' is duplicated in the result
>>>>> --8<---cut here---end--->8---
>>>>> why is the suffixes argument ignored?
>>>>> I mean, I expected that the second a to be a.y.
>>>>
>>>> The 'suffixes' argument refers to _non-by_ names only (as per ?merge).
>>>
>>> yes, but "a" in y is _not_ a by-name.
>>
>> Yes, it is. The set of by-names is the union of names specified
>> by by.x and by.y, in your case: c("a", "b").
>>
>> I suppose that a case could be made that ?merge does not spell that
>> out sufficiently explicitly.
>
> It does in 'Details' (and where else would there be such a detail?)
> E.g. in R 2.15.1:
>
>   If the remaining columns in the data frames have any common names,
>   these have ‘suffixes’ (‘.x’ and ‘.y’ by default) appended to
>   try to make the names of the result unique. If this is not
>   possible, an error is thrown.
>
> Note *remaining*, and read what comes before that.

I read the docs and re-read them after seeing your message and, with all due respect, I fail to interpret them the way you do:
The doc speaks about columns to merge on, not column names.
I specify both by.x and by.y, thus I do not specify the column y$b.

Note, however, that I do not want the doc fixed, I want the behavior modified.
I see no advantage in the current behavior (a warning + duplicate column names) as opposed to the behavior I expected (renaming the column in the result to b.y).

Thanks a lot for your kind replies and insight!

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
[R] a merge() problem
I know it does not look very good - using the same column names to mean different things in different data frames, but here you go:

--8<---cut here---start--->8---
> x <- data.frame(a=c(1,2,3),b=c(4,5,6))
> y <- data.frame(b=c(1,2),a=c("a","b"))
> merge(x,y,by.x="a",by.y="b",all.x=TRUE,suffixes=c("","y"))
  a b    a
1 1 4    a
2 2 5    b
3 3 6 <NA>
Warning message:
In merge.data.frame(x, y, by.x = "a", by.y = "b", all.x = TRUE) :
  column name 'a' is duplicated in the result
--8<---cut here---end--->8---

why is the suffixes argument ignored?
I mean, I expected that the second a to be a.y.
(when I omit suffixes, the result is the same).

Thanks.

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
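A possible workaround, sketched here rather than taken from the thread: since `suffixes` is only applied to clashes among the non-key columns, the clashing column of `y` can be renamed by hand before merging. The new name `a.y` and the copy `y2` are choices made for this example.

```r
x <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6))
y <- data.frame(b = c(1, 2), a = c("a", "b"), stringsAsFactors = FALSE)

# Rename the non-key column of y so it cannot collide with the key x$a.
y2 <- y
names(y2)[names(y2) == "a"] <- "a.y"

# Merge x$a against y2$b; rows of x without a match get NA in a.y.
res <- merge(x, y2, by.x = "a", by.y = "b", all.x = TRUE)
print(res)
```

The result has columns `a`, `b` and `a.y`, with no duplicated names.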
Re: [R] a merge() problem
* Peter Ehlers <ruy...@hpnytnel.pn> [2012-10-07 10:03:42 -0700]:
>
> On 2012-10-07 08:34, Sam Steingold wrote:
>> I know it does not look very good - using the same column names to
>> mean different things in different data frames, but here you go:
>> --8<---cut here---start--->8---
>> > x <- data.frame(a=c(1,2,3),b=c(4,5,6))
>> > y <- data.frame(b=c(1,2),a=c("a","b"))
>> > merge(x,y,by.x="a",by.y="b",all.x=TRUE,suffixes=c("","y"))
>>   a b    a
>> 1 1 4    a
>> 2 2 5    b
>> 3 3 6 <NA>
>> Warning message:
>> In merge.data.frame(x, y, by.x = "a", by.y = "b", all.x = TRUE) :
>>   column name 'a' is duplicated in the result
>> --8<---cut here---end--->8---
>> why is the suffixes argument ignored?
>> I mean, I expected that the second a to be a.y.
>
> The 'suffixes' argument refers to _non-by_ names only (as per ?merge).

yes, but "a" in y is _not_ a by-name.

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
[R] max summary contradict each other
why does summary report max 27600 and not 27603?

> x <- c(27603, 1)
> max(x)
[1] 27603
> summary(x)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      1    6902   13800   13800   20700   27600

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
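A likely explanation, offered as an assumption about the R version in use (2.15) rather than something stated in the thread: `summary()` on a numeric vector has a `digits` argument that defaults to 4 significant digits, so 27603 is shown as 27600, while `max()` and `quantile()` are not rounded. Exactly where the rounding happens (in the values or only in the printout) differs between R versions, so the sketch below checks only the parts that should hold either way.

```r
x <- c(27603, 1)

# Rounding 27603 to 4 significant digits reproduces the reported value.
rounded <- signif(max(x), 4)
print(rounded)

# quantile() returns the unrounded statistics.
exact <- quantile(x)
print(exact)

# Asking summary() for more significant digits keeps the exact maximum.
s <- summary(x, digits = 10)
print(s)
```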
Re: [R] aggregate help
Thanks.
Why does

> aggregate(z, list(id=z$id),FUN=list)
  id         id      a1      a2
1 10 10, 10, 10 a, a, b x, x, z
2 20     20, 20    b, b    y, y
3 30         30       c       z

work, but

> aggregate(z, list(id=z$id),FUN=function(l) {
    t <- sort(table(l),decreasing=TRUE)
    list(length(t),t[1],names(t)[1],t[2],names(t)[2])
  })
  id id a1 a2
1 10  1  2  2
2 20  1  1  1
3 30  1  1  1
Warning message:
In format.data.frame(x, digits = digits, na.encode = FALSE) :
  corrupt data frame: columns will be truncated or padded with NAs

does not?
(I do not want to put the whole list of all possible values into the return value of aggregate because I am afraid of running out of ram)

> * arun <fznegcvax...@lnubb.pbz> [2012-09-20 14:24:37 -0700]:
>
> Hi,
> Try this:
> z1<-aggregate(z,list(id=z$id),FUN=paste,sep=",")
> dat1<-data.frame(id=z1[,1],a1.total=unlist(lapply(z1[,3],length)),a1.val1=unique(z$a1),a1.num=unlist(lapply(lapply(z1[,3],table),`[`,1)),a1.val2=unlist(lapply(z1[,3],`[`,3)),a1.num2=unlist(lapply(lapply(z1[,3],table),`[`,2)),a2.total=unlist(lapply(z1[,4],length)),a2.val1=unique(z$a2),a2.num=unlist(lapply(lapply(z1[,4],table),`[`,1)),a2.val2=unlist(lapply(z1[,4],`[`,3)),a2.num2=unlist(lapply(lapply(z1[,4],table),`[`,2)))
> dat1
> #  id a1.total a1.val1 a1.num a1.val2 a1.num2 a2.total a2.val1 a2.num a2.val2
> #0 10        3       a      2       b       1        3       x      2       z
> #1 20        2       b      2      NA      NA        2       y      2      NA
> #2 30        1       c      1      NA      NA        1       z      1      NA
> #  a2.num2
> #0       1
> #1      NA
> #2      NA
> #It is not an elegant way!
> A.K.
>
> ----- Original Message -----
> From: Sam Steingold <s...@gnu.org>
> To: r-help@r-project.org
> Cc:
> Sent: Thursday, September 20, 2012 2:06 PM
> Subject: [R] aggregate help
>
> I want to count attributes of IDs:
> z <- data.frame(id=c(10,20,10,30,10,20),
>                 a1=c("a","b","a","c","b","b"),
>                 a2=c("x","y","x","z","z","y"),
>                 stringsAsFactors=FALSE)
> z
>   id a1 a2
> 1 10  a  x
> 2 20  b  y
> 3 10  a  x
> 4 30  c  z
> 5 10  b  z
> 6 20  b  y
> I want to get something like
> id a1.tot a1.val1 a1.num1 a1.val2 a1.num2 a2.tot a2.val1 a2.num1 a2.val2 a2.num2
> 10      3       a       2       b       1      3       x       2       z       1
> 20      2       b       2      NA       0      2       y       2      NA       0
> 30      1       c       1      NA       0      1       z       1      NA       0
> (except that I don't care what appears in the cells marked with NA)
> I tried this:
> aggregate(z,by=list(id=z$id),function (s) {
>   t <- sort(table(s),decreasing=TRUE)
>   if (length(t) == 1) list(length(s),names(t)[1],t[1],"junk",0)
>   else list(length(s),names(t)[1],t[1],names(t)[2],t[2])
> })
>   id id a1 a2
> 1 10  3  3  3
> 2 20  2  2  2
> 3 30  1  1  1
> Warning message:
> In format.data.frame(x, digits = digits, na.encode = FALSE) :
>   corrupt data frame: columns will be truncated or padded with NAs
> Thanks!

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
[R] aggregate help
I want to count attributes of IDs:

--8<---cut here---start--->8---
z <- data.frame(id=c(10,20,10,30,10,20),
                a1=c("a","b","a","c","b","b"),
                a2=c("x","y","x","z","z","y"),
                stringsAsFactors=FALSE)
> z
  id a1 a2
1 10  a  x
2 20  b  y
3 10  a  x
4 30  c  z
5 10  b  z
6 20  b  y
--8<---cut here---end--->8---

I want to get something like

--8<---cut here---start--->8---
id a1.tot a1.val1 a1.num1 a1.val2 a1.num2 a2.tot a2.val1 a2.num1 a2.val2 a2.num2
10      3       a       2       b       1      3       x       2       z       1
20      2       b       2      NA       0      2       y       2      NA       0
30      1       c       1      NA       0      1       z       1      NA       0
--8<---cut here---end--->8---

(except that I don't care what appears in the cells marked with NA)

I tried this:

--8<---cut here---start--->8---
> aggregate(z,by=list(id=z$id),function (s) {
    t <- sort(table(s),decreasing=TRUE)
    if (length(t) == 1) list(length(s),names(t)[1],t[1],"junk",0)
    else list(length(s),names(t)[1],t[1],names(t)[2],t[2])
  })
  id id a1 a2
1 10  3  3  3
2 20  2  2  2
3 30  1  1  1
Warning message:
In format.data.frame(x, digits = digits, na.encode = FALSE) :
  corrupt data frame: columns will be truncated or padded with NAs
--8<---cut here---end--->8---

Thanks!

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
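One possible way to get the desired layout without `aggregate()`, sketched with base R only: split the data by id, summarise each attribute with a small helper that returns a one-row data frame, and bind the rows. The helper names `top2` and `summarise.id` are invented for this example, and nothing here has been tried at the data sizes mentioned elsewhere in these threads.

```r
z <- data.frame(id = c(10, 20, 10, 30, 10, 20),
                a1 = c("a", "b", "a", "c", "b", "b"),
                a2 = c("x", "y", "x", "z", "z", "y"),
                stringsAsFactors = FALSE)

# Summarise one attribute: total count plus the two most frequent values.
top2 <- function(v) {
  tab <- sort(table(v), decreasing = TRUE)
  data.frame(tot  = length(v),
             val1 = names(tab)[1],
             num1 = as.integer(tab[1]),
             val2 = names(tab)[2],   # NA if there is only one distinct value
             num2 = if (length(tab) > 1) as.integer(tab[2]) else 0L,
             stringsAsFactors = FALSE)
}

# Build one output row for the rows belonging to a single id.
summarise.id <- function(d) {
  parts <- lapply(c("a1", "a2"), function(col) {
    p <- top2(d[[col]])
    names(p) <- paste(col, names(p), sep = ".")
    p
  })
  cbind(id = d$id[1], do.call(cbind, parts))
}

result <- do.call(rbind, lapply(split(z, z$id), summarise.id))
rownames(result) <- NULL
print(result)
```

Because each group is reduced to one small row before binding, the full list of values per id is never kept in the result.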
[R] drop zero slots from table?
I find myself doing

--8<---cut here---start--->8---
tab <- table(...)
tab <- tab[tab > 0]
tab <- sort(tab,decreasing=TRUE)
--8<---cut here---end--->8---

all the time.
I am wondering if the "drop 0" (and maybe even sort?) can be effected by some magic argument to table() which I fail to discover in the docs?
Obviously, I could use droplevels() to avoid 0 counts in the first place, but I do not want to drop the levels in the data.

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
Re: [R] drop zero slots from table?
Function

--8<---cut here---start--->8---
sorted.table <- function (vec) {
  tab <- table(vec)
  tab <- tab[tab > 0]
  sort(tab, decreasing=TRUE)
}
--8<---cut here---end--->8---

does what I want but it prints "vec" instead of the name of its argument:

--8<---cut here---start--->8---
> sorted.table(foo$bar)
vec
 A  B
10  3
--8<---cut here---end--->8---

how do I pass all arguments of sorted.table() on to table() as is?
thanks!

> * Sam Steingold <f...@tah.bet> [2012-09-19 11:51:08 -0400]:
>
> I find myself doing
> tab <- table(...)
> tab <- tab[tab > 0]
> tab <- sort(tab,decreasing=TRUE)
> all the time.
> I am wondering if the "drop 0" (and maybe even sort?) can be effected by
> some magic argument to table() which I fail to discover in the docs?
> Obviously, I could use droplevels() to avoid 0 counts in the first
> place, but I do not want to drop the levels in the data.

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
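A minimal sketch of one way to get the caller's expression printed instead of "vec": capture it with `deparse(substitute(x))` and hand it to `table()` through its `dnn` argument. This is only one option; the sample data frame `foo` is invented for the example.

```r
sorted.table <- function(x, name = deparse(substitute(x))[1]) {
  tab <- table(x, dnn = name)   # dnn sets the header that print() shows
  tab <- tab[tab > 0]           # drop levels with zero counts
  sort(tab, decreasing = TRUE)
}

# Hypothetical data: level "C" is declared but never observed.
foo <- data.frame(bar = factor(c("A", "B", "A", "A"),
                               levels = c("A", "B", "C")))
res <- sorted.table(foo$bar)
print(res)
```

The factor levels in `foo$bar` itself are left untouched; only the table is trimmed.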
Re: [R] drop zero slots from table?
cool, thanks!

Still, I wonder if there is a way to pass all args as is from a function downward (like in a lisp macro); something like

sorted.table <- function (...) { tab <- table(...); ... }

> * William Dunlap <jqha...@gvopb.pbz> [2012-09-19 16:26:08 +]:
>
> Here is one way:
>
> > sorted.table <- function(x, name = if (is.list(x)) names(x) else deparse(substitute(x))[1]) {
> +    tab <- table(x)
> +    names(dimnames(tab)) <- name
> +    tab <- tab[tab > 0]
> +    sort(tab, decreasing=TRUE)
> + }
> > digits <- factor(trunc(100*log2(1:20)%%.1), levels=0:9)
> > sorted.table(digits)
> digits
> 0 8 2 6 4 5
> 9 4 3 2 1 1
> > sorted.table(data.frame(DigitsColumn=digits))
> DigitsColumn
> 0 8 2 6 4 5
> 9 4 3 2 1 1
> > sorted.table(digits, name="My Digits")
> My Digits
> 0 8 2 6 4 5
> 9 4 3 2 1 1
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>
>> -----Original Message-----
>> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Sam Steingold
>> Sent: Wednesday, September 19, 2012 9:13 AM
>> To: r-help@r-project.org
>> Subject: Re: [R] drop zero slots from table?
>>
>> Function
>> --8<---cut here---start--->8---
>> sorted.table <- function (vec) {
>>   tab <- table(vec)
>>   tab <- tab[tab > 0]
>>   sort(tab, decreasing=TRUE)
>> }
>> --8<---cut here---end--->8---
>> does what I want but it prints "vec" instead of the name of its
>> argument:
>> --8<---cut here---start--->8---
>> > sorted.table(foo$bar)
>> vec
>>  A  B
>> 10  3
>> --8<---cut here---end--->8---
>> how do I pass all arguments of sorted.table() on to table() as is?
>> thanks!
>>
>> > * Sam Steingold <f...@tah.bet> [2012-09-19 11:51:08 -0400]:
>> >
>> > I find myself doing
>> > tab <- table(...)
>> > tab <- tab[tab > 0]
>> > tab <- sort(tab,decreasing=TRUE)
>> > all the time.
>> > I am wondering if the "drop 0" (and maybe even sort?) can be effected by
>> > some magic argument to table() which I fail to discover in the docs?
>> > Obviously, I could use droplevels() to avoid 0 counts in the first
>> > place, but I do not want to drop the levels in the data.

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
[R] where are these NAs coming from?
I see this:

--8<---cut here---start--->8---
> length(which(is.na(z$language)))
[1] 0
> locals <- z[z$country == mycountry,]
> length(which(is.na(locals$language)))
[1] 229
--8<---cut here---end--->8---

where are those locals without the language coming from?!

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
Re: [R] where are these NAs coming from?
Thanks, Sarah, your answer is, indeed, revealing:

--8<---cut here---start--->8---
> z <- data.frame(a=c(1,2,3),b=c(5,6,NA))
> z
  a  b
1 1  5
2 2  6
3 3 NA
> z[z$b==6,]
    a  b
2   2  6
NA NA NA
--8<---cut here---end--->8---

why do I get an extra all NA row?

> * Sarah Goslee <fnenu.tbf...@tznvy.pbz> [2012-09-19 13:54:56 -0400]:
>
> Well, you have no reproducible example, but I suspect either of these
> will fix it:
>
> locals <- z[z$country == mycountry & !is.na(z$country),]
>
> locals <- subset(z, country == mycountry)
>
> Sarah
>
> On Wed, Sep 19, 2012 at 1:50 PM, Sam Steingold <s...@gnu.org> wrote:
>> I see this:
>>
>> --8<---cut here---start--->8---
>>> length(which(is.na(z$language)))
>> [1] 0
>>> locals <- z[z$country == mycountry,]
>>> length(which(is.na(locals$language)))
>> [1] 229
>> --8<---cut here---end--->8---
>>
>> where are those locals without the language coming from?!

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
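The extra row comes from the comparison itself: `z$b == 6` evaluates to NA where `z$b` is NA, and an NA in a logical row index selects an all-NA row. A short sketch on the same toy data, with three common ways to avoid it:

```r
z <- data.frame(a = c(1, 2, 3), b = c(5, 6, NA))

idx <- z$b == 6          # FALSE TRUE NA
bad <- z[idx, ]          # the NA index produces an all-NA row

ok1 <- z[which(idx), ]                 # which() keeps only the TRUE positions
ok2 <- z[!is.na(z$b) & z$b == 6, ]     # make the NA case explicitly FALSE
ok3 <- subset(z, b == 6)               # subset() treats NA as FALSE

print(bad)
print(ok1)
```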
Re: [R] drop zero slots from table?
* William Dunlap <jqha...@gvopb.pbz> [2012-09-19 18:20:50 +]:
>
> Why don't you try that and tell us if it works?

Because in my wildest dreams it did not occur to me that this could be valid code in any programming language.
It appears to be valid R, which seems to be out-perling Perl at every turn.
However, it does not do what I want: it does not result in the right name for the returned table.

Thanks a lot for your insight!

>> -----Original Message-----
>> From: Sam Steingold [mailto:sam.steing...@gmail.com] On Behalf Of Sam Steingold
>> Sent: Wednesday, September 19, 2012 10:48 AM
>> To: r-help@r-project.org; William Dunlap
>> Subject: Re: drop zero slots from table?
>>
>> cool, thanks!
>> Still, I wonder if there is a way to pass all args as is from a function
>> downward (like in a lisp macro); something like
>>
>> sorted.table <- function (...) { tab <- table(...); ... }
>>
>> > * William Dunlap <jqha...@gvopb.pbz> [2012-09-19 16:26:08 +]:
>> >
>> > Here is one way:
>> >
>> > > sorted.table <- function(x, name = if (is.list(x)) names(x) else deparse(substitute(x))[1]) {
>> > +    tab <- table(x)
>> > +    names(dimnames(tab)) <- name
>> > +    tab <- tab[tab > 0]
>> > +    sort(tab, decreasing=TRUE)
>> > + }
>> > > digits <- factor(trunc(100*log2(1:20)%%.1), levels=0:9)
>> > > sorted.table(digits)
>> > digits
>> > 0 8 2 6 4 5
>> > 9 4 3 2 1 1
>> > > sorted.table(data.frame(DigitsColumn=digits))
>> > DigitsColumn
>> > 0 8 2 6 4 5
>> > 9 4 3 2 1 1
>> > > sorted.table(digits, name="My Digits")
>> > My Digits
>> > 0 8 2 6 4 5
>> > 9 4 3 2 1 1
>> >
>> > Bill Dunlap
>> > Spotfire, TIBCO Software
>> > wdunlap tibco.com
>> >
>> >> -----Original Message-----
>> >> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Sam Steingold
>> >> Sent: Wednesday, September 19, 2012 9:13 AM
>> >> To: r-help@r-project.org
>> >> Subject: Re: [R] drop zero slots from table?
>> >>
>> >> Function
>> >> --8<---cut here---start--->8---
>> >> sorted.table <- function (vec) {
>> >>   tab <- table(vec)
>> >>   tab <- tab[tab > 0]
>> >>   sort(tab, decreasing=TRUE)
>> >> }
>> >> --8<---cut here---end--->8---
>> >> does what I want but it prints "vec" instead of the name of its
>> >> argument:
>> >> --8<---cut here---start--->8---
>> >> > sorted.table(foo$bar)
>> >> vec
>> >>  A  B
>> >> 10  3
>> >> --8<---cut here---end--->8---
>> >> how do I pass all arguments of sorted.table() on to table() as is?
>> >> thanks!
>> >>
>> >> > * Sam Steingold <f...@tah.bet> [2012-09-19 11:51:08 -0400]:
>> >> >
>> >> > I find myself doing
>> >> > tab <- table(...)
>> >> > tab <- tab[tab > 0]
>> >> > tab <- sort(tab,decreasing=TRUE)
>> >> > all the time.
>> >> > I am wondering if the "drop 0" (and maybe even sort?) can be effected by
>> >> > some magic argument to table() which I fail to discover in the docs?
>> >> > Obviously, I could use droplevels() to avoid 0 counts in the first
>> >> > place, but I do not want to drop the levels in the data.

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
Re: [R] where are these NAs coming from?
* jim holtman <wubyg...@tznvy.pbz> [2012-09-19 13:58:08 -0400]:
>
> At least provide a reproducible example by creating the problem with a
> subset of 'z' and 'mycountry'

if I knew how to reproduce the problem, I would have known what was going on.

> Could something like this be happening?

precisely, thanks!

>> x <- data.frame(country = 1:5, language = 1:5)
>> mycountry <- NA
>> z <- x[x$country == mycountry,]
>> z
>      country language
> NA        NA       NA
> NA.1      NA       NA
> NA.2      NA       NA
> NA.3      NA       NA
> NA.4      NA       NA
>
> On Wed, Sep 19, 2012 at 1:50 PM, Sam Steingold <s...@gnu.org> wrote:
>> I see this:
>>
>> --8<---cut here---start--->8---
>>> length(which(is.na(z$language)))
>> [1] 0
>>> locals <- z[z$country == mycountry,]
>>> length(which(is.na(locals$language)))
>> [1] 229
>> --8<---cut here---end--->8---
>>
>> where are those locals without the language coming from?!

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
[R] multi-column factor
I have a data frame with columns which draw on the same underlying universe, so I want them to be factors with the same level set:

--8<---cut here---start--->8---
> z <- data.frame(a=c("a","b","c"),b=c("b","c","d"),stringsAsFactors=FALSE)
> str(z)
'data.frame':	3 obs. of  2 variables:
 $ a: chr  "a" "b" "c"
 $ b: chr  "b" "c" "d"
> z$a <- factor(z$a,levels=union(z$a,z$b))
> z$b <- factor(z$b,levels=union(z$a,z$b))
> str(z)
'data.frame':	3 obs. of  2 variables:
 $ a: Factor w/ 4 levels "a","b","c","d": 1 2 3
 $ b: Factor w/ 4 levels "a","b","c","d": 2 3 4
--8<---cut here---end--->8---

is factor(z$a,levels=union(z$a,z$b)) the right way to handle this?
maybe there is a better way to extract levels than union()?
(bear in mind that I have ~10M rows and ~1M levels, so performance is an issue).

Thanks!

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
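A small variation on the code above, offered as a sketch: compute the shared level set once with `unique()` over all the columns involved, then apply it to each column with `lapply()`. The vector `cols` is introduced for this example; whether this is noticeably faster at 10M rows has not been measured here.

```r
z <- data.frame(a = c("a", "b", "c"), b = c("b", "c", "d"),
                stringsAsFactors = FALSE)

# Columns that share one universe of values (names chosen for this example).
cols <- c("a", "b")

# Compute the common level set once, in order of first appearance.
lev <- unique(unlist(z[cols], use.names = FALSE))

# Convert every listed column using the same levels.
z[cols] <- lapply(z[cols], factor, levels = lev)
str(z)
```

This scales to more than two columns without repeated `union()` calls.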
[R] sum(table(v)) == length(v)
Is it possible to violate the identity sum(table(v)) == length(v) ??

I see no way to do that and it holds in my small examples, but it is violated in the huge set I have:

system.time(z <- unique(data.frame(u=U,s=S)))
tab1 <- table(z$u)
tab1 <- tab1[tab1>0] # S is factor so some counts were 0
tab2 <- table(z$s)
stopifnot(length(tab2) == nrow(z)) # yes
stopifnot(sum(tab1) == nrow(z)) ### no!
sum(tab1)
728587
length(tab1)
503374
length(tab2)
2112951

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
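One way the identity can fail, shown as a sketch (whether this is what happened in the large data set above is an assumption): by default `table()` does not count NA values, so any NA in `v` makes the sum fall short of `length(v)`.

```r
v <- c("a", "b", NA, "a")

# By default NA is not tabulated, so the total is one short.
default.total <- sum(table(v))

# With useNA = "ifany" the NA values get their own cell.
full.total <- sum(table(v, useNA = "ifany"))

print(c(default = default.total, with.na = full.total, length = length(v)))
```

Checking `sum(is.na(v))` on the real data would show whether missing values account for the gap.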
[R] please comment on my function
this function is supposed to canonicalize the language:

--8<---cut here---start--->8---
canonicalize.language <- function (s) {
  s <- tolower(s)
  long <- nchar(s) == 5
  s[long] <- sub("^([a-z]{2})[-_][a-z]{2}$","\\1",s[long])
  s[nchar(s) != 2 & s != "c"] <- "unknown"
  s
}
> canonicalize.language(c("aa","bb-cc","DD-abc","eee","ff_FF","C"))
[1] "aa"      "bb"      "unknown" "unknown" "ff"      "c"
--8<---cut here---end--->8---

it does what I want it to do, but it takes 4.5 seconds on a vector of length 10,256,341 - I wonder if I might be doing something awfully stupid.

I thought that sub() was slow, but my second attempt:

--8<---cut here---start--->8---
canonicalize.language <- function (s) {
  s <- tolower(s)
  good <- nchar(s) == 5 & substr(s,3,3) %in% c("_","-")
  s[good] <- substr(s[good],1,2)
  s[nchar(s) != 2 & s != "c"] <- "unknown"
  s
}
--8<---cut here---end--->8---

was even slower (6.4 sec).

My two concerns are:
1. avoid allocating many small objects which are never collected
2. run fast

Which would be the best implementation?

Thanks a lot for your insight!

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
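One idea that might help when the vector holds far fewer distinct language codes than elements (an assumption about the data, not something stated above): run the string operations on the unique values only and map the results back with `match()`. The wrapper name `canonicalize.unique` is made up for this sketch, and no timing is claimed.

```r
canonicalize.language <- function(s) {
  s <- tolower(s)
  long <- nchar(s) == 5
  s[long] <- sub("^([a-z]{2})[-_][a-z]{2}$", "\\1", s[long])
  s[nchar(s) != 2 & s != "c"] <- "unknown"
  s
}

# Canonicalize each distinct input once, then expand back to full length.
canonicalize.unique <- function(s) {
  u <- unique(s)
  canonicalize.language(u)[match(s, u)]
}

input <- c("aa", "bb-cc", "DD-abc", "eee", "ff_FF", "C", "aa", "ff_FF")
print(canonicalize.unique(input))
```

The cost of `unique()` and `match()` is hashing, so the gain depends on how repetitive the input is.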
Re: [R] please comment on my function
* jim holtman <wubyg...@tznvy.pbz> [2012-09-14 13:10:37 -0400]:
>
> more than half the time is in 'tolower' and 'nchar', so it is not all
> 'sub's problem.

aha, thanks!

> This version runs a little faster since it does
> not need the 'tolower':
>
> canonicalize.language <- function (s) {
>   # s <- tolower(s)
>   long <- nchar(s) == 5
>   s[long] <- sub("^([[:alpha:]]{2})[-_][[:alpha:]]{2}$","\\1",s[long])
>   s[nchar(s) != 2 & s != "c"] <- "unknown"
>   s
> }

but it does not convert "EN" to "en", so it is not good for my purposes.

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
[R] aggregate() runs out of memory
I have a large data.frame Z (2,424,185,944 bytes, 10,256,441 rows, 17 columns).
I want to get the result of

table(aggregate(Z$V1, FUN = length, by = list(id=Z$V2))$x)

alas, aggregate has been running for ~30 minutes, RSS is 14G, VIRT is 24.3G, and no end in sight.
both V1 and V2 are characters (not factors).
Is there anything I could do to speed this up?

Thanks.

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
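Since the aggregate call above only counts rows per value of V2, the same tabulation can probably be obtained with two nested `table()` calls, which avoid building per-group pieces. The sketch below checks the equivalence on a tiny invented data frame; it has not been timed on data of the size described.

```r
# Hypothetical small stand-in for Z (values invented for illustration).
Z <- data.frame(V1 = c("p", "q", "r", "s", "t", "u", "v"),
                V2 = c("i1", "i2", "i1", "i3", "i1", "i2", "i4"),
                stringsAsFactors = FALSE)

# The original formulation: group sizes via aggregate(), then tabulate them.
slow <- table(aggregate(Z$V1, FUN = length, by = list(id = Z$V2))$x)

# Equivalent here: table(Z$V2) gives rows per id, the outer table() counts
# how many ids have each group size.
fast <- table(table(Z$V2))

print(fast)
```

Both variants skip rows whose V2 is NA, so they should agree on real data as well.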
[R] cannot read iso639 table
line 109 did not have 5 elements ... but it did!
empty beginning of file ... but it's not!

details:

--8<---cut here---start--->8---
> get.language.ISO.table <- function () {
    socket <- url("http://www.loc.gov/standards/iso639-2/ISO-639-2_utf-8.txt",
                  open="r",encoding="utf-8");
    data <- read.table(socket, as.is = TRUE, sep = "|", header = FALSE,
                       col.names = c("a3bibliographic","a3terminologic",
                                     "a2","english","french"));
    close(socket);
    data
  }
> language.ISO.table <- get.language.ISO.table()
Error in read.table(socket, as.is = TRUE, sep = "|", header = FALSE, col.names = c("a3bibliographic",  :
  empty beginning of file
--8<---cut here---end--->8---

the first line is _not_ blank, as one can see by downloading the file with wget

In addition:

--8<---cut here---start--->8---
Warning messages:
1: In read.table(socket, as.is = TRUE, sep = "|", header = FALSE, col.names = c("a3bibliographic",  :
  invalid input found on input connection 'http://www.loc.gov/standards/iso639-2/ISO-639-2_utf-8.txt'
--8<---cut here---end--->8---

what is invalid there? libreoffice calc opened the file just fine.

--8<---cut here---start--->8---
2: In read.table(socket, as.is = TRUE, sep = "|", header = FALSE, col.names = c("a3bibliographic",  :
  incomplete final line found by readTableHeader on 'http://www.loc.gov/standards/iso639-2/ISO-639-2_utf-8.txt'
--8<---cut here---end--->8---

indeed the final NL is missing. why is this a big deal?

when I download the file:

--8<---cut here---start--->8---
> read.table("ISO-639-2_utf-8.csv",encoding="utf-8", as.is = TRUE, sep = "|",
             header = FALSE, col.names = c("a3bibliographic","a3terminologic",
                                           "a2","english","french"))
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
  line 109 did not have 5 elements
--8<---cut here---end--->8---

however

--8<---cut here---start--->8---
> l <- readLines("ISO-639-2_utf-8.csv",encoding="utf-8")
Warning message:
In readLines("ISO-639-2_utf-8.csv", encoding = "utf-8") :
  incomplete final line found on 'ISO-639-2_utf-8.csv'
> l[108:110]
[1] "dgr|||Dogrib|dogrib"
[2] "din|||Dinka|dinka"
[3] "div||dv|Divehi; Dhivehi; Maldivian|maldivien"
--8<---cut here---end--->8---

all lines look legit to me.
so, why can't I read the file?
thanks.

ps. ubuntu; R 2.15.1 (2012-06-22) installed from cran using aptitude.

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
Re: [R] cannot read iso639 table
* William Dunlap jqha...@gvopb.pbz [2012-09-13 19:50:21 +]:

> On Windows with R-2.15.1 in a 1252 locale, I had to read (and toss)
> out the initial 3 bytes (the byte-order mark?) to make things work:
>
>     socket <- url("http://www.loc.gov/standards/iso639-2/ISO-639-2_utf-8.txt",open="r",encoding="utf-8")
>     readChar(socket, nchars=3, useBytes=TRUE)
>     [1] ""

confirmed - first 3 bytes are \357\273\277

>     d <- read.table(socket, quote="", sep="|", stringsAsFactors=FALSE)
>     dim(d)
>     [1] 485 5
>     head(d)
>        V1 V2 V3             V4      V5
>     1 aar    aa           Afar    afar
>     2 abk    ab      Abkhazian abkhaze
>     3 ace             Achinese    aceh
>     4 ach                Acoli   acoli
>     5 ada              Adangme adangme
>     6 ady       Adyghe; Adygei  adyghé

alas, this is all I get:

    Warning message:
    In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
      invalid input found on input connection 'http://www.loc.gov/standards/iso639-2/ISO-639-2_utf-8.txt'
      a3bibliographic a3terminologic a2        english  french
    1             aar             NA aa           Afar    afar
    2             abk             NA ab      Abkhazian abkhaze
    3             ace             NA          Achinese    aceh
    4             ach             NA             Acoli   acoli
    5             ada             NA           Adangme adangme
    6             ady             NA    Adyghe; Adygei   adygh

note that the first non-ASCII character terminates the input. so, I still cannot read the data from the URL.

I can read the file though - with quote="" (thanks Peter!) - except that the first record is "\357\273\277aar".

-- 
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
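For the local-file case, here is a rough sketch that combines the two fixes mentioned in this thread: quote="" so that an apostrophe inside a field is not taken as the start of a quoted string, and skipping the byte-order mark. It assumes R 3.0.0 or later, where connections accept the "UTF-8-BOM" encoding (so it would not have helped on 2.15.1), and it uses a made-up three-record sample instead of the real ISO 639-2 file.

```r
# Hypothetical three-record sample; the second record contains an apostrophe.
records <- c("aar||aa|Afar|afar",
             "xxx|||Speaker's tongue|langue exemple",
             "div||dv|Divehi; Dhivehi; Maldivian|maldivien")

# Write the sample to a temporary file, preceded by the UTF-8 byte-order mark.
path <- tempfile(fileext = ".txt")
bom <- as.raw(c(0xEF, 0xBB, 0xBF))
writeBin(c(bom, charToRaw(paste0(paste(records, collapse = "\n"), "\n"))), path)

# quote = "" disables quote handling; "UTF-8-BOM" drops the leading mark.
iso <- read.table(path, sep = "|", quote = "", header = FALSE, as.is = TRUE,
                  fileEncoding = "UTF-8-BOM",
                  col.names = c("a3bibliographic", "a3terminologic",
                                "a2", "english", "french"))
str(iso)
```

Whether the same arguments cure the URL connection in a given locale is not something this sketch checks.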
Re: [R] merge a list of data frames
* David Winsemius qjvafrz...@pbzpnfg.arg [2012-09-05 21:02:16 -0700]:

> On Sep 5, 2012, at 8:51 PM, Sam Steingold wrote:
>
>> I have a list of data frames:
>>
>>     str(data)
>>     List of 4
>>      $ :'data.frame': 700773 obs. of 3 variables:
>>       ..$ V1: chr [1:700773] "200130446465779" "200070050127778" "200030633708779" "200010587002779" ...
>>       ..$ V2: int [1:700773] 0 0 0 0 0 0 0 0 0 0 ...
>>       ..$ V3: num [1:700773] 1 1 1 1 1 ...
>>      $ :'data.frame': 700773 obs. of 3 variables:
>>       ..$ V1: chr [1:700773] "200130446465779" "200070050127778" "200030633708779" "200010587002779" ...
>>       ..$ V2: int [1:700773] 0 0 0 0 0 0 0 0 0 0 ...
>>       ..$ V3: num [1:700773] 1 1 1 1 1 ...
>>      $ :'data.frame': 700773 obs. of 3 variables:
>>       ..$ V1: chr [1:700773] "200130446465779" "200070050127778" "200030633708779" "200010587002779" ...
>>       ..$ V2: int [1:700773] 0 0 0 0 0 0 0 0 0 0 ...
>>       ..$ V3: num [1:700773] 1 1 1 1 1 ...
>>      $ :'data.frame': 700773 obs. of 3 variables:
>>       ..$ V1: chr [1:700773] "200160325893778" "200130647544079" "200130446465779" "200120186959078" ...
>>       ..$ V2: int [1:700773] 0 0 0 0 0 0 0 0 0 0 ...
>>       ..$ V3: num [1:700773] 1 1 1 1 1 1 1 1 1 1 ...
>>
>> I want to merge them.
>
> Why? What are you expecting?

these are the results of applying a model to the test data.
the first column is the ID
the second column is the actual value
the third column is the model score

after I merge the frames, I will

1. check that all the V2 columns are identical and drop all but one (I guess I could just merge on c("V1","V2") instead, right?)
2. compute the sum (or the mean, whatever is easier) of all the V3 columns
3. sort by the sum/mean of the V3 columns and evaluate the combined model using the lift quality metric (http://dl.acm.org/citation.cfm?id=380995.381018)

I have many more score files (not just 4), so it is not practical for me to rename the column to something unique.
-- 
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
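The three steps listed above can be sketched as follows. This is only an illustration on two tiny made-up frames; the helper name merge_scores and the score.N column names are invented here. Renaming the score column programmatically by list position avoids doing it by hand for many files, and merging on both V1 and V2 keeps a single copy of the actual value.

```r
# Two tiny stand-ins for the score files (hypothetical values).
frames <- list(
  data.frame(V1 = c("id1", "id2", "id3"), V2 = c(0L, 1L, 0L),
             V3 = c(0.2, 0.9, 0.4), stringsAsFactors = FALSE),
  data.frame(V1 = c("id3", "id1", "id2"), V2 = c(0L, 0L, 1L),
             V3 = c(0.6, 0.4, 0.7), stringsAsFactors = FALSE))

# Hypothetical helper: give each V3 a unique name, then fold merge() over the list.
merge_scores <- function(frames) {
  named <- lapply(seq_along(frames), function(i) {
    f <- frames[[i]]
    names(f)[names(f) == "V3"] <- paste0("score.", i)
    f
  })
  Reduce(function(a, b) merge(a, b, by = c("V1", "V2"), all = TRUE), named)
}

merged <- merge_scores(frames)

# Mean score across models, then sort from highest to lowest combined score.
score.cols <- grep("^score\\.", names(merged))
merged$mean.score <- rowMeans(merged[score.cols])
ranked <- merged[order(-merged$mean.score), ]
```

If the V2 columns ever disagreed for some ID, merging on c("V1", "V2") with all = TRUE would show that ID on two rows with NA scores, which doubles as the consistency check in step 1.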
Re: [R] merge a list of data frames
* David Winsemius qjvafrz...@pbzpnfg.arg [2012-09-06 10:30:16 -0700]:

>> these are the results of applying a model to the test data.
>> the first column is the ID
>
> In which case you should be using the 'by' argument to `merge`

I already do! see my initial message!

>> 3. sort by the sum/mean of the V3 columns and evaluate the combined
>> model using the lift quality metric
>> (http://dl.acm.org/citation.cfm?id=380995.381018)
>
> That's going to require more background (or more money since they want
> $15.00 for a pdf. :-)

that I have already implemented, works just fine:

    proficiency <- function (actual, prediction) {
      proficiency1(ea = entropy(table(actual)),
                   ep = entropy(table(prediction)),
                   ej = entropy(table(actual,prediction)))
    }
    proficiency1 <- function (ea, ep, ej) {
      mi <- ea + ep - ej
      list(joint = ej, actual = ea, prediction = ep, mutual = mi,
           proficiency = mi / ea, dependency = mi / ej)
    }
    detector.statistics <- function (tp,fn,fp,tn) {
      observationCount <- tp + fn + fp + tn
      predictedPositive <- tp + fp
      predictedNegative <- fn + tn
      actualPositive <- tp + fn
      actualNegative <- fp + tn
      correct <- tp + tn
      list(baseRate = actualPositive / observationCount,
           precision = if (tp == 0) 0 else tp / predictedPositive,
           specificity = if (tn == 0) 0 else tn / actualNegative,
           recall = if (tp == 0) 0 else tp / actualPositive,
           accuracy = correct / observationCount,
           lift = (tp * observationCount) / (predictedPositive * actualPositive),
           f1score = if (tp == 0) 0 else 2 * tp / (2 * tp + fp + fn),
           proficiency = proficiency1(ej = entropy(c(tp,fn,fp,tn)),
             ea = entropy(c(actualPositive,actualNegative)),
             ep = entropy(c(predictedPositive,predictedNegative))))
    }
    ## v should be a vector of 0s and 1s sorted according to some model
    ## Gregory Piatetsky-Shapiro, Samuel Steingold
    ## Measuring Lift Quality in Database Marketing
    ## http://sds.podval.org/data/l-quality.pdf
    ## http://www.sigkdd.org/explorations/issues/2-2-2000-12/piatetsky-shapiro.pdf
    ## SIGKDD Explorations, Vol. 2:2, (2000), 81-86
    ## tests: lift.quality(rbinom(1,size=1,prob=0.1)) == ~0
    ##        lift.quality(rev(round((1:1)/12000))) == 1
    lift.quality <- function (v, plot = TRUE, file = NULL, main = "lift curve",
                              thresholds = NULL) {
      target.count <- sum(v)
      total.count <- length(v)
      base.rate <- target.count / total.count
      target.level <- cumsum(v)/target.count
      lq <- ((2*sum(target.level) - 1)/total.count - 1) / (1 - base.rate)
      if (plot) {
        if (!is.null(file)) {
          pdf(file = file)
          on.exit(dev.off())
        }
        plot(x=(1:total.count)/total.count,y=target.level,type="l",
             main=paste(main,"( lift quality",lq,")"),
             xlab="% cutoff",ylab="cumulative % hit")
      }
      if (is.null(thresholds))
        thresholds = c(base.rate)
      list(lift.quality = lq,
           detector.statistics =
           sapply(thresholds, function (l) {
             cutoff <- round(l * total.count)
             tp <- round(target.level[cutoff] * target.count) # = sum(v[1:cutoff])
             fn <- target.count - tp
             fp <- cutoff - tp
             tn <- total.count - target.count - cutoff + tp
             detector.statistics(tp, fn, fp, tn)
           }))
    }

>> I have many more score files (not just 4), so it is not practical for
>> me to rename the column to something unique.
>
> Which column?

the 3rd (score) column.

Meanwhile I realised that the fastest way is actually the shell: sort+cut+paste produced a csv file which can be loaded into R much faster than the individual score files, so this issue is now purely academic.

However, I appreciate the replies I got so far, it was quite educational, thanks! (I also appreciate comments on the code above)

-- 
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
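The code above calls entropy(), which is not shown in the message. As a minimal sketch, assuming it is meant to be the Shannon entropy (in bits) of a vector or table of counts, the proficiency calculation can be checked on two extreme confusion matrices. The entropy() definition below is a guess; proficiency1() is trimmed to the fields used here.

```r
# ASSUMPTION: entropy() is Shannon entropy in bits, computed from raw counts.
entropy <- function(counts) {
  p <- counts[counts > 0] / sum(counts)
  -sum(p * log2(p))
}

# Same arithmetic as proficiency1() in the message, reduced to two fields.
proficiency1 <- function(ea, ep, ej) {
  mi <- ea + ep - ej              # mutual information
  list(mutual = mi, proficiency = mi / ea)
}

# Hypothetical wrapper: confusion-matrix counts -> proficiency.
prof <- function(tp, fn, fp, tn) {
  proficiency1(ej = entropy(c(tp, fn, fp, tn)),
               ea = entropy(c(tp + fn, fp + tn)),
               ep = entropy(c(tp + fp, fn + tn)))$proficiency
}

perfect     <- prof(tp = 10, fn = 0, fp = 0, tn = 90)  # prediction equals actual
independent <- prof(tp = 1,  fn = 9, fp = 9, tn = 81)  # prediction carries no information
```

Under that assumption a perfect detector gets proficiency 1 and a statistically independent one gets 0 up to rounding, which is the range one would expect from mutual information divided by the entropy of the actual labels.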
[R] merge a list of data frames
I have a list of data frames:

    str(data)
    List of 4
     $ :'data.frame': 700773 obs. of 3 variables:
      ..$ V1: chr [1:700773] "200130446465779" "200070050127778" "200030633708779" "200010587002779" ...
      ..$ V2: int [1:700773] 0 0 0 0 0 0 0 0 0 0 ...
      ..$ V3: num [1:700773] 1 1 1 1 1 ...
     $ :'data.frame': 700773 obs. of 3 variables:
      ..$ V1: chr [1:700773] "200130446465779" "200070050127778" "200030633708779" "200010587002779" ...
      ..$ V2: int [1:700773] 0 0 0 0 0 0 0 0 0 0 ...
      ..$ V3: num [1:700773] 1 1 1 1 1 ...
     $ :'data.frame': 700773 obs. of 3 variables:
      ..$ V1: chr [1:700773] "200130446465779" "200070050127778" "200030633708779" "200010587002779" ...
      ..$ V2: int [1:700773] 0 0 0 0 0 0 0 0 0 0 ...
      ..$ V3: num [1:700773] 1 1 1 1 1 ...
     $ :'data.frame': 700773 obs. of 3 variables:
      ..$ V1: chr [1:700773] "200160325893778" "200130647544079" "200130446465779" "200120186959078" ...
      ..$ V2: int [1:700773] 0 0 0 0 0 0 0 0 0 0 ...
      ..$ V3: num [1:700773] 1 1 1 1 1 1 1 1 1 1 ...

I want to merge them. I tried to follow http://rwiki.sciviews.org/doku.php?id=tips%3adata-frames%3amerge and did:

    data.1 <- Reduce(function(f1,f2) merge(f1,f2,by=c("V1"),all=TRUE), data)
    Warning message:
    In merge.data.frame(f1, f2, by = c("V1"), all = TRUE) :
      column names 'V2.x', 'V3.x', 'V2.y', 'V3.y' are duplicated in the result
    str(data.1)
    'data.frame': 700773 obs. of 9 variables:
     $ V1  : chr "10001099079" "10001254078" "10001499078" "10001541779" ...
     $ V2.x: int 0 0 0 0 0 0 0 0 0 0 ...
     $ V3.x: num 0.476 0.748 0.442 0.483 0.577 ...
     $ V2.y: int 0 0 0 0 0 0 0 0 0 0 ...
     $ V3.y: num 0.476 0.748 0.442 0.483 0.577 ...
     $ V2.x: int 0 0 0 0 0 0 0 0 0 0 ...
     $ V3.x: num 0.476 0.752 0.443 0.485 0.578 ...
     $ V2.y: int 0 0 0 0 0 0 0 0 0 0 ...
     $ V3.y: num 0.47 0.733 0.57 0.416 0.616 ...

I don't like the warning and I don't like that I now have to use [n] to access identically named columns, but, I guess, this is better than this:

    library('reshape')
    data.1 <- merge_all(data,by="V1",all=TRUE)
    Error in merge.data.frame(dfs[[1]], Recall(dfs[-1]), all = TRUE, sort = FALSE, :
      formal argument "all" matched by multiple actual arguments
    data.1 <- merge_all(data,by="V1",sort=TRUE,all=TRUE)
    Error in merge.data.frame(dfs[[1]], Recall(dfs[-1]), all = TRUE, sort = FALSE, :
      formal argument "all" matched by multiple actual arguments
    data.1 <- merge_all(data,by="V1",sort=TRUE)
    Error in merge.data.frame(dfs[[1]], Recall(dfs[-1]), all = TRUE, sort = FALSE, :
      formal argument "sort" matched by multiple actual arguments
    data.1 <- merge_all(data,by="V1")
    Error in `[.data.frame`(df, , match(names(dfs[[1]]), names(df))) :
      undefined columns selected
    data.1 <- merge_all(data,by=c("V1"))
    Error in `[.data.frame`(df, , match(names(dfs[[1]]), names(df))) :
      undefined columns selected

what does 'formal argument "sort" matched by multiple actual arguments' mean? thanks.

-- 
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
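On the last question: the error text shows merge.data.frame being called with all = TRUE, sort = FALSE already filled in, so it looks as if the wrapper hard-codes those two arguments and also forwards whatever the caller passes. A small sketch of that pattern, using a made-up wrapper merge_fixed rather than the real merge_all source, reproduces the message:

```r
# Hypothetical wrapper: fixes all= and sort=, and also forwards `...`.
merge_fixed <- function(x, y, ...) merge(x, y, all = TRUE, sort = FALSE, ...)

a <- data.frame(V1 = c("k1", "k2"), s = 1:2, stringsAsFactors = FALSE)
b <- data.frame(V1 = c("k2", "k3"), t = 3:4, stringsAsFactors = FALSE)

# Fine: `by` reaches merge() exactly once.
ok <- merge_fixed(a, b, by = "V1")

# Fails: the inner call becomes merge(x, y, all = TRUE, sort = FALSE, by = "V1", sort = TRUE),
# i.e. the formal argument `sort` is matched by two actual arguments.
err <- tryCatch(merge_fixed(a, b, by = "V1", sort = TRUE),
                error = function(e) conditionMessage(e))
print(err)
```

So the message means that an argument name was supplied twice in one call; with such a wrapper the only way out is not to pass all= or sort= yourself.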
Re: [R] apply -- data.frame
* David Winsemius qjvafrz...@pbzpnfg.arg [2012-08-30 10:14:34 -0700]:

>     str( as.data.frame( do.call(rbind, strsplit(c("a,1","b,2","c,3"), ",") ) , stringsAsFactors=FALSE) )
>     'data.frame': 3 obs. of 2 variables:
>      $ V1: chr "a" "b" "c"
>      $ V2: chr "1" "2" "3"

do.call/rbind appeared to be TRT. I tried it and got a data frame with list columns (instead of vectors):

    as.data.frame(do.call(rbind,lapply(list.files(...), function (name) {
      c(name,list(num1,num2,num3), # num* come from some calculations above
        strsplit(sub("[^-]*(train|test)[^-]*(-(S)?pca([0-9]*))?-s([0-9]*)c([0-9.]*)\\.score",
                     "\\1,\\3,\\4,\\5,\\6",name),",")[[1]])
    })), stringsAsFactors = FALSE)

    'data.frame': 2 obs. of 8 variables:
     $ file        :List of 2
      ..$ : chr "zzz_test_0531_0630-Spca181-s0c10.score"
      ..$ : chr "zzz_train_0531_0630-Spca181-s0c10.score"
     $ lift.quality:List of 2
      ..$ : num 0.59
      ..$ : num 0.621
     $ proficiency :List of 2
      ..$ : num 0.0472
      ..$ : num 0.0472
     $ set         :List of 2
      ..$ : chr "test"
      ..$ : chr "train"
     $ scale       :List of 2
      ..$ : chr "S"
      ..$ : chr "S"
     $ pca         :List of 2
      ..$ : chr "181"
      ..$ : chr "181"
     $ s           :List of 2
      ..$ : chr "0"
      ..$ : chr "0"
     $ c           :List of 2
      ..$ : chr "10"
      ..$ : chr "10"

I guess the easiest way is to replace c(...list()...) with c(...) but that would mean converting num1,num2,num3 to string and back which I want to avoid for aesthetic reasons. Any better suggestions? thanks a lot!

-- 
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
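One way to avoid both the list columns and the number-to-string round trip, sketched here with made-up file names and a stand-in for the num* calculations: have the function return a one-row data.frame, so every field keeps its own type, and then rbind the rows.

```r
# Hypothetical file names standing in for list.files(...).
files <- c("zzz_test-s0c10.score", "zzz_train-s0c10.score")

rows <- lapply(files, function(name) {
  quality <- nchar(name) / 100        # stand-in for a numeric result (num1 etc.)
  set <- sub(".*_(train|test)-.*", "\\1", name)
  # One row per file; character and numeric fields stay what they are.
  data.frame(file = name, lift.quality = quality, set = set,
             stringsAsFactors = FALSE)
})

scores <- do.call(rbind, rows)
str(scores)
```

rbind on data frames is slower than on plain vectors, so this is meant for a modest number of files rather than for millions of rows.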
Re: [R] apply -- data.frame
* William Dunlap jqha...@gvopb.pbz [2012-08-31 18:38:52 +]:

> Is the following something like what you are doing?

yes, absolutely, thanks a lot!

-- 
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
[R] apply -- data.frame
Is there a way for an apply-type function to return a data frame? The closest thing I can think of is

    foo <- as.data.frame(sapply(...))
    names(foo) <- c()

Is there a more elegant way? Thanks!

-- 
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
Re: [R] apply -- data.frame
* Sam Steingold f...@tah.bet [2012-08-30 08:56:17 -0400]:

> Is there a way for an apply-type function to return a data frame?
> the closest thing I think of is
>
>     foo <- as.data.frame(t(sapply(...)))
>     names(foo) <- c()

alas, this has a problem of creating a homogeneous data frame, i.e., all the columns are numbers or characters, because the function passed to sapply returns c() and

    c(1,2,"a")
    [1] "1" "2" "a"

e.g.,

    as.data.frame(t(sapply(c("a,1","b,2","c,3"),function (n) strsplit(n,",")[[1]])))
        V1 V2
    a,1  a  1
    b,2  b  2
    c,3  c  3
    'data.frame': 3 obs. of 2 variables:
     $ V1: Factor w/ 3 levels "a","b","c": 1 2 3
      ..- attr(*, "names")= chr "a,1" "b,2" "c,3"
     $ V2: Factor w/ 3 levels "1","2","3": 1 2 3
      ..- attr(*, "names")= chr "a,1" "b,2" "c,3"

I wanted the V1 column to be a string, and V2 to be a number. (I know stringsAsFactors=FALSE would replace factors with strings, but I need a string and a number.) I could, of course, do

    ret$V2 <- as.numeric(ret$V2)

but this would mean a double conversion: from number to string first (by c()) and then back. thanks.

-- 
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
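When the values start out as text anyway, as in the "a,1" example, one conversion from string to number cannot be avoided. A small sketch of doing it once per column, instead of naming each column by hand, uses type.convert() on the character data frame:

```r
# Split the strings into a character matrix, one row per input string.
parts <- do.call(rbind, strsplit(c("a,1", "b,2", "c,3"), ","))

# Character data frame first (no factors) ...
ret <- as.data.frame(parts, stringsAsFactors = FALSE)

# ... then let type.convert() pick a type for each column; as.is = TRUE
# keeps non-numeric columns as character instead of turning them into factors.
ret[] <- lapply(ret, type.convert, as.is = TRUE)

str(ret)
```

This gives a character V1 and an integer V2. It does not help when the numbers were numeric to begin with; in that case keeping them out of c() altogether is the cleaner route.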
Re: [R] apply -- data.frame
* Bert Gunter thagre.ore...@trar.pbz [2012-08-30 09:59:46 -0700]:

> You really should spend a little more time with the docs figuring out
> what R _does_ and a little less complaining about what you think R
> cannot do.

The only thing I think R cannot do is compact its memory, thus, effectively, leaking it in _some_ situations. The rest are just my humble questions...

PS. speaking about complaining, my pet peeve atm is the speed (or, rather, lack thereof) of the e1071 functions read.matrix.csr and write.matrix.csr (they are implemented in R, not in C, and do a lot of string manipulation, so their slowness is not surprising)

-- 
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
Re: [R] apply -- data.frame
* William Dunlap jqha...@gvopb.pbz [2012-08-30 17:35:08 +]:

> I don't agree with your analysis of what went wrong with your example
>
>> a double conversion: from number to string first (by c()) and then back.

I did not make myself quite clear, sorry. I should have written something like

    c(1,2,"a") ==> "1" "2" "a" ==[as.numeric]==> 1 2 a

-- 
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
Re: [R] variable scope
* Duncan Murdoch zheqbpu.qha...@tznvy.pbz [2012-08-29 10:30:10 -0400]:

> On 29/08/2012 12:50 AM, Sam Steingold wrote:
>> * Duncan Murdoch zheqbpu.qha...@tznvy.pbz [2012-08-28 21:06:33 -0400]:
>>
>>> On 12-08-28 5:55 PM, Sam Steingold wrote:
>>>> my observation is that gc in R sucks. (it cannot release small
>>>> objects). this is not specific to R; ocaml suffers too.
>>>
>>> Sorry, I didn't realize you were just a troll
>>
>> I am not. I am referring here to a very specific deficiency which
>> plagues all non-moving GCs. I guess "non-compacting GC" might be a
>> more common expression.
>
> I think you're a troll because you're making false statements, such as
> that gc in R cannot release small objects, without any evidence in
> support of them.

This is common knowledge, discussed, e.g., here: http://article.gmane.org/gmane.comp.lang.r.general:256174

Whether R GC cannot release small objects or cannot reuse the fragmented memory after it releases the small objects is inconsequential: R consumes RAM which it cannot use. Again, this is a common deficiency in all memory management systems which do not compact their storage; something studied in CS101.

-- 
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
[R] variable scope
At the end of a for loop its variables are still present:

    for (i in 1:10) { x <- vector(length=1) }
    ls()

will print "i" and "x". This means that at the end of the for loop body I have to write

    rm(x)
    gc()

Is there a more elegant way to handle this? Thanks.

-- 
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
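One possibility, sketched with made-up names (scratch, total): a for loop in R does not create its own scope, but wrapping the body in local() (or calling a function) does, so temporaries vanish when each iteration ends and no explicit rm() is needed. Results that must survive are written to a variable defined outside.

```r
total <- numeric(3)                 # results that should outlive the loop

for (i in 1:3) {
  local({
    scratch <- seq_len(i * 10)      # temporary, visible only inside local()
    total[i] <<- sum(scratch)       # store the result in the enclosing environment
  })
}

print(total)
print(exists("scratch"))            # the temporary did not leak out of the loop body
```

The loop index i still remains after the loop, as in the original; when the memory actually goes back to the operating system is still up to the garbage collector.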