Re: [ANN] Leiningen 2.1.1 released
By the way, is there any place to get a full tarball (or zip) of leiningen and its dependencies? Not all of the machines I'm working on have external internet access, so I can't bootstrap as usual.

On Thursday, March 21, 2013 6:44:09 PM UTC-4, Phil Hagelberg wrote:
> Hello folks.
>
> I've just pushed out version 2.1.1 of Leiningen, which contains a handful of bug fixes from 2.1.0.
>
> * Add `:test-paths` to directories shared by checkout deps. (Phil Hagelberg)
> * Allow `run` task to function outside projects. (Phil Hagelberg)
> * Fix a bug preventing `with-profiles` working outside projects. (Colin Jones)
> * Fix a bug in trampolined `repl`. (Colin Jones)
> * Fix a bug in `update-in` task causing stack overflow. (David Powell)
> * Fix a bug in `lein upgrade`. (Phil Hagelberg)
>
> This should address a few issues people came across in 2.1.0, but there's nothing terribly exciting. That is all.
>
> -Phil
>
> --
> You received this message because you are subscribed to the Google Groups Clojure group.
> To post to this group, send email to clojure@googlegroups.com
> Note that posts from new members are moderated - please be patient with your first post.
> To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com
> For more options, visit this group at http://groups.google.com/group/clojure?hl=en
> ---
> You received this message because you are subscribed to the Google Groups Clojure group.
> To unsubscribe from this group and stop receiving emails from it, send an email to clojure+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.
Re: statistics library?
Lee Spector <lspec...@hampshire.edu> writes:

> I need to do some pretty simple statistics in a Clojure program and Incanter produces results that I think must be wrong (details below). So I don't think I can trust it.

I agree, those all look weird to me.

> Is there other code for statistical testing out there?

I'd reach for commons-math, but I don't have much experience.

> If I understand correctly the t-test should produce a p-value which ranges from 0 to 1. If it's less than 0.05 we can say that the means differ. (Again, there would be more to say here about what's statistically meaningful, but that discussion isn't relevant to my question).

This is true.

> => (t-test (range 1 11) :mu 0)
> {:conf-int [3.33414941027723 7.66585058972277], :x-mean 5.5, :t-stat 5.744562646538029, :p-value 1.9997218039889517, :n1 10, :df 9, :n2 nil, :y-var nil, :x-var 9.166, :y-mean nil}

This looks wrong to me. At least according to R, the p-value is 0.000278. Interestingly, this is 2 - [incanter's p].

> => (t-test '(40 5 2) :y '(1 5 1))
> {:conf-int [-39.46068349230474 66.12735015897141], :x-mean 15.666, :t-stat 1.0866516498483223, :p-value 1.6115506955016772, :n1 3, :df 2.0477900396893336, :n2 3, :y-var 5.332, :x-var 446.37, :y-mean 2.3335}

R gives 0.3884, which is again 2 - [incanter's p]. Fishy. I would say that there's a bug in Incanter's distribution function, at least when calculating values in the tails.

> If not, then does anyone have a pointer to more reliable statistics code in Clojure? Or pointers to using a Java library? I see that there are libraries out there -- e.g. http://commons.apache.org/math/api-1.2/org/apache/commons/math/stat/inference/TTest.html -- but Java interop is not my strong suit and I'm not sure how to call this from my Clojure code.

There may be an easier way to do this, but this worked for me:

user=> (org.apache.commons.math.stat.inference.TestUtils/tTest (into-array Double/TYPE [40 5 2]) (into-array Double/TYPE [1 5 1]))
0.3884493044983227

Hope that helps,
Johann
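As a cross-check that does not depend on either library, the two-sample figures quoted in this thread can be recomputed by hand. The following is a minimal sketch of Welch's unequal-variance t statistic and its Welch-Satterthwaite degrees of freedom in plain Clojure; the function names `mean`, `sample-var`, and `welch-t` are made up for this illustration and belong to neither Incanter nor commons-math. It does not compute a p-value, which needs the t distribution's CDF.

```clojure
(defn mean
  "Arithmetic mean as a double."
  [xs]
  (/ (reduce + xs) (double (count xs))))

(defn sample-var
  "Unbiased sample variance (divides by n - 1)."
  [xs]
  (let [m (mean xs)]
    (/ (reduce + (map #(let [d (- % m)] (* d d)) xs))
       (dec (count xs)))))

(defn welch-t
  "Welch's two-sample t statistic and Welch-Satterthwaite
  degrees of freedom for samples xs and ys."
  [xs ys]
  (let [vx  (/ (sample-var xs) (count xs))   ; squared standard error of mean(xs)
        vy  (/ (sample-var ys) (count ys))   ; squared standard error of mean(ys)
        se2 (+ vx vy)]
    {:t-stat (/ (- (mean xs) (mean ys)) (Math/sqrt se2))
     :df     (/ (* se2 se2)
                (+ (/ (* vx vx) (dec (count xs)))
                   (/ (* vy vy) (dec (count ys)))))}))

(welch-t [40 5 2] [1 5 1])
```

For these inputs the result should come out close to the `:t-stat` and `:df` that Incanter reported, which would suggest the statistic itself is fine and only the p-value step is off. One possible explanation, offered as a guess rather than a diagnosis: a two-sided p-value is 2(1 - F(|t|)), so a result of 2 - p is what one would get by computing 2F(|t|) instead.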
Re: statistics library?
Johann Hibschman <joha...@gmail.com> writes:

> There may be an easier way to do this, but this worked for me:
>
> user=> (org.apache.commons.math.stat.inference.TestUtils/tTest (into-array Double/TYPE [40 5 2]) (into-array Double/TYPE [1 5 1]))
> 0.3884493044983227

I should have used (double-array [40 5 2]) here, but for some reason I couldn't remember it until I hit send.
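For anyone comparing the two spellings, here is a small sketch using only core functions. Both produce a primitive `double[]`, which is what a Java method declared with `double[]` parameters expects; the var names `xs` and `ys` are just for illustration.

```clojure
;; double-array converts each element to a primitive double.
(def xs (double-array [40 5 2]))

;; into-array with an explicit primitive component type.
(def ys (into-array Double/TYPE [1.0 5.0 1.0]))

(class xs)   ; the JVM class for double[]
(vec xs)     ; back to a Clojure vector of doubles
```

`double-array` is shorter and makes the intent obvious, which is presumably why it is the better choice here.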
Re: Clojure, Parallel programming and Leslie Lamport
Konrad Hinsen <konrad.hin...@fastmail.net> writes:

> Thanks for the link! Judging from the example in the README, it's a library for task farming in Clojure. While that's a limited form of parallelism, there are still lots of applications where it is useful, so I'd say this library is definitely worth a closer look. However, it doesn't seem to deal with distributed data.

Distributed data is hard, though, partly because the kind of distribution you need depends on your calculation. Every time I've had to do a distributed calculation, I've always just used the filesystem for data. I see a lot of frameworks that assume the data is small and can be entirely contained in the message, while I need some kind of data affinity. (I do model estimation on large data sets, so I'd like to send a lump of data to different nodes, leave it there, then exchange parameter vectors and error scores with a controller.)

In today's world, I've found I get more done faster with a single 8-core machine with a lot of RAM (96 GB now; at a previous employer I had access to a 512 GB monster) than I would with a farm of machines with only 4 GB or 8 GB, so I'm back to concurrency. Of course, that's just because my data is large, but not too large.

>> I come from the scientific computing community .. the likes of Computational Fluid Dynamics and related topics.. large matrix operations and such stuff..
>
> My background is somewhat similar: molecular simulations and analysis of large data sets.

I did astronomy, but mostly small-scale stuff. Integration, cascade calculations, the like. These days, though, I'm doing finance, mortgages in particular. That's a field that's been fun for the past few years.

-Johann
Re: Moderately off-topic: installing emacs on OSX
javajosh <javaj...@gmail.com> writes:

> Ok, I decided to nuke ports, fink, and delete every package they ever installed. I successfully installed emacs 23.2 via homebrew (there's a good overview of homebrew here http://ascarter.net/2010/02/22/homebrew-for-os-x.html).

I'm coming late to this game, and I see that you've already figured things out, but I thought I'd chime in for anyone else listening.

Since I'm too lazy to build emacs for myself, I've been using Vincent Goulet's build, http://vgoulet.act.ulaval.ca/en/ressources/emacs/mac, with success. I've not tried to run it with -nw, though.

> Also tried the elpa self-install script in Aquamacs (where the scratch buffer comes up first thing), but C-j (the 'eval' keystroke) had no discernible effect. I tried M-x package-list-packages but it said [No Match]. So I assume nothing happened.

Yes. Aquamacs's scratch buffer is in text-mode, not in lisp-interaction-mode. Plus, it comes with old versions of SLIME that I couldn't figure out how to remove. Partly based on tricks like that, partly due to strange font behavior, I don't like Aquamacs.

Regards,
Johann
Re: math utilities question
Robert McIntyre <r...@mit.edu> writes:

> I'm wondering if people have had experience with java libraries of that sort and might have some recommendations.
>
> Anyone use clojure for scientific data analysis? What do you find helpful to use?

I'm still just evaluating clojure for scientific data analysis, but I can share what I've found so far.

First of all, Incanter. I like the idea of Incanter, but I don't like its decision to have matrices be the fundamental data object. Matrices are great, but they are not the be-all and end-all. Multidimensional arrays are better, like in numpy or APL or J. It's a pet peeve of mine about R that it doesn't distinguish scalars from vectors of length 1. (Konrad Hinsen had started some work on multiarrays in Clojure, but I've not been following his progress.)

Also, Incanter seems very tuned to a row-wise view of data sets, while I've spent enough time with R and kdb+/q to prefer a column-wise view of data. (This is just based on reading the Incanter docs quickly; I may be misrepresenting the package.)

As far as matrix libraries go, I've settled on EJML, since it seems reasonably fast, and I can understand what it's doing. Bradford Cross blogged a comparison of different libraries at:

http://measuringmeasures.com/blog/2010/3/28/matrix-benchmarks-fast-linear-algebra-on-the-jvm.html

I can't seem to find a good Java multiarray library, but I have some hope that I could beat EJML into shape, since its representation is just a basic array of doubles.

I've built the Java interface to HDF5, and I've been using that for data storage. I would prefer to use a pure-Java solution, but I can't find anything that's nearly as good.

Maybe I'm not reading the right news, but I've not seen all that much on using Java for scientific work for a while now. The NIST JavaNumerics guys seem to have given up, but if I remember correctly their conclusions were that Java really needed complex numbers as a value/stack-allocated type.

This is a bit of a disjointed ramble, but I'd love to hear what you settle on.

Regards,
Johann
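To make the row-wise versus column-wise distinction concrete, here is a small sketch in plain Clojure. It is only an illustration of the two layouts, not how Incanter, R, or kdb+/q store data; the sample data and the names `rows->columns` and `column-mean` are invented for the example.

```clojure
;; Row-wise: one map per observation.
(def rows
  [{:x 1.0 :y 10.0}
   {:x 2.0 :y 20.0}
   {:x 3.0 :y 60.0}])

(defn rows->columns
  "Turn a seq of row maps into a map of column name -> primitive double[].
  Assumes every row has the same keys as the first row."
  [rows]
  (into {}
        (for [k (keys (first rows))]
          [k (double-array (map k rows))])))

(defn column-mean
  "Mean of one column, computed directly on the primitive array."
  [columns k]
  (let [^doubles col (get columns k)]
    (/ (areduce col i acc 0.0 (+ acc (aget col i)))
       (alength col))))

(def columns (rows->columns rows))

(column-mean columns :x)
(column-mean columns :y)
```

With the column layout, a per-column computation touches one contiguous array of doubles instead of walking every row map, which is the usual argument for preferring it in analysis work.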
Re: Map with multiple keys?
Base <basselh...@gmail.com> writes:

> So this may be an extraordinarily dumb question (even for me...) but is there such a thing as a map with compound keys?
> [...]
> I could do map-in-map, or do something like a (str cat gender) to amalgamate 2 fields to set the key but I was just wondering if this even existed.

I don't know of anything built-in, but I would prefer [cat gender] over (str cat gender) as keys for a map.

-Johann
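A short sketch of the vector-as-key approach, using made-up data. Clojure vectors have value-based equality and hashing, so they work as map keys without any extra machinery; the `animals` data and the key names are assumptions for the example.

```clojure
(def animals
  [{:species "cat" :gender :female}
   {:species "cat" :gender :male}
   {:species "dog" :gender :female}
   {:species "cat" :gender :female}])

;; Count animals per [species gender] pair; the vector itself is the key.
(def counts
  (frequencies (map (juxt :species :gender) animals)))

(get counts ["cat" :female])   ; look up by compound key
(get counts ["dog" :male] 0)   ; absent key, with a default

;; Why not (str a b)? Concatenation can make distinct pairs collide:
(= (str "ab" "c") (str "a" "bc"))   ; true, both are "abc"
(= ["ab" "c"] ["a" "bc"])           ; false, the vectors stay distinct
```

The vector key also keeps the components recoverable (by destructuring, for instance), which a concatenated string does not.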
Style for mutable data
Does anyone have style suggestions for distinguishing the states from the refs to mutable data?

Let's say I'm manipulating a cell in a lattice, or doing dynamic programming, or something. In any case, I have a cell.

;; Current convention: use cell- as the type of the state of a cell.
(defstruct cell- :location :data)

;; Returns a ref wrapping a fresh cell state.
(defn make-cell [location data]
  (ref (struct-map cell- :location location :data data)))

;; Takes the state (the dereferenced value).
(defn print-cell- [cell-state]
  (prn cell-state))

;; Takes the ref.
(defn print-cell [cell]
  (print-cell- @cell))

The details don't matter that much, but what would people name these arguments? Is the cell- convention good? I'd use something like cell % if I were in scheme, but that's not legal in Clojure. What should I name function arguments to distinguish the ones that take the refs from the ones that take the states?

Clearly, I can come up with something that keeps me happy, but I was wondering if the community's evolved a standard or has an opinion.

Thanks,
Johann
Re: Style for mutable data
On Jan 30, 4:35 pm, ataggart <alex.tagg...@gmail.com> wrote:
> Akin to what Johann said, why bother with the functions that deal with the value/state? Put another way, the cell has identity over time, thus implemented as a ref. A function that, say, prints a cell, should take a cell/ref as its arg.

This is my general approach with refs: most of the time, I treat the ref as the fundamental object. The biggest question there, for me at least, is how to come up with a good naming convention to distinguish functions that must be called from within a transaction from those that create the transaction themselves. So far, I've just been putting a comment at the start of each function that assumes a transaction, and that seems fine.

The ref/state issue did just come up for me when trying to marshal some refs. When marshalling the ref, I have to store the pointer in a table, so if I come upon it again, I can just return a reference to it. Marshalling the state of the ref, however, is just deciding on my data representation. This led to two functions, which led to me scratching my head trying to figure out what I should name them. marshal-cell and marshal-cell-? marshal-cell-state?

The state vs. ref question becomes more important, for me, with agents. I tend to spend about as much time manipulating the agent state as I do moving around the agents themselves and deciding which one to call, so it's less clear to me which one I should treat as fundamental.

> Probably more than you need, but I highly recommend Rich's talk on the subject of identity and state: http://www.infoq.com/presentations/Are-We-There-Yet-Rich-Hickey

I keep meaning to watch that, but I'm too impatient to watch most video on the internet. Clearly, I need a commute.

-Johann
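One way the "assumes a transaction" versus "creates a transaction" split could be expressed in names, sketched below. The `*` and `!` suffixes are a hypothetical convention chosen for this example, not an established community standard, and `move-cell*` / `move-cell!` are invented functions. The sketch uses a plain map for the cell state to stay self-contained.

```clojure
(defn make-cell
  "Returns a ref wrapping the cell state."
  [location data]
  (ref {:location location :data data}))

(defn move-cell*
  "Must be called inside a transaction (hypothetical * suffix)."
  [cell location]
  (alter cell assoc :location location))

(defn move-cell!
  "Opens its own transaction (hypothetical ! suffix)."
  [cell location]
  (dosync (move-cell* cell location)))

(def c (make-cell [0 0] :payload))
(move-cell! c [1 2])
(:location @c)
```

Calling `move-cell*` outside `dosync` fails at run time, because `alter` throws an `IllegalStateException` when no transaction is running. So the convention only documents intent; the STM itself still catches misuse.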
Re: Space usage of lazy seqs
On Dec 2, 9:59 pm, Johann Hibschman <joha...@gmail.com> wrote:
> On Dec 2, 9:09 pm, David Brown <cloj...@davidb.org> wrote:
>> You can tune the max with -Xmx1G for example, to limit it to one GB.
>
> That's a good idea; then I'll know for sure if it's keeping a handle to the entire file.

Ok, that's a relief. First of all, -Xmx1G isn't legal, at least for java 1.6; I had to specify -Xmx1024m. Second, once I did that, the memory use of the obvious parallel version, (reduce + (pmap ...)), remained within reason.

Clojure is good, everything is happy, fuzzy bunnies and kittens frolic with abandon.

So, all of this is a lot of hot air over nothing. Thanks for pointing me in the right direction.

Cheers,
Johann
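To confirm from inside the REPL that a heap limit such as -Xmx1024m actually took effect, the JVM can be asked directly. This is a minimal sketch using the standard `java.lang.Runtime` API; the helper name `max-heap-mb` is made up for the example.

```clojure
(defn max-heap-mb
  "Maximum heap the JVM will attempt to use, in whole megabytes."
  []
  (quot (.maxMemory (Runtime/getRuntime)) (* 1024 1024)))

(max-heap-mb)
```

The reported figure is typically at or a little below the -Xmx value, depending on the JVM and garbage collector, so treat it as a rough confirmation rather than an exact readback.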
Space usage of lazy seqs
I don't understand Clojure's space requirements when processing lazy sequences. Are there some rules of thumb that I could use to better predict what will use a lot of space?

I have a 5.5 GB pipe-delimited data file, containing mostly floats (14 M rows of 40 cols). I'd like to stream over that file, processing columns as I go, without holding the whole thing in RAM. As a first test, I'm trying to just split each row and count the total number of fields.

Why does reduce seem to load in the whole file, yet test-split-4 not? Why does the if-let in test-split-3 vs test-split-3b make such a difference? And finally, is there any way I can parallelize this to use multiple cores without slurping in the whole file?

If it matters, I'm using a snapshot of 1.1.0-alpha; the jar included with incanter.

Here's the code:

;; duck-streams comes from clojure.contrib.
(require '[clojure.contrib.duck-streams :as duck-streams])

(def afile "/path/to/big/file")

;; Count the lines in the file.
;; 12.8 s, light memory use (0.8 GB).
(defn test-count []
  (with-open [rdr (duck-streams/reader afile)]
    (count (line-seq rdr))))

;; Split and count.
;; 183.2 s, heavy memory use (8.6 GB).
(defn test-split []
  (with-open [rdr (duck-streams/reader afile)]
    (reduce + (map #(count (.split %1 "\\|")) (line-seq rdr)))))

;; 190.8 s, heavy memory use (8.8 GB).
(defn test-split-2 []
  (with-open [rdr (duck-streams/reader afile)]
    (loop [counts (seq (map #(count (.split %1 "\\|")) (line-seq rdr)))
           cnt 0]
      (if counts
        (recur (next counts) (+ cnt (first counts)))
        cnt))))

;; Use rest instead, if-let (following http://clojure.org/lazy.)
;; 166.1 s, light memory use (1.4 GB)
(defn test-split-3 []
  (with-open [rdr (duck-streams/reader afile)]
    (loop [counts (map #(count (.split %1 "\\|")) (line-seq rdr))
           cnt 0]
      (if-let [s (seq counts)]
        (recur (rest s) (+ cnt (first s)))
        cnt))))

;; Try without the if-let.
;; 211.6 s, heavy memory use (8.7 GB). Surprise!
(defn test-split-3b []
  (with-open [rdr (duck-streams/reader afile)]
    (loop [counts (map #(count (.split %1 "\\|")) (line-seq rdr))
           cnt 0]
      (if (seq counts)
        (recur (rest counts) (+ cnt (first counts)))
        cnt))))

;; Split inside the loop instead of mapping over the lines.
;; 160 s, light memory use. (1.5 GB)
(defn test-split-4 []
  (with-open [rdr (duck-streams/reader afile)]
    (loop [lines (line-seq rdr)
           cnt 0]
      (if lines
        (recur (next lines) (+ cnt (count (.split (first lines) "\\|"))))
        cnt))))

;; Parallel split and count.
;; Based on test-split-3, but using pmap.
;; 95.1 s, heavy memory use (8.7 GB)
(defn test-psplit-1 []
  (with-open [rdr (duck-streams/reader afile)]
    (loop [counts (pmap #(count (.split %1 "\\|")) (line-seq rdr))
           cnt 0]
      (if-let [s (seq counts)]
        (recur (rest s) (+ cnt (first s)))
        cnt))))
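For readers on a later Clojure, where clojure.contrib.duck-streams has been superseded by `clojure.java.io`, the split-and-count test can be sketched as below. This is an assumption about what a present-day equivalent might look like, not the code that was timed above; `count-fields` and `demo` are invented names, and the demo runs against a tiny temporary file rather than the 5.5 GB original.

```clojure
(require '[clojure.java.io :as io]
         '[clojure.string :as str])

(defn count-fields
  "Sum the number of pipe-delimited fields over all lines of `path`.
  The reduce consumes the lazy line-seq inside with-open, so the
  reader is still open while lines are being read."
  [path]
  (with-open [rdr (io/reader path)]
    (reduce + (map #(count (str/split % #"\|")) (line-seq rdr)))))

(defn demo
  "Write a two-line sample file, count its fields, and clean up."
  []
  (let [f (java.io.File/createTempFile "fields" ".txt")]
    (try
      (spit f "1.0|2.0|3.0\n4.0|5.0\n")
      (count-fields f)
      (finally (.delete f)))))

(demo)
```

Nothing here keeps a named reference to the head of the line sequence, so each line becomes garbage once it has been counted. Whether the process footprint stays small still depends on the JVM heap settings, since an uncapped JVM is free to let garbage accumulate before collecting it.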
Re: Space usage of lazy seqs
On Dec 2, 9:09 pm, David Brown <cloj...@davidb.org> wrote:
> How much memory do you have on your machine? A recent Sun JVM on a machine with a bunch of memory will consider it to be a server machine. It will set the heap max to 1/4 of total physical memory (which suggests you might have 16GB of RAM).

I have 96 GB, so I'm not in danger of running out. I just want to understand if I'm using the sequence functions properly, so that I can run a few instances of this, plus some R, etc.

> You can tune the max with -Xmx1G for example, to limit it to one GB.

That's a good idea; then I'll know for sure if it's keeping a handle to the entire file.

> If you're running JDK 6, you can run VisualVM or jconsole to get a better handle on the memory usage, and even dig into what it might be used for.

Ah, I'd forgotten about jconsole.

Well, I'll muddle around and see what I can figure out.

Thanks,
Johann