Re: [ANN] Leiningen 2.1.1 released

2013-03-22 Thread Johann Hibschman
By the way, is there any place to get a full tarball (or zip) of Leiningen 
and its dependencies? Not all of the machines I'm working on have external 
internet access, so I can't bootstrap as usual.

On Thursday, March 21, 2013 6:44:09 PM UTC-4, Phil Hagelberg wrote:


 Hello folks. 

 I've just pushed out version 2.1.1 of Leiningen, which contains a 
 handful of bug fixes from 2.1.0. 

 * Add `:test-paths` to directories shared by checkout deps. (Phil 
 Hagelberg) 
 * Allow `run` task to function outside projects. (Phil Hagelberg) 
 * Fix a bug preventing `with-profiles` working outside projects. (Colin 
 Jones) 
 * Fix a bug in trampolined `repl`. (Colin Jones) 
 * Fix a bug in `update-in` task causing stack overflow. (David Powell) 
 * Fix a bug in `lein upgrade`. (Phil Hagelberg) 

 This should address a few issues people came across in 2.1.0, but 
 there's nothing terribly exciting. 

 That is all. 

 -Phil 


-- 
-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
Clojure group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.




Re: statistics library?

2011-09-27 Thread Johann Hibschman
Lee Spector lspec...@hampshire.edu writes:

 I need to do some pretty simple statistics in a Clojure program and
 Incanter produces results that I think must be wrong (details
 below). So I don't think I can trust it.

I agree, those all look weird to me.

 Is there other code for statistical testing out there?

I'd reach for commons-math, but I don't have much experience.

 If I understand correctly the t-test should produce a p-value which
 ranges from 0 to 1. If it's less than 0.05 we can say that the means
 differ. (Again, there would be more to say here about what's
 statistically meaningful, but that discussion isn't relevant to my
 question).

This is true.

 => (t-test (range 1 11) :mu 0)
 {:conf-int [3.33414941027723 7.66585058972277],
 :x-mean 5.5,
 :t-stat 5.744562646538029,
 :p-value 1.9997218039889517,
 :n1 10,
 :df 9,
 :n2 nil,
 :y-var nil,
 :x-var 9.166,
 :y-mean nil}

This looks wrong to me.  At least according to R, the p-value is
0.000278.  Interestingly, this is 2 - [incanter's p].

 => (t-test '(40 5 2) :y '(1 5 1))
 {:conf-int [-39.46068349230474 66.12735015897141],
  :x-mean 15.666,
  :t-stat 1.0866516498483223,
  :p-value 1.6115506955016772,
  :n1 3,
  :df 2.0477900396893336,
  :n2 3,
  :y-var 5.332,
  :x-var 446.37,
  :y-mean 2.3335}

R gives 0.3884, which is again 2 - [incanter's p].  Fishy.

I would say that there's a bug in Incanter's distribution function, at
least when calculating values in the tails.
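Both results fit one simple pattern. This is a guess from the numbers, not a diagnosis of Incanter's source: if c is the t-distribution CDF evaluated at |t|, the two-sided p-value is 2(1 - c), while 2c would come out as exactly 2 minus the right answer. The function names below are made up for illustration.

```clojure
;; Hypothetical helpers; neither is Incanter's actual code.
(defn two-sided-p
  "Two-sided p-value given c = CDF(|t|)."
  [c]
  (* 2 (- 1 c)))

(defn suspected-p
  "What you would get if the upper tail were never taken: 2c."
  [c]
  (* 2 c))

;; Back out the implied CDF value from the p-value Incanter printed.
(def incanter-p 1.6115506955016772)
(def implied-c (/ incanter-p 2))

(println (suspected-p implied-c)) ; reproduces Incanter's 1.61...
(println (two-sided-p implied-c)) ; close to R's 0.3884
```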

 If not, then does anyone have a pointer to more reliable statistics
 code in Clojure? Or pointers to using a Java library? I see that there
 are libraries out there --
 e.g. 
 http://commons.apache.org/math/api-1.2/org/apache/commons/math/stat/inference/TTest.html
 -- but Java interop is not my strong suit and I'm not sure how to call
 this from my Clojure code.

There may be an easier way to do this, but this worked for me:

  user=> (org.apache.commons.math.stat.inference.TestUtils/tTest
           (into-array Double/TYPE [40 5 2])
           (into-array Double/TYPE [1 5 1]))
  0.3884493044983227

Hope that helps,
Johann



Re: statistics library?

2011-09-27 Thread Johann Hibschman
Johann Hibschman joha...@gmail.com writes:

 There may be an easier way to do this, but this worked for me:

   user=> (org.apache.commons.math.stat.inference.TestUtils/tTest
            (into-array Double/TYPE [40 5 2])
            (into-array Double/TYPE [1 5 1]))
   0.3884493044983227

I should have used (double-array [40 5 2]) here, but for some reason I
couldn't remember it until I hit send.
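For reference, a small sketch of what double-array produces. The tTest call itself is left as a comment because it needs commons-math on the classpath; the var names xs and ys are only for illustration.

```clojure
;; double-array coerces each number to a primitive double.
(def xs (double-array [40 5 2]))
(def ys (double-array [1 5 1]))

(println (class xs)) ; the JVM's double[] class
(println (seq xs))   ; (40.0 5.0 2.0)

;; With commons-math available, the call from the earlier message
;; would then read:
;; (org.apache.commons.math.stat.inference.TestUtils/tTest xs ys)
```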



Re: Clojure, Parallel programming and Leslie Lamport

2010-12-22 Thread Johann Hibschman
Konrad Hinsen konrad.hin...@fastmail.net writes:

 Thanks for the link! Judging from the example in the README, it's a
 library for task farming in Clojure. While that's a limited form of
 parallelism, there are still lots of applications where it is useful,
 so I'd say this library is definitely worth a closer look. However, it
 doesn't seem to deal with distributed data.

Distributed data is hard, though, partly because the kind of distribution
you need depends on your calculation. Every time I've had to do
distributed calculations, I've always just used the filesystem for data.

I see a lot of frameworks that assume the data is small and can be
entirely contained in the message, while I need some kind of data
affinity. (I do model estimation on large data sets, so I'd like to send
a lump of data to different nodes, leave it there, then exchange
parameter vectors and error scores with a controller.)

In today's world, I've found I get more done faster with a single 8-core
machine with a lot of RAM (96 GB now; at a previous employer I had
access to a 512 GB monster) than I would with a farm of machines with
only 4 GB or 8 GB, so I'm back to concurrency.  Of course, that's just
because my data is large, but not too large.
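On a single machine, the "leave the data in place, exchange only parameters and scores" pattern can be sketched with agents. This is an in-process analogy under made-up assumptions (a toy model y = a * x and hypothetical function names), not a distributed framework.

```clojure
;; Hypothetical toy model: y = a * x, scored by summed squared error.
(defn squared-error [[a] rows]
  (reduce + (map (fn [[x y]] (let [d (- y (* a x))] (* d d))) rows)))

;; A worker owns its chunk of rows; the chunk never moves again.
(defn make-worker [rows]
  (agent {:rows rows :score nil}))

;; The controller sends only the parameter vector to every worker,
;; waits for them, then collects only the scores.
(defn total-error [workers params]
  (doseq [w workers]
    (send w (fn [state]
              (assoc state :score (squared-error params (:rows state))))))
  (apply await workers)
  (reduce + (map (comp :score deref) workers)))

(def workers [(make-worker [[1 2] [2 4]])
              (make-worker [[3 6]])])

(println (total-error workers [2])) ; exact fit, so 0
(println (total-error workers [1])) ; 1 + 4 + 9 = 14
```

In a script, call (shutdown-agents) at the end so the agent thread pool does not keep the JVM alive.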


  I come from the scientific computing community .. the likes of
 Computational Fluid Dynamics and related topics.. large matrix
 operations and such stuff..

 My background is somewhat similar: molecular simulations and analysis
 of large data sets.

I did astronomy, but mostly small-scale stuff.  Integration, cascade
calculations, the like.  These days, though, I'm doing finance,
mortgages in particular.  That's a field that's been fun for the past
few years.

-Johann



Re: Moderately off-topic: installing emacs on OSX

2010-12-14 Thread Johann Hibschman
javajosh javaj...@gmail.com writes:

 Ok, I decided to nuke ports, fink, and delete every package they ever
 installed. I successfully installed emacs 23.2 via homebrew (there's a
 good overview of homebrew here
 http://ascarter.net/2010/02/22/homebrew-for-os-x.html).

I'm coming late to this game, and I see that you've already figured
things out, but I thought I'd chime in for anyone else listening.  Since
I'm too lazy to build emacs for myself, I've been using Vincent Goulet's
build,

  http://vgoulet.act.ulaval.ca/en/ressources/emacs/mac ,

with success.  I've not tried to run it with -nw, though.

 Also tried the elpa self-install script in Aquamacs (where the scratch
 buffer comes up first thing), but C-j (the 'eval' keystroke) had no
 discernible effect. I tried M-x package-list-packages but it said [No
 Match]. So I assume nothing happened.

Yes.  Aquamacs's scratch buffer is in text-mode, not in
lisp-interaction-mode.  Plus, it comes with old versions of SLIME that I
couldn't figure out how to remove.  Partly based on tricks like that,
partly due to strange font behavior, I don't like Aquamacs.

Regards,
Johann



Re: math utilities question

2010-12-06 Thread Johann Hibschman
Robert McIntyre r...@mit.edu writes:

 I'm wondering if people have had experience with java libraries of
 that sort and might have some recommendations.

 Anyone use clojure for scientific data analysis? What do you find
 helpful to use?

I'm still just evaluating clojure for scientific data analysis, but I
can share what I've found so far.

First of all, Incanter.  I like the idea of Incanter, but I don't like
its decision to have matrices be the fundamental data object.  Matrices
are great, but they are not the be-all and end-all.  Multidimensional arrays
are better, like in numpy or APL or J.  It's a pet peeve about R that it
doesn't distinguish scalars from vectors of length 1.

(Konrad Hinsen had started some work on multiarrays in Clojure, but I've
not been following his progress.)

Also, Incanter seems very tuned to a row-wise view of data sets, while
I've spent enough time with R and kdb+/q to prefer a column-wise view of
data.  (This is just based on reading the Incanter docs quickly; I may
be misrepresenting the package.)

As far as matrix libraries go, I've settled on EJML, since it seems
reasonably fast, and I can understand what it's doing.  Bradford Cross
blogged a comparison of different libraries at:

http://measuringmeasures.com/blog/2010/3/28/matrix-benchmarks-fast-linear-algebra-on-the-jvm.html

I can't seem to find a good Java multiarray library, but I have some
hope that I could beat EJML into shape, since its representation is just
a basic array of doubles.
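To make the "basic array of doubles" idea concrete, here is a minimal sketch of a multiarray as a shape plus a flat, row-major double array. Nothing here is EJML's API; the names multiarray, strides, flat-index and mget are invented for the example.

```clojure
(defn strides
  "Row-major strides for a shape, e.g. [2 3 4] -> [12 4 1]."
  [shape]
  (vec (reverse (reductions * 1 (reverse (rest shape))))))

(defn flat-index
  "Offset into the flat array for a multi-dimensional index."
  [shape idx]
  (reduce + (map * (strides shape) idx)))

(defn multiarray
  "A shape plus a flat primitive double array holding the values."
  [shape values]
  {:shape shape :data (double-array values)})

(defn mget [{:keys [shape data]} & idx]
  (aget ^doubles data (int (flat-index shape idx))))

(def a (multiarray [2 3 4] (range 24)))

(println (strides [2 3 4])) ; [12 4 1]
(println (mget a 1 2 3))    ; 23.0, the last element
```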

I've built the Java interface to HDF5, and I've been using that for
data storage.  I would prefer to use a pure-Java solution, but I can't
find anything that's nearly as good.

Maybe I'm not reading the right news, but I've not seen all that much on
using Java for scientific work for a while now.  The NIST JavaNumerics
guys seem to have given up, but if I remember correctly their
conclusions were that Java really needed complex numbers as a
value/stack-allocated type.

This is a bit of a disjointed ramble, but I'd love to hear what you
settle on.

Regards,
Johann



Re: Map with multiple keys?

2010-02-24 Thread Johann Hibschman
Base basselh...@gmail.com writes:

 So this may be an extraordinary dumb question (even for me...) but is
 there such a thing as a map with compound keys?

[...]

 I could do map - in  - map, or do something like a (str cat gender) to
 amalgamate 2 fields to set the key but I was just wondering if this
 even existed.

I don't know of anything built-in, but I would prefer [cat gender] over
(str cat gender) as keys for a map.
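A quick sketch of why the vector is the safer key. The data is made up; any values with sensible equality would do.

```clojure
;; Vectors compare by value, so they work directly as map keys.
(def counts {["cat" :female] 3
             ["cat" :male]   5})

(println (get counts ["cat" :male])) ; 5

(def counts2 (assoc counts ["dog" :male] 2))
(println (count counts2)) ; 3

;; Concatenated strings can collide where vectors do not.
(println (= (str "ab" "c") (str "a" "bc"))) ; true
(println (= ["ab" "c"] ["a" "bc"]))         ; false
```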

-Johann



Style for mutable data

2010-01-30 Thread Johann Hibschman
Does anyone have style suggestions for distinguishing the states from
the refs to mutable data?

Let's say I'm manipulating a cell in a lattice, or doing dynamic
programming, or something. In any case, I have a cell.

;; Current convention: use cell- as the type of the state of a cell.
(defstruct cell- :location :data)

(defn make-cell [location data]
  (ref (struct-map cell- :location location :data data)))

(defn print-cell- [cell-state]
  (prn cell-state))

(defn print-cell [cell]
  (print-cell- @cell))

The details don't matter that much, but what would people name these
arguments? Is the cell- convention good? I'd use something like cell%
if I were in Scheme, but that's not legal in Clojure. What should I
name function arguments to distinguish the ones that take the refs
from the ones that take the states?

Clearly, I can come up with something that keeps me happy, but I was
wondering if the community's evolved a standard or has an opinion.

Thanks,
Johann



Re: Style for mutable data

2010-01-30 Thread Johann Hibschman
On Jan 30, 4:35 pm, ataggart alex.tagg...@gmail.com wrote:
 Akin to what Johann said, why bother with the functions that deal with
 the value/state? Put another way, the cell has identity over time,
 thus implemented as a ref. A function that, say, prints a cell, should
 take a cell/ref as its arg.

This is my general approach with refs: most of the time, I treat the
ref as the fundamental object. The biggest question there, for me at
least, is how to come up with a good naming convention to distinguish
functions that must be called from within a transaction from those
that create the transaction themselves.  So far, I've just been
putting a comment at the start of each function that assumes a
transaction, and that seems fine.
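One possible shape for such a convention, purely as an illustration: a trailing * for functions that assume a running transaction and a trailing ! for those that open one. Both suffixes and the function names are hypothetical, not an established standard.

```clojure
(def cell (ref {:location [0 0] :data 1}))

;; Assumes a running transaction: alter throws
;; IllegalStateException when called outside dosync.
(defn bump-data* [cell]
  (alter cell update-in [:data] inc))

;; Creates the transaction itself, so it is safe to call anywhere.
(defn bump-data! [cell]
  (dosync (bump-data* cell)))

(bump-data! cell)
(println (:data @cell)) ; 2

(println (try (bump-data* cell)
              (catch IllegalStateException e
                "no transaction running")))
```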

The ref/state issue did just come up for me when trying to marshal
some refs. When marshalling the ref, I have to store the pointer in a
table, so if I come upon it again, I can just return a reference to
it. Marshalling the state of the ref, however, is just deciding on my
data representation. This led to two functions, which led to me
scratching my head trying to figure out what I should name them.
marshal-cell and marshal-cell-? marshal-cell-state?
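A rough sketch of the two functions, under the assumption that the table is a plain map from ref to a small integer id and that the marshalled form is ordinary Clojure data. The output format is invented for the example.

```clojure
;; Marshalling the state is only a choice of data representation.
(defn marshal-cell-state [state]
  (select-keys state [:location :data]))

;; Marshalling the ref consults the table first. Refs hash by
;; identity, so the same ref always finds its earlier entry.
;; Returns [updated-table marshalled-form].
(defn marshal-cell [table cell]
  (if-let [id (get table cell)]
    [table {:ref-id id}]
    (let [id (count table)]
      [(assoc table cell id)
       {:ref-id id :state (marshal-cell-state @cell)}])))

(def c (ref {:location [0 0] :data 42}))

(let [[t1 first-time]  (marshal-cell {} c)
      [_  second-time] (marshal-cell t1 c)]
  (println first-time)   ; full state plus its id
  (println second-time)) ; only a reference to the id
```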

The state vs. ref question becomes more important, for me, with
agents. I tend to spend about as much time manipulating the agent
state as I do moving around the agents themselves and deciding which
one to call, so it's less clear to me which one I should treat as
fundamental.

 Probably more than you need, but I highly recommend Rich's talk on the
 subject of identity and state:
 http://www.infoq.com/presentations/Are-We-There-Yet-Rich-Hickey

I keep meaning to watch that, but I'm too impatient to watch most
video on the internet. Clearly, I need a commute.

-Johann



Re: Space usage of lazy seqs

2009-12-03 Thread Johann Hibschman
On Dec 2, 9:59 pm, Johann Hibschman joha...@gmail.com wrote:
 On Dec 2, 9:09 pm, David Brown cloj...@davidb.org wrote:

  You can tune the max with -Xmx1G for example, to limit it to one GB.

 That's a good idea; then I'll know for sure if it's keeping a handle
 to the entire file.

Ok, that's a relief.

First of all, -Xmx1G isn't legal, at least for java 1.6; I had to
specify -Xmx1024m. Second, once I did that, the memory use of the
obvious parallel version, (reduce + (pmap ...)), remained within
reason. Clojure is good, everything is happy, fuzzy bunnies and
kittens frolic with abandon.
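For anyone following along, the "obvious parallel version" looks roughly like this. A StringReader stands in for the big file so the sketch runs on its own; count-fields and total-fields are names chosen for the example.

```clojure
(defn count-fields [^String line]
  (count (.split line "\\|")))

;; reduce consumes the pmap result as it is produced, so the
;; sequence of lines is not held onto here.
(defn total-fields [lines]
  (reduce + (pmap count-fields lines)))

;; Stand-in for the big file: two pipe-delimited rows in a string.
(with-open [rdr (java.io.BufferedReader.
                 (java.io.StringReader. "1.0|2.0|3.0\n4.0|5.0"))]
  (println (total-fields (line-seq rdr)))) ; 5
```

pmap runs on the agent thread pool, so a standalone script should end with (shutdown-agents).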

So, all of this is a lot of hot air over nothing. Thanks for pointing
me in the right direction.

Cheers,
Johann



Space usage of lazy seqs

2009-12-02 Thread Johann Hibschman
I don't understand Clojure's space requirements when processing lazy
sequences. Are there some rules-of-thumb that I could use to better
predict what will use a lot of space?

I have a 5.5 GB pipe-delimited data file, containing mostly floats (14
M rows of 40 cols). I'd like to stream over that file, processing
columns as I go, without holding the whole thing in RAM. As a first
test, I'm trying to just split each row and count the total number of
fields.

Why does reduce seem to load in the whole file, yet test-split-4 not?
Why does the if-let in test-split-3 vs test-split-3b make such a
difference? And finally, is there any way I can parallelize this to
use multiple cores without slurping in the whole file?

If it matters, I'm using a snapshot of 1.1.0-alpha; the jar included
with incanter.

Here's the code:

(require '[clojure.contrib.duck-streams :as duck-streams])

(def afile "/path/to/big/file")

;; Count the lines in the file.
;; 12.8 s, light memory use (0.8 GB).
(defn test-count []
  (with-open [rdr (duck-streams/reader afile)]
    (count (line-seq rdr))))

;; Split and count.
;; 183.2 s, heavy memory use (8.6 GB).
(defn test-split []
  (with-open [rdr (duck-streams/reader afile)]
    (reduce + (map #(count (.split %1 "\\|")) (line-seq rdr)))))

;; 190.8 s, heavy memory use (8.8 GB).
(defn test-split-2 []
  (with-open [rdr (duck-streams/reader afile)]
    (loop [counts (seq (map #(count (.split %1 "\\|")) (line-seq rdr)))
           cnt 0]
      (if counts
        (recur (next counts) (+ cnt (first counts)))
        cnt))))

;; Use rest instead, if-let (following http://clojure.org/lazy.)
;; 166.1 s, light memory use (1.4 GB)
(defn test-split-3 []
  (with-open [rdr (duck-streams/reader afile)]
    (loop [counts (map #(count (.split %1 "\\|")) (line-seq rdr))
           cnt 0]
      (if-let [s (seq counts)]
        (recur (rest s) (+ cnt (first s)))
        cnt))))

;; Try without the if-let.
;; 211.6 s, heavy memory use (8.7 GB). Surprise!
(defn test-split-3b []
  (with-open [rdr (duck-streams/reader afile)]
    (loop [counts (map #(count (.split %1 "\\|")) (line-seq rdr))
           cnt 0]
      (if (seq counts)
        (recur (rest counts) (+ cnt (first counts)))
        cnt))))

;; 160 s, light memory use. (1.5 GB)
(defn test-split-4 []
  (with-open [rdr (duck-streams/reader afile)]
    (loop [lines (line-seq rdr)
           cnt 0]
      (if lines
        (recur (next lines)
               (+ cnt (count (.split (first lines) "\\|"))))
        cnt))))

;; Parallel split and count.
;; Based on test-split-3, but using pmap.
;; 95.1 s, heavy memory use (8.7 GB)
(defn test-psplit-1 []
  (with-open [rdr (duck-streams/reader afile)]
    (loop [counts (pmap #(count (.split %1 "\\|")) (line-seq rdr))
           cnt 0]
      (if-let [s (seq counts)]
        (recur (rest s) (+ cnt (first s)))
        cnt))))



Re: Space usage of lazy seqs

2009-12-02 Thread Johann Hibschman
On Dec 2, 9:09 pm, David Brown cloj...@davidb.org wrote:
 How much memory do you have on your machine?  A recent Sun JVM on a
 machine with a bunch of memory will consider it to be a server
 machine.  It will set the heap max to 1/4 of total physical memory
 (which suggests you might have 16GB of RAM).

I have 96 GB, so I'm not in danger of running out. I just want to
understand if I'm using the sequence functions properly, so that I can
run a few instances of this, plus some R, etc.

 You can tune the max with -Xmx1G for example, to limit it to one GB.

That's a good idea; then I'll know for sure if it's keeping a handle
to the entire file.
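A small check, sketched here with a made-up helper name, for confirming from the REPL that the cap actually took effect. Runtime.maxMemory is standard Java; the figure it reports may differ slightly from the -Xmx value.

```clojure
;; Upper bound on the heap the JVM will try to use, in megabytes.
(defn max-heap-mb []
  (quot (.maxMemory (Runtime/getRuntime)) (* 1024 1024)))

(println "max heap (MB):" (max-heap-mb))
```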

 If you're running JDK 6, you can run VisualVM or jconsole to get
 a better handle on the memory usage, and even dig into what it might
 be used for.

Ah, I'd forgotten about jconsole. Well, I'll muddle around and see
what I can figure out.

Thanks,
Johann
