Re: fold over a sequence

2013-03-13 Thread Paul Butcher
Ah - sorry, missed that. I've just released version 0.2, which depends on 1.5.1. -- paul.butcher->msgCount++ Snetterton, Castle Combe, Cadwell Park... Who says I have a one track mind? http://www.paulbutcher.com/ LinkedIn: http://www.linkedin.com/in/paulbutcher MSN: p...@paulbutcher.com AIM: pau

Re: fold over a sequence

2013-03-13 Thread Jim foo.bar
there was a memory leak hence the 1.5.1 release the next day... Jim On 13/03/13 14:12, Paul Butcher wrote: On 13 Mar 2013, at 14:05, "Jim foo.bar" wrote: how come your project depends on the problematic version 1.5.0? 1.5.0 is problematic?

Re: fold over a sequence

2013-03-13 Thread Paul Butcher
On 13 Mar 2013, at 14:05, "Jim foo.bar" wrote: > how come your project depends on the problematic version 1.5.0? 1.5.0 is problematic?

Re: fold over a sequence

2013-03-13 Thread Jim foo.bar
how come your project depends on the problematic version 1.5.0? Jim On 13/03/13 14:03, Paul Butcher wrote: Thanks Stuart - my Contributor Agreement is on its way. In the meantime, I've published foldable-seq as a library: https://clojars.org/foldable-seq I'd be very interested in any feedbac

Re: fold over a sequence

2013-03-13 Thread Paul Butcher
Thanks Stuart - my Contributor Agreement is on its way. In the meantime, I've published foldable-seq as a library: https://clojars.org/foldable-seq I'd be very interested in any feedback on the code or how it works.

Re: fold over a sequence

2013-03-12 Thread Stuart Sierra
See clojure.org/contributing; it's all there. On Tuesday, March 12, 2013, Paul Butcher wrote: > On 12 Mar 2013, at 18:26, Stuart Sierra wrote: > > This might be an interesting contribution to clojure.core.reducers. I > haven't looked at your code in detail,

Re: fold over a sequence

2013-03-12 Thread Paul Butcher
On 12 Mar 2013, at 18:26, Stuart Sierra wrote: > This might be an interesting contribution to clojure.core.reducers. I haven't > looked at your code in detail, so I can't say for sure, but being able to do > parallel fold over semi-lazy sequences would be very useful. I'd be delighted if this

Re: fold over a sequence

2013-03-12 Thread Stuart Sierra
Hi Paul, This might be an interesting contribution to clojure.core.reducers. I haven't looked at your code in detail, so I can't say for sure, but being able to do parallel fold over semi-lazy sequences would be very useful. -S On Tuesday, March 12, 2013 9:34:43 AM UTC-4, Paul Butcher wrote:

Re: fold over a sequence

2013-03-12 Thread Paul Butcher
On 12 Mar 2013, at 15:55, Alan Busby wrote: > If Paul wouldn't mind I'd like to add a similar "seq" function to Iota that > would allow for index-less processing like he did in foldable-seq. Paul would be delighted :-)

Re: fold over a sequence

2013-03-12 Thread Alan Busby
On Tue, Mar 12, 2013 at 11:00 PM, Paul Butcher wrote: > On 12 Mar 2013, at 13:49, Adam Clements wrote: > > How would feeding a line-seq into this compare to iota? And how would that > compare to a version of iota tweaked to work in a slightly less eager > fashion? > > > It'll not suffer from the

Re: fold over a sequence

2013-03-12 Thread Paul Butcher
On 12 Mar 2013, at 13:49, Adam Clements wrote: > How would feeding a line-seq into this compare to iota? And how would that > compare to a version of iota tweaked to work in a slightly less eager fashion? It'll not suffer from the problem of having to drag the whole file into memory, but will

Re: fold over a sequence

2013-03-12 Thread Paul Butcher
On 12 Mar 2013, at 13:52, Marko Topolnik wrote: > That's what I meant, succeed by relying on the way f/j is used by the > reducers public API, without copy-pasting the internals and using them > directly. So I guess the answer is "no". I don't believe that I could - the CollFold implementation

Re: fold over a sequence

2013-03-12 Thread Marko Topolnik
On Tuesday, March 12, 2013 2:48:52 PM UTC+1, Paul Butcher wrote: > On 12 Mar 2013, at 13:45, Marko Topolnik wrote: > > Nice going :) Is it really impossible to somehow do this from the outside, > through the public API? > > > I think that it *does* do it from the outside through the public A

Re: fold over a sequence

2013-03-12 Thread Adam Clements
I've had exactly this problem trying to use reducers over a large file that wouldn't fit in memory. I tried iota, but had the issue that it was still scanning and memory mapping the entire file before it would start doing anything (pulling the whole thing through ram and taking a fair few minut

Re: fold over a sequence

2013-03-12 Thread Paul Butcher
On 12 Mar 2013, at 13:45, Marko Topolnik wrote: > Nice going :) Is it really impossible to somehow do this from the outside, > through the public API? I think that it *does* do it from the outside through the public API :-) I'm just reifying the (public) CollFold protocol. I do copy a bunch

Re: fold over a sequence

2013-03-12 Thread Marko Topolnik
Nice going :) Is it really impossible to somehow do this from the outside, through the public API? On Tuesday, March 12, 2013 2:34:43 PM UTC+1, Paul Butcher wrote: > > So this turned out to be pretty easy. I've implemented a function called > "foldable-seq" that takes a lazy sequence and turns i

Re: fold over a sequence

2013-03-12 Thread Paul Butcher
So this turned out to be pretty easy. I've implemented a function called "foldable-seq" that takes a lazy sequence and turns it into something that can be folded in parallel. I've checked an example program that uses it to count words in a Wikipedia XML dump into GitHub: https://github.com/paul

Re: fold over a sequence

2013-03-11 Thread Paul Butcher
On 11 Mar 2013, at 11:00, Marko Topolnik wrote: > The idea is to transform into a lazy sequence of eager chunks. That approach > should work. Exactly. Right - I guess I should put my money where my mouth is and see if I can get it working...

Re: fold over a sequence

2013-03-11 Thread Paul Butcher
On 11 Mar 2013, at 10:40, Jim foo.bar wrote: > why can't you 'vec' the result of xml/parse and then use fold on that? Is it > a massive seq? In my case, it's the Wikipedia XML dump, so around 40GiB (so no, that wouldn't work :-)

Re: fold over a sequence

2013-03-11 Thread Marko Topolnik
The idea is to transform into a lazy sequence of eager chunks. That approach should work. On Monday, March 11, 2013 11:40:01 AM UTC+1, Jim foo.bar wrote: > > I don't think you will be able to do a parallel fold on a lazy-seq which > is what clojure.data.xml/parse returns. Vectors are the only p
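
The "lazy sequence of eager chunks" idea can be sketched roughly as follows. This is only an illustration, not the foldable-seq implementation: chunked-fold is a hypothetical helper, and it assumes it is acceptable to fold each chunk in parallel while handling the chunks themselves one after another.

```clojure
(require '[clojure.core.reducers :as r])

;; Hypothetical helper (not part of clojure.core.reducers or foldable-seq).
;; Realises the input one chunk at a time, so only the chunk currently
;; being folded has to be held as a vector.
(defn chunked-fold
  [chunk-size combinef reducef coll]
  (->> (partition-all chunk-size coll)     ; lazy sequence of chunks
       (map vec)                           ; each chunk becomes an eager vector
       (map #(r/fold combinef reducef %))  ; vectors are foldable in parallel
       (reduce combinef (combinef))))      ; combine chunk results in order

;; Usage: sum a sequence without turning all of it into one vector.
(chunked-fold 10000 + + (range 1000000))
```

As with r/fold itself, combinef has to be associative and return an identity value when called with no arguments. The chunks are combined sequentially here, so the parallelism is limited to the work inside each chunk.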

Re: fold over a sequence

2013-03-11 Thread Jim foo.bar
I don't think you will be able to do a parallel fold on a lazy-seq which is what clojure.data.xml/parse returns. Vectors are the only persistent collection that supports parallel fold and something tells me it's because they are NOT lazy... why can't you 'vec' the result of xml/parse and then

Re: fold over a sequence

2013-03-10 Thread Alan Busby
On Mon, Mar 11, 2013 at 9:40 AM, Paul Butcher wrote: > I'm currently working on code that processes XML generated by > clojure.data.xml/parse, and would love to do so in parallel. I can't > immediately see any reason why it wouldn't be possible to create a version > of CollFold that takes a seque

Re: fold over a sequence

2013-03-10 Thread Rich Morin
On Mar 10, 2013, at 17:40, Paul Butcher wrote: > As things currently stand, fold can be used on a sequence- > based reducible collection, but won't be parallel. > > I'm currently working on code that processes XML generated by > clojure.data.xml/parse, and would love to do so in parallel. > I can

fold over a sequence

2013-03-10 Thread Paul Butcher
As things currently stand, fold can be used on a sequence-based reducible collection, but won't be parallel. I'm currently working on code that processes XML generated by clojure.data.xml/parse, and would love to do so in parallel. I can't immediately see any reason why it wouldn't be possible
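
A small sketch of the behaviour described above, assuming nothing beyond clojure.core.reducers (the names xs, seq-total and vec-total are made up for the example). Both calls return the same total; the difference is that the sequence version falls back to an ordinary sequential reduce, while the vector version is eligible for parallel fork/join.

```clojure
(require '[clojure.core.reducers :as r])

(def xs (range 100000))   ; a sequence, not a vector

;; fold over a sequence-based reducible: works, but runs as a plain reduce
(def seq-total (r/fold + (r/map inc xs)))

;; the same fold over a vector: can be split and folded in parallel
(def vec-total (r/fold + (r/map inc (vec xs))))

(= seq-total vec-total)
```

Calling vec is only an option when the whole collection fits in memory, which is the limitation this thread is about.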