Re: fold over a sequence
Thanks Stuart - my Contributor Agreement is on its way. In the meantime, I've published foldable-seq as a library: https://clojars.org/foldable-seq I'd be very interested in any feedback on the code or how it works. -- paul.butcher-msgCount++ Snetterton, Castle Combe, Cadwell Park... Who says I have a one track mind? http://www.paulbutcher.com/ LinkedIn: http://www.linkedin.com/in/paulbutcher MSN: p...@paulbutcher.com AIM: paulrabutcher Skype: paulrabutcher On 12 Mar 2013, at 21:48, Stuart Sierra m...@stuartsierra.com wrote: See clojure.org/contributing it's all there. On Tuesday, March 12, 2013, Paul Butcher wrote: On 12 Mar 2013, at 18:26, Stuart Sierra the.stuart.sie...@gmail.com wrote: This might be an interesting contribution to clojure.core.reducers. I haven't looked at your code in detail, so I can't say for sure, but being able to do parallel fold over semi-lazy sequences would be very useful. I'd be delighted if this (or something like it) could make it into clojure.core.reducers, Stuart. What would I need to do to make this happen? -- paul.butcher-msgCount++ Snetterton, Castle Combe, Cadwell Park... Who says I have a one track mind? http://www.paulbutcher.com/ LinkedIn: http://www.linkedin.com/in/paulbutcher MSN: p...@paulbutcher.com AIM: paulrabutcher Skype: paulrabutcher -- -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en --- You received this message because you are subscribed to a topic in the Google Groups Clojure group. To unsubscribe from this topic, visit https://groups.google.com/d/topic/clojure/8RKCjF00ukQ/unsubscribe?hl=en. To unsubscribe from this group and all its topics, send an email to clojure+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out. -- -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en --- You received this message because you are subscribed to the Google Groups Clojure group. To unsubscribe from this group and stop receiving emails from it, send an email to clojure+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out. -- -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en --- You received this message because you are subscribed to the Google Groups Clojure group. To unsubscribe from this group and stop receiving emails from it, send an email to clojure+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
Re: fold over a sequence
how come your project depends on the problematic version 1.5.0? Jim On 13/03/13 14:03, Paul Butcher wrote: Thanks Stuart - my Contributor Agreement is on its way. In the meantime, I've published foldable-seq as a library: https://clojars.org/foldable-seq I'd be very interested in any feedback on the code or how it works. -- paul.butcher-msgCount++ Snetterton, Castle Combe, Cadwell Park... Who says I have a one track mind? http://www.paulbutcher.com http://www.paulbutcher.com// LinkedIn: http://www.linkedin.com/in/paulbutcher MSN: p...@paulbutcher.com mailto:p...@paulbutcher.com AIM: paulrabutcher Skype: paulrabutcher On 12 Mar 2013, at 21:48, Stuart Sierra m...@stuartsierra.com mailto:m...@stuartsierra.com wrote: See clojure.org/contributing http://clojure.org/contributing it's all there. On Tuesday, March 12, 2013, Paul Butcher wrote: On 12 Mar 2013, at 18:26, Stuart Sierra the.stuart.sie...@gmail.com javascript:_e({}, 'cvml', 'the.stuart.sie...@gmail.com'); wrote: This might be an interesting contribution to clojure.core.reducers. I haven't looked at your code in detail, so I can't say for sure, but being able to do parallel fold over semi-lazy sequences would be very useful. I'd be delighted if this (or something like it) could make it into clojure.core.reducers, Stuart. What would I need to do to make this happen? -- paul.butcher-msgCount++ Snetterton, Castle Combe, Cadwell Park... Who says I have a one track mind? http://www.paulbutcher.com http://www.paulbutcher.com// LinkedIn: http://www.linkedin.com/in/paulbutcher MSN: p...@paulbutcher.com javascript:_e({}, 'cvml', 'p...@paulbutcher.com'); AIM: paulrabutcher Skype: paulrabutcher -- -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com javascript:_e({}, 'cvml', 'clojure@googlegroups.com'); Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com javascript:_e({}, 'cvml', 'clojure%2bunsubscr...@googlegroups.com'); For more options, visit this group at http://groups.google.com/group/clojure?hl=en --- You received this message because you are subscribed to a topic in the Google Groups Clojure group. To unsubscribe from this topic, visit https://groups.google.com/d/topic/clojure/8RKCjF00ukQ/unsubscribe?hl=en. To unsubscribe from this group and all its topics, send an email to clojure+unsubscr...@googlegroups.com javascript:_e({}, 'cvml', 'clojure%2bunsubscr...@googlegroups.com');. For more options, visit https://groups.google.com/groups/opt_out. -- -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com mailto:clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com mailto:clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en --- You received this message because you are subscribed to the Google Groups Clojure group. To unsubscribe from this group and stop receiving emails from it, send an email to clojure+unsubscr...@googlegroups.com mailto:clojure+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out. -- -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en --- You received this message because you are subscribed to the Google Groups Clojure group. To unsubscribe from this group and stop receiving emails from it, send an email to clojure+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out. -- -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en --- You received this message because you are subscribed to the Google Groups Clojure group. To unsubscribe from this group and stop receiving emails from it, send an email to
Re: fold over a sequence
On 13 Mar 2013, at 14:05, Jim foo.bar jimpil1...@gmail.com wrote: how come your project depends on the problematic version 1.5.0? 1.5.0 is problematic? -- paul.butcher-msgCount++ Snetterton, Castle Combe, Cadwell Park... Who says I have a one track mind? http://www.paulbutcher.com/ LinkedIn: http://www.linkedin.com/in/paulbutcher MSN: p...@paulbutcher.com AIM: paulrabutcher Skype: paulrabutcher -- -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en --- You received this message because you are subscribed to the Google Groups Clojure group. To unsubscribe from this group and stop receiving emails from it, send an email to clojure+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
Re: fold over a sequence
there was a memory leak hence the 1.5.1 release the next day... Jim On 13/03/13 14:12, Paul Butcher wrote: On 13 Mar 2013, at 14:05, Jim foo.bar jimpil1...@gmail.com mailto:jimpil1...@gmail.com wrote: how come your project depends on the problematic version 1.5.0? 1.5.0 is problematic? -- paul.butcher-msgCount++ Snetterton, Castle Combe, Cadwell Park... Who says I have a one track mind? http://www.paulbutcher.com/ LinkedIn: http://www.linkedin.com/in/paulbutcher MSN: p...@paulbutcher.com mailto:p...@paulbutcher.com AIM: paulrabutcher Skype: paulrabutcher -- -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en --- You received this message because you are subscribed to the Google Groups Clojure group. To unsubscribe from this group and stop receiving emails from it, send an email to clojure+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out. -- -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en --- You received this message because you are subscribed to the Google Groups Clojure group. To unsubscribe from this group and stop receiving emails from it, send an email to clojure+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
Re: fold over a sequence
Ah - sorry, missed that. I've just released version 0.2, which depends on 1.5.1. -- paul.butcher-msgCount++ Snetterton, Castle Combe, Cadwell Park... Who says I have a one track mind? http://www.paulbutcher.com/ LinkedIn: http://www.linkedin.com/in/paulbutcher MSN: p...@paulbutcher.com AIM: paulrabutcher Skype: paulrabutcher On 13 Mar 2013, at 14:13, Jim foo.bar jimpil1...@gmail.com wrote: there was a memory leak hence the 1.5.1 release the next day... Jim On 13/03/13 14:12, Paul Butcher wrote: On 13 Mar 2013, at 14:05, Jim foo.bar jimpil1...@gmail.com wrote: how come your project depends on the problematic version 1.5.0? 1.5.0 is problematic? -- paul.butcher-msgCount++ Snetterton, Castle Combe, Cadwell Park... Who says I have a one track mind? http://www.paulbutcher.com/ LinkedIn: http://www.linkedin.com/in/paulbutcher MSN: p...@paulbutcher.com AIM: paulrabutcher Skype: paulrabutcher -- -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en --- You received this message because you are subscribed to the Google Groups Clojure group. To unsubscribe from this group and stop receiving emails from it, send an email to clojure+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out. -- -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en --- You received this message because you are subscribed to the Google Groups Clojure group. To unsubscribe from this group and stop receiving emails from it, send an email to clojure+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out. -- -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en --- You received this message because you are subscribed to the Google Groups Clojure group. To unsubscribe from this group and stop receiving emails from it, send an email to clojure+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
Re: fold over a sequence
So this turned out to be pretty easy. I've implemented a function called foldable-seq that takes a lazy sequence and turns it into something that can be folded in parallel. I've checked an example program that uses it to count words in a Wikipedia XML dump into GitHub: https://github.com/paulbutcher/foldable-seq The code for foldable-seq is here: https://github.com/paulbutcher/foldable-seq/blob/master/src/wordcount/reducers.clj#L60 On my 4-core MacBook Pro, I see a 40 second runtime without parallel-seq, 13 seconds with. -- paul.butcher-msgCount++ Snetterton, Castle Combe, Cadwell Park... Who says I have a one track mind? http://www.paulbutcher.com/ LinkedIn: http://www.linkedin.com/in/paulbutcher MSN: p...@paulbutcher.com AIM: paulrabutcher Skype: paulrabutcher On 11 Mar 2013, at 13:38, Paul Butcher p...@paulbutcher.com wrote: On 11 Mar 2013, at 11:00, Marko Topolnik marko.topol...@gmail.com wrote: The idea is to transform into a lazy sequence of eager chunks. That approach should work. Exactly. Right - I guess I should put my money where my mouth is and see if I can get it working... -- paul.butcher-msgCount++ Snetterton, Castle Combe, Cadwell Park... Who says I have a one track mind? http://www.paulbutcher.com/ LinkedIn: http://www.linkedin.com/in/paulbutcher MSN: p...@paulbutcher.com AIM: paulrabutcher Skype: paulrabutcher -- -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en --- You received this message because you are subscribed to the Google Groups Clojure group. To unsubscribe from this group and stop receiving emails from it, send an email to clojure+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
Re: fold over a sequence
Nice going :) Is it really impossible to somehow do this from the outside, through the public API? On Tuesday, March 12, 2013 2:34:43 PM UTC+1, Paul Butcher wrote: So this turned out to be pretty easy. I've implemented a function called foldable-seq that takes a lazy sequence and turns it into something that can be folded in parallel. I've checked an example program that uses it to count words in a Wikipedia XML dump into GitHub: https://github.com/paulbutcher/foldable-seq The code for foldable-seq is here: https://github.com/paulbutcher/foldable-seq/blob/master/src/wordcount/reducers.clj#L60 On my 4-core MacBook Pro, I see a 40 second runtime without parallel-seq, 13 seconds with. -- paul.butcher-msgCount++ Snetterton, Castle Combe, Cadwell Park... Who says I have a one track mind? http://www.paulbutcher.com/ LinkedIn: http://www.linkedin.com/in/paulbutcher MSN: pa...@paulbutcher.com javascript: AIM: paulrabutcher Skype: paulrabutcher On 11 Mar 2013, at 13:38, Paul Butcher pa...@paulbutcher.comjavascript: wrote: On 11 Mar 2013, at 11:00, Marko Topolnik marko.t...@gmail.comjavascript: wrote: The idea is to transform into a lazy sequence of eager chunks. That approach should work. Exactly. Right - I guess I should put my money where my mouth is and see if I can get it working... -- paul.butcher-msgCount++ Snetterton, Castle Combe, Cadwell Park... Who says I have a one track mind? http://www.paulbutcher.com/ LinkedIn: http://www.linkedin.com/in/paulbutcher MSN: pa...@paulbutcher.com javascript: AIM: paulrabutcher Skype: paulrabutcher -- -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en --- You received this message because you are subscribed to the Google Groups Clojure group. To unsubscribe from this group and stop receiving emails from it, send an email to clojure+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
Re: fold over a sequence
On 12 Mar 2013, at 13:45, Marko Topolnik marko.topol...@gmail.com wrote: Nice going :) Is it really impossible to somehow do this from the outside, through the public API? I think that it *does* do it from the outside through the public API :-) I'm just reifying the (public) CollFold protocol. I do copy a bunch of helper functions related to fork/join which are private within the reducers namespace (which isn't particularly nice). But I guess that this will go away when (if?) Clojure has a proper API to support fork/join. But I thought it better to do that than reinvent the wheel. Or am I missing something? -- paul.butcher-msgCount++ Snetterton, Castle Combe, Cadwell Park... Who says I have a one track mind? http://www.paulbutcher.com/ LinkedIn: http://www.linkedin.com/in/paulbutcher MSN: p...@paulbutcher.com AIM: paulrabutcher Skype: paulrabutcher -- -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en --- You received this message because you are subscribed to the Google Groups Clojure group. To unsubscribe from this group and stop receiving emails from it, send an email to clojure+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
Re: fold over a sequence
I've had exactly this problem trying to use reducers over a large file that wouldn't fit in memory. I tried iota, but had the issue that it was still scanning and memory mapping the entire file before it would start doing anything (pulling the whole thing through ram and taking a fair few minutes). How would feeding a line-seq into this compare to iota? And how would that compare to a version of iota tweaked to work in a slightly less eager fashion? Adam On Tuesday, March 12, 2013 1:34:43 PM UTC, Paul Butcher wrote: So this turned out to be pretty easy. I've implemented a function called foldable-seq that takes a lazy sequence and turns it into something that can be folded in parallel. I've checked an example program that uses it to count words in a Wikipedia XML dump into GitHub: https://github.com/paulbutcher/foldable-seq The code for foldable-seq is here: https://github.com/paulbutcher/foldable-seq/blob/master/src/wordcount/reducers.clj#L60 On my 4-core MacBook Pro, I see a 40 second runtime without parallel-seq, 13 seconds with. -- paul.butcher-msgCount++ Snetterton, Castle Combe, Cadwell Park... Who says I have a one track mind? http://www.paulbutcher.com/ LinkedIn: http://www.linkedin.com/in/paulbutcher MSN: pa...@paulbutcher.com javascript: AIM: paulrabutcher Skype: paulrabutcher On 11 Mar 2013, at 13:38, Paul Butcher pa...@paulbutcher.comjavascript: wrote: On 11 Mar 2013, at 11:00, Marko Topolnik marko.t...@gmail.comjavascript: wrote: The idea is to transform into a lazy sequence of eager chunks. That approach should work. Exactly. Right - I guess I should put my money where my mouth is and see if I can get it working... -- paul.butcher-msgCount++ Snetterton, Castle Combe, Cadwell Park... Who says I have a one track mind? http://www.paulbutcher.com/ LinkedIn: http://www.linkedin.com/in/paulbutcher MSN: pa...@paulbutcher.com javascript: AIM: paulrabutcher Skype: paulrabutcher -- -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en --- You received this message because you are subscribed to the Google Groups Clojure group. To unsubscribe from this group and stop receiving emails from it, send an email to clojure+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
Re: fold over a sequence
On Tuesday, March 12, 2013 2:48:52 PM UTC+1, Paul Butcher wrote: On 12 Mar 2013, at 13:45, Marko Topolnik marko.t...@gmail.comjavascript: wrote: Nice going :) Is it really impossible to somehow do this from the outside, through the public API? I think that it *does* do it from the outside through the public API :-) I'm just reifying the (public) CollFold protocol. I do copy a bunch of helper functions related to fork/join which are private within the reducers namespace (which isn't particularly nice). But I guess that this will go away when (if?) Clojure has a proper API to support fork/join. But I thought it better to do that than reinvent the wheel. That's what I meant, succeed by relying on the way f/j is used by the reducers public API, without copy-pasting the internals and using them directly. So I guess the answer is no. -Marko -- -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en --- You received this message because you are subscribed to the Google Groups Clojure group. To unsubscribe from this group and stop receiving emails from it, send an email to clojure+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
Re: fold over a sequence
On 12 Mar 2013, at 13:52, Marko Topolnik marko.topol...@gmail.com wrote: That's what I meant, succeed by relying on the way f/j is used by the reducers public API, without copy-pasting the internals and using them directly. So I guess the answer is no. I don't believe that I could - the CollFold implementation effectively does a binary chop on the sequence which would force the whole sequence to be realised if I did the same. It's a sensible strategy for a fully realised data structure that already exists in memory, but not for a lazy sequence. -- paul.butcher-msgCount++ Snetterton, Castle Combe, Cadwell Park... Who says I have a one track mind? http://www.paulbutcher.com/ LinkedIn: http://www.linkedin.com/in/paulbutcher MSN: p...@paulbutcher.com AIM: paulrabutcher Skype: paulrabutcher -- -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en --- You received this message because you are subscribed to the Google Groups Clojure group. To unsubscribe from this group and stop receiving emails from it, send an email to clojure+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
Re: fold over a sequence
On 12 Mar 2013, at 13:49, Adam Clements adam.cleme...@gmail.com wrote: How would feeding a line-seq into this compare to iota? And how would that compare to a version of iota tweaked to work in a slightly less eager fashion? It'll not suffer from the problem of having to drag the whole file into memory, but will incur the overhead of turning everything into JVM data structures that iota avoids. I don't imagine that it would be hard to modify iota to use a similar approach though. Although I imagine that Alan's better placed to have an opinion on that :-) -- paul.butcher-msgCount++ Snetterton, Castle Combe, Cadwell Park... Who says I have a one track mind? http://www.paulbutcher.com/ LinkedIn: http://www.linkedin.com/in/paulbutcher MSN: p...@paulbutcher.com AIM: paulrabutcher Skype: paulrabutcher -- -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en --- You received this message because you are subscribed to the Google Groups Clojure group. To unsubscribe from this group and stop receiving emails from it, send an email to clojure+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
Re: fold over a sequence
On Tue, Mar 12, 2013 at 11:00 PM, Paul Butcher p...@paulbutcher.com wrote: On 12 Mar 2013, at 13:49, Adam Clements adam.cleme...@gmail.com wrote: How would feeding a line-seq into this compare to iota? And how would that compare to a version of iota tweaked to work in a slightly less eager fashion? It'll not suffer from the problem of having to drag the whole file into memory, but will incur the overhead of turning everything into JVM data structures that iota avoids. I don't imagine that it would be hard to modify iota to use a similar approach though. Although I imagine that Alan's better placed to have an opinion on that :-) If you're dealing with a stream that you'll only read once, then Paul's foldseq+line-seq should work much more effectively than Iota I'd think. Iota generates an index of line numbers so (nth iota-vec 100) is O(1), which is why it'll read through the entire file on loading. If you only need to access each line once, then that initial indexing step is likely a waste. If Paul wouldn't mind I'd like to add a a similar seq function to Iota that would allow for index-less processing like he did in foldable-seq. -- -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en --- You received this message because you are subscribed to the Google Groups Clojure group. To unsubscribe from this group and stop receiving emails from it, send an email to clojure+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
Re: fold over a sequence
On 12 Mar 2013, at 15:55, Alan Busby thebu...@thebusby.com wrote: If Paul wouldn't mind I'd like to add a a similar seq function to Iota that would allow for index-less processing like he did in foldable-seq. Paul would be delighted :-) -- paul.butcher-msgCount++ Snetterton, Castle Combe, Cadwell Park... Who says I have a one track mind? http://www.paulbutcher.com/ LinkedIn: http://www.linkedin.com/in/paulbutcher MSN: p...@paulbutcher.com AIM: paulrabutcher Skype: paulrabutcher -- -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en --- You received this message because you are subscribed to the Google Groups Clojure group. To unsubscribe from this group and stop receiving emails from it, send an email to clojure+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
Re: fold over a sequence
Hi Paul, This might be an interesting contribution to clojure.core.reducers. I haven't looked at your code in detail, so I can't say for sure, but being able to do parallel fold over semi-lazy sequences would be very useful. -S On Tuesday, March 12, 2013 9:34:43 AM UTC-4, Paul Butcher wrote: So this turned out to be pretty easy. I've implemented a function called foldable-seq that takes a lazy sequence and turns it into something that can be folded in parallel. I've checked an example program that uses it to count words in a Wikipedia XML dump into GitHub: https://github.com/paulbutcher/foldable-seq The code for foldable-seq is here: https://github.com/paulbutcher/foldable-seq/blob/master/src/wordcount/reducers.clj#L60 On my 4-core MacBook Pro, I see a 40 second runtime without parallel-seq, 13 seconds with. -- paul.butcher-msgCount++ Snetterton, Castle Combe, Cadwell Park... Who says I have a one track mind? http://www.paulbutcher.com/ LinkedIn: http://www.linkedin.com/in/paulbutcher MSN: pa...@paulbutcher.com javascript: AIM: paulrabutcher Skype: paulrabutcher On 11 Mar 2013, at 13:38, Paul Butcher pa...@paulbutcher.comjavascript: wrote: On 11 Mar 2013, at 11:00, Marko Topolnik marko.t...@gmail.comjavascript: wrote: The idea is to transform into a lazy sequence of eager chunks. That approach should work. Exactly. Right - I guess I should put my money where my mouth is and see if I can get it working... -- paul.butcher-msgCount++ Snetterton, Castle Combe, Cadwell Park... Who says I have a one track mind? http://www.paulbutcher.com/ LinkedIn: http://www.linkedin.com/in/paulbutcher MSN: pa...@paulbutcher.com javascript: AIM: paulrabutcher Skype: paulrabutcher -- -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en --- You received this message because you are subscribed to the Google Groups Clojure group. To unsubscribe from this group and stop receiving emails from it, send an email to clojure+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
Re: fold over a sequence
On 12 Mar 2013, at 18:26, Stuart Sierra the.stuart.sie...@gmail.com wrote: This might be an interesting contribution to clojure.core.reducers. I haven't looked at your code in detail, so I can't say for sure, but being able to do parallel fold over semi-lazy sequences would be very useful. I'd be delighted if this (or something like it) could make it into clojure.core.reducers, Stuart. What would I need to do to make this happen? -- paul.butcher-msgCount++ Snetterton, Castle Combe, Cadwell Park... Who says I have a one track mind? http://www.paulbutcher.com/ LinkedIn: http://www.linkedin.com/in/paulbutcher MSN: p...@paulbutcher.com AIM: paulrabutcher Skype: paulrabutcher -- -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en --- You received this message because you are subscribed to the Google Groups Clojure group. To unsubscribe from this group and stop receiving emails from it, send an email to clojure+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
Re: fold over a sequence
See clojure.org/contributing it's all there. On Tuesday, March 12, 2013, Paul Butcher wrote: On 12 Mar 2013, at 18:26, Stuart Sierra the.stuart.sie...@gmail.comjavascript:_e({}, 'cvml', 'the.stuart.sie...@gmail.com'); wrote: This might be an interesting contribution to clojure.core.reducers. I haven't looked at your code in detail, so I can't say for sure, but being able to do parallel fold over semi-lazy sequences would be very useful. I'd be delighted if this (or something like it) could make it into clojure.core.reducers, Stuart. What would I need to do to make this happen? -- paul.butcher-msgCount++ Snetterton, Castle Combe, Cadwell Park... Who says I have a one track mind? http://www.paulbutcher.com/ LinkedIn: http://www.linkedin.com/in/paulbutcher MSN: p...@paulbutcher.com javascript:_e({}, 'cvml', 'p...@paulbutcher.com'); AIM: paulrabutcher Skype: paulrabutcher -- -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.comjavascript:_e({}, 'cvml', 'clojure@googlegroups.com'); Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com javascript:_e({}, 'cvml', 'clojure%2bunsubscr...@googlegroups.com'); For more options, visit this group at http://groups.google.com/group/clojure?hl=en --- You received this message because you are subscribed to a topic in the Google Groups Clojure group. To unsubscribe from this topic, visit https://groups.google.com/d/topic/clojure/8RKCjF00ukQ/unsubscribe?hl=en. To unsubscribe from this group and all its topics, send an email to clojure+unsubscr...@googlegroups.com javascript:_e({}, 'cvml', 'clojure%2bunsubscr...@googlegroups.com');. For more options, visit https://groups.google.com/groups/opt_out. -- -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en --- You received this message because you are subscribed to the Google Groups Clojure group. To unsubscribe from this group and stop receiving emails from it, send an email to clojure+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
Re: fold over a sequence
I don't think you will be able to do a parallel fold on a lazy-seq which is what clojure.data.xml/parse returns. Vectors are the only persistent collection that supports parallel fold and something tells me it's because they are NOT lazy... why can't you 'vec' the result of xml/parse and then use fold on that? Is it a massive seq? Jim On 11/03/13 00:40, Paul Butcher wrote: As things currently stand, fold can be used on a sequence-based reducible collection, but won't be parallel. I'm currently working on code that processes XML generated by clojure.data.xml/parse, and would love to do so in parallel. I can't immediately see any reason why it wouldn't be possible to create a version of CollFold that takes a sequence and chunks it so that it can be folded in parallel. Has anyone tried this yet? Is there some gotcha lurking to catch me out? -- paul.butcher-msgCount++ Snetterton, Castle Combe, Cadwell Park... Who says I have a one track mind? http://www.paulbutcher.com http://www.paulbutcher.com// LinkedIn: http://www.linkedin.com/in/paulbutcher MSN: p...@paulbutcher.com mailto:p...@paulbutcher.com AIM: paulrabutcher Skype: paulrabutcher -- -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en --- You received this message because you are subscribed to the Google Groups Clojure group. To unsubscribe from this group and stop receiving emails from it, send an email to clojure+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out. -- -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en --- You received this message because you are subscribed to the Google Groups Clojure group. To unsubscribe from this group and stop receiving emails from it, send an email to clojure+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
Re: fold over a sequence
The idea is to transform into a lazy sequence of eager chunks. That approach should work. On Monday, March 11, 2013 11:40:01 AM UTC+1, Jim foo.bar wrote: I don't think you will be able to do a parallel fold on a lazy-seq which is what clojure.data.xml/parse returns. Vectors are the only persistent collection that supports parallel fold and something tells me it's because they are NOT lazy... why can't you 'vec' the result of xml/parse and then use fold on that? Is it a massive seq? Jim On 11/03/13 00:40, Paul Butcher wrote: As things currently stand, fold can be used on a sequence-based reducible collection, but won't be parallel. I'm currently working on code that processes XML generated by clojure.data.xml/parse, and would love to do so in parallel. I can't immediately see any reason why it wouldn't be possible to create a version of CollFold that takes a sequence and chunks it so that it can be folded in parallel. Has anyone tried this yet? Is there some gotcha lurking to catch me out? -- paul.butcher-msgCount++ Snetterton, Castle Combe, Cadwell Park... Who says I have a one track mind? http://www.paulbutcher.com/ LinkedIn: http://www.linkedin.com/in/paulbutcher MSN: pa...@paulbutcher.com javascript: AIM: paulrabutcher Skype: paulrabutcher -- -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clo...@googlegroups.com javascript: Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+u...@googlegroups.com javascript: For more options, visit this group at http://groups.google.com/group/clojure?hl=en --- You received this message because you are subscribed to the Google Groups Clojure group. To unsubscribe from this group and stop receiving emails from it, send an email to clojure+u...@googlegroups.com javascript:. For more options, visit https://groups.google.com/groups/opt_out. -- -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en --- You received this message because you are subscribed to the Google Groups Clojure group. To unsubscribe from this group and stop receiving emails from it, send an email to clojure+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
Re: fold over a sequence
On 11 Mar 2013, at 10:40, Jim foo.bar jimpil1...@gmail.com wrote: why can't you 'vec' the result of xml/parse and then use fold on that? Is it a massive seq? In my case, it's the Wikipedia XML dump, so around 40GiB (so no, that wouldn't work :-) -- paul.butcher-msgCount++ Snetterton, Castle Combe, Cadwell Park... Who says I have a one track mind? http://www.paulbutcher.com/ LinkedIn: http://www.linkedin.com/in/paulbutcher MSN: p...@paulbutcher.com AIM: paulrabutcher Skype: paulrabutcher -- -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en --- You received this message because you are subscribed to the Google Groups Clojure group. To unsubscribe from this group and stop receiving emails from it, send an email to clojure+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
Re: fold over a sequence
On 11 Mar 2013, at 11:00, Marko Topolnik marko.topol...@gmail.com wrote: The idea is to transform into a lazy sequence of eager chunks. That approach should work. Exactly. Right - I guess I should put my money where my mouth is and see if I can get it working... -- paul.butcher-msgCount++ Snetterton, Castle Combe, Cadwell Park... Who says I have a one track mind? http://www.paulbutcher.com/ LinkedIn: http://www.linkedin.com/in/paulbutcher MSN: p...@paulbutcher.com AIM: paulrabutcher Skype: paulrabutcher -- -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en --- You received this message because you are subscribed to the Google Groups Clojure group. To unsubscribe from this group and stop receiving emails from it, send an email to clojure+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
fold over a sequence
As things currently stand, fold can be used on a sequence-based reducible collection, but won't be parallel. I'm currently working on code that processes XML generated by clojure.data.xml/parse, and would love to do so in parallel. I can't immediately see any reason why it wouldn't be possible to create a version of CollFold that takes a sequence and chunks it so that it can be folded in parallel. Has anyone tried this yet? Is there some gotcha lurking to catch me out? -- paul.butcher-msgCount++ Snetterton, Castle Combe, Cadwell Park... Who says I have a one track mind? http://www.paulbutcher.com/ LinkedIn: http://www.linkedin.com/in/paulbutcher MSN: p...@paulbutcher.com AIM: paulrabutcher Skype: paulrabutcher -- -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en --- You received this message because you are subscribed to the Google Groups Clojure group. To unsubscribe from this group and stop receiving emails from it, send an email to clojure+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
Re: fold over a sequence
On Mar 10, 2013, at 17:40, Paul Butcher wrote: As things currently stand, fold can be used on a sequence- based reducible collection, but won't be parallel. I'm currently working on code that processes XML generated by clojure.data.xml/parse, and would love to do so in parallel. I can't immediately see any reason why it wouldn't be possible to create a version of CollFold that takes a sequence and chunks it so that it can be folded in parallel. Has anyone tried this yet? Is there some gotcha lurking to catch me out? I love the idea; dunno if there are any gremlins lurking. BTW, if you haven't already done so, be sure to watch Guy Steele's talk: How to Think about Parallel Programming: Not! http://www.infoq.com/presentations/Thinking-Parallel-Programming The first third of the talk (~20 minutes) is a wonderful discussion of extreme hacking (eg, a single-card dump program for an IBM 1130). Guy then segues into a discussion of giving up control of details in order to let the system take care of things (eg, memory use). The last part of the talk is about the conceptual basis for order- independent folding of calculations. So, required viewing (IMHO) if you're thinking about extending reducers, etc. -r -- http://www.cfcl.com/rdmRich Morin http://www.cfcl.com/rdm/resume r...@cfcl.com http://www.cfcl.com/rdm/weblog +1 650-873-7841 Software system design, development, and documentation -- -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en --- You received this message because you are subscribed to the Google Groups Clojure group. To unsubscribe from this group and stop receiving emails from it, send an email to clojure+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
Re: fold over a sequence
On Mon, Mar 11, 2013 at 9:40 AM, Paul Butcher p...@paulbutcher.com wrote: I'm currently working on code that processes XML generated by clojure.data.xml/parse, and would love to do so in parallel. I can't immediately see any reason why it wouldn't be possible to create a version of CollFold that takes a sequence and chunks it so that it can be folded in parallel. Has anyone tried this yet? Is there some gotcha lurking to catch me out? I've done something like the above here; https://github.com/thebusby/iota/blob/master/src/clj/iota.clj I haven't run in to any gotcha's yet after 6+ months of extensive use. I did have to copy/paste the definitions for pool, fjtask, fjinvoke, fjfork, fjjoin from core/reducers.clj and keep them from being AOT compiled (for Java 1.7 and 1.6 compatibility). I'm not pleased with the way the thread pool is inaccessible in core/reducers.clj because I'm not aware of any mechanism to have fold *NOT* use every available resource. Hope this helps, Alan -- -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en --- You received this message because you are subscribed to the Google Groups Clojure group. To unsubscribe from this group and stop receiving emails from it, send an email to clojure+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.