So getting a much better MapReduce job count probably requires that distributed ops be implemented lazily and then reorganized when they actually run. Is there any appetite for that?
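To make the idea concrete, here is a rough sketch in plain Python, not Mahout or Scalding code. Everything in it (the LazyPipeline class, map_rows, reduce_rows, plan, run) is a hypothetical name made up for illustration. Ops are only recorded when called; at run time the planner groups consecutive per-row maps together with the reduce that follows them into one pass, which stands in for a single MapReduce job.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class LazyPipeline:
    """Hypothetical sketch: records operations instead of running them."""
    rows: List[List[float]]
    # Each entry is ("map", fn) or ("reduce", fn); nothing runs until run().
    ops: List[Tuple[str, Callable]] = field(default_factory=list)

    def map_rows(self, fn):
        # fn takes one row and returns one row.
        return LazyPipeline(self.rows, self.ops + [("map", fn)])

    def reduce_rows(self, fn):
        # fn takes the list of all rows and returns a single row.
        return LazyPipeline(self.rows, self.ops + [("reduce", fn)])

    def plan(self):
        """Group ops into passes: consecutive maps fuse, a reduce closes the pass."""
        passes, current = [], []
        for kind, fn in self.ops:
            current.append((kind, fn))
            if kind == "reduce":
                passes.append(current)
                current = []
        if current:
            passes.append(current)
        return passes

    def run(self):
        """Execute the plan; return (result rows, number of passes used)."""
        data = self.rows
        passes = self.plan()
        for stage in passes:
            maps = [fn for kind, fn in stage if kind == "map"]
            reducers = [fn for kind, fn in stage if kind == "reduce"]

            def fused(row, maps=maps):
                # Apply every map of this pass in one traversal of the row.
                for fn in maps:
                    row = fn(row)
                return row

            data = [fused(row) for row in data]
            if reducers:
                # At most one reduce per pass, always last.
                data = [reducers[0](data)]
        return data, len(passes)


if __name__ == "__main__":
    pipeline = (
        LazyPipeline([[1, 2], [3, 4]])
        .map_rows(lambda r: [2 * x for x in r])
        .map_rows(lambda r: [x + 1 for x in r])
        .reduce_rows(lambda rs: [sum(col) for col in zip(*rs)])
    )
    result, n_passes = pipeline.run()
    print(result, n_passes, "pass(es) for", len(pipeline.ops), "ops")
```

Run eagerly, the three ops above would each be their own pass; with the deferred plan they collapse into one. A real planner would also need to reorder ops and pick physical operators, which this sketch does not attempt.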
Sent from my iPhone

On Jul 27, 2013, at 8:00, Jake Mannix <[email protected]> wrote:

> But yeah, maybe we'll just be looking at two different focuses on this: I
> really care more about writing nicer MR pipelines for our jobs (I've
> already played around with a nice replacement for seq2sparse in a single
> small scalding job with modular components, it's about 1/10th the number of
> lines of our current one, with most of the functionality), and getting a
> nice integrated REPL for playing with the results.
