On 2/13/15 3:45 PM, Peter Alexander wrote:
On Friday, 13 February 2015 at 18:32:35 UTC, Andrei Alexandrescu wrote:
* Perhaps rename groupBy to chunkBy. People coming from SQL and other
languages might expect groupBy to do hash-based grouping.
Agreed.
* The unary function implementation must return for each group a tuple
consisting of the key and the lazy range of values. The binary
function implementation should continue to only return the lazy range
of values.
Is the purpose of this just to avoid the user potentially needing to
evaluate the key function twice?
Yah. Also in many cases of grouping you need the key anyway.
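
A minimal sketch of the two result shapes described in the bullet above, assuming a Phobos release in which chunkBy behaves as proposed (tuple of key and sub-range for a unary predicate, bare sub-range for a binary one). The names Entry, chunkKeys and chunkLengths are hypothetical helpers introduced only for illustration.

```d
import std.algorithm.iteration : chunkBy, map;
import std.array : array;
import std.range.primitives : walkLength;
import std.stdio : writeln;
import std.typecons : Tuple, tuple;

// Hypothetical (name, amount) record used only in this sketch.
alias Entry = Tuple!(string, int);

// Unary predicate: each chunk is a tuple, chunk[0] is the key and
// chunk[1] is the lazy sub-range, so the key is not recomputed here.
string[] chunkKeys(Entry[] data)
{
    return data.chunkBy!(e => e[0]).map!(chunk => chunk[0]).array;
}

// Binary predicate: each chunk is only the lazy sub-range.
size_t[] chunkLengths(Entry[] data)
{
    return data.chunkBy!((a, b) => a[0] == b[0])
               .map!(chunk => chunk.walkLength)
               .array;
}

void main()
{
    // "John" appears twice in a row and then once more after "Jane".
    auto data = [tuple("John", 100), tuple("John", 35),
                 tuple("Jane", 200), tuple("John", 7)];
    writeln(chunkKeys(data));
    writeln(chunkLengths(data));
}
```

Only adjacent elements are merged, so the trailing "John" entry forms its own chunk. That is the difference from hash-based grouping that motivates the rename in the first bullet.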
* SortedRange should add a method called group(). Invoked with no
predicate, group() should do what chunkBy does, using the sorting
predicate.
Will need to be called something else since there may be existing code
trying to call std.algorithm.group using UFCS. This would change its
behaviour.
Oops, I thought that's groups. I guess we could call it groupBy as well,
even though it has no predicate so "by" does not participate in a sentence.
* aggregate() should detect the two kinds of results per group (well,
chunk) and process them accordingly: for unary-predicate chunks, pass
the key through and only process the lazy range. Meaning:
    auto data = [
        tuple("John", 100),
        tuple("John", 35),
        tuple("Jane", 200),
        tuple("Jane", 87),
    ];
    auto r = data.chunkBy!(x => x[0]).aggregate!sum;
yields a range of tuples: tuple("John", 135), tuple("Jane", 287).
Not sure I understand how this is meant to work.
With your second bullet implemented, data.chunkBy!(x => x[0]) will return:
tuple("John", [tuple("John", 100), tuple("John", 35)]),
tuple("Jane", [tuple("Jane", 200), tuple("Jane", 87)])
Correct.
(here [...] denotes the sub-range, not an array).
So aggregate will ignore the key part, but how does it know to ignore
the name in sub-ranges?
Oops, I was wrong here. Let's think about aggregate() integration
post-2.067 and remove it for now.
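
Until aggregate() is settled, the per-key sum can be spelled out by hand. The following is only a sketch, assuming chunkBy with a unary predicate yields (key, sub-range) tuples as discussed above. Entry and sumByKey are hypothetical names, and the inner map is the step that answers the question about ignoring the name inside each sub-range.

```d
import std.algorithm.iteration : chunkBy, map, sum;
import std.array : array;
import std.stdio : writeln;
import std.typecons : Tuple, tuple;

// Hypothetical (name, amount) record used only in this sketch.
alias Entry = Tuple!(string, int);

// For every chunk keep the key and sum the second field of its elements.
// The projection e => e[1] must be given explicitly, because a generic
// aggregate cannot guess which field of the element to reduce.
Entry[] sumByKey(Entry[] data)
{
    return data
        .chunkBy!(e => e[0])
        .map!(chunk => tuple(chunk[0], chunk[1].map!(e => e[1]).sum))
        .array;
}

void main()
{
    auto data = [
        tuple("John", 100),
        tuple("John", 35),
        tuple("Jane", 200),
        tuple("Jane", 87),
    ];
    writeln(sumByKey(data));
}
```

The explicit projection is the piece a future aggregate() would have to take as a parameter or infer by some rule.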
Peter, could you please take this?
Andrei