[jira] [Commented] (FUNCTOR-1) [functor] New components: summarize and aggregate

Liviu Tudor (JIRA) Wed, 23 May 2012 00:11:51 -0700

    [ 
https://issues.apache.org/jira/browse/FUNCTOR-1?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13281433#comment-13281433
 ]


Liviu Tudor commented on FUNCTOR-1:
-----------------------------------

{quote}
Hi Liviu!

First of all, it's very clear you have put a lot of effort on this patch. I've 
applied at a local working copy of functor trunk, and it is very neat indeed. I 
wish my patches were as good as yours :-)
{quote}

Thank you! To be honest I think the quality of my previous patch was well under 
average on that -- I am still wondering how Simone had so much patience to 
actually go through it :) -- so I thought I'd put some serious effort in making 
it at least readable, so at least if this patch turns out to be a dud people 
don't waste too much time on trying to figure it out :)


{quote}
I'm starting to learn about aggregators now, while reading the code and 
documents that you wrote. So feel free to correct or comment any of my points 
below.

* The documentation and examples provided are very good :-) However, I think 
you used <br/> to break line between paragraph, while the other examples use 
<p>. The visual is a bit different.
{quote}
Very good point! I never picked on that myself -- I'm a sucker for closing and 
opening tags (e.g. <p> ..... </p>) so out of sheer laziness I guess I find 
myself using the <br/> a lot :) I went through the code and replaced all the 
<br/> with <p> so it should be all uniform now. (Wouldn't mind a second pair of 
eyes to verify that I haven't missed a <br/> though, if you have the time!)


{quote}
* After I read the issue and applied the patch (I had a quick view on it 
before, wow, so many tests :) I opened Eclipse and looked for the aggregator 
package, but it's aggregate. I think functor, generator and adapter package 
names follow the name of the pattern they implement. And I have seen aggregate 
and aggregator as the name of the pattern. What do you think?
{quote}
You have a fair point, again -- the "aggregate" name stems really from the 
first patch sent which was quite different to what "aggregator" has become in 
the last patch I've submitted. It somehow never crossed my mind after the 
refactoring(s) to change the package to "aggregator" -- though now that you 
mentioned it, it makes perfect sense! :) So I went ahead and moved packages 
under "aggregator".



{quote}
* How does an aggregator differs from the reduce function (like in map reduce)? 
IIUC, aggregators are like the fold function. A "nostore aggregator" is more 
like a fold with recursion, while the list aggregator is a fold function which 
stores its partial value somewhere. Maybe we could use FoldLeft with 
aggregators?
{quote}
I think you might be right -- gonna go and have a look at {{FoldLeft}} -- 
haven't considered that approach to be honest (nor am I that familiar with the 
class itself -- though I am aware of the whole map/reduce paradigm).



{quote}
* In the o.a.c.functor.aggregate.functions package, there are several 
aggregator functors for Double and Integer. The name of aggregators for Double 
start with Double. But the name of aggregators for Integer start with Int. In 
o.a.c.functor.generator.util, there are LongRange and IntegerRange. So I think 
in this case we could put both as Int, or both as Integer. What do you think?
{quote}
My laziness strikes again :) I don't like typing long class names -- even 
though I'm an Eclipse user and make full usage of all the wonders of 
autocomplete etc, long class names don't sit well with me (showing my old age 
here, see, I used a lot of "dumb" editors in my time when terminals were green 
and black :) -- hence my choosing Int versus Integer. However, that could be 
misleading : Integer suggest java.lang.Integer wheres Int in the name could 
suggest the primitive int.
Since we already used the convention of Integer for when java.lang.Integer is 
used in functor, I went and changed all the names to include Integer rather 
than just Int. Thanks for the heads-up on this!


{quote}
* IIUC, in o.a.c.functor.aggregate there are classes for creating nostore and 
list aggregators. However, if I create a ArrayListBackedAggregator<T>, this 
aggregator extends AbstractTimedAggregator<T>. What means 
AbsractTimedAggregator<Integer> aggr = new 
ArrayListBackedAggregator<Integer>(); would be valid (not tested, but I think 
it works). I think these classes could be designed in some other way to avoid 
this scenario... just food for thought :)
{quote}
Yes, that is the intended behaviour: in other words the implementations 
provided all are meant to have built-in timer support. I have defined the 
top-level interface Aggregator with no timer support as there might be other 
scenarios the Aggregator can be used in, which don't require timer support 
(even though that can be achieved with a null timer in AbstractTimedAggregator).
Basically this package appeared from finding myself quite often writing similar 
aggregators for various apps or components I was working on -- and every time I 
needed this data to be flushed into a sink (typically logged in a log file or 
over a socket) regularly. As I said, this isn't to say Aggregator *must* be 
used in a timer context -- hence, as I said, the Aggregator interface doesn't 
make any reference to that.
Do you think these interfaces / classes should be structured some other way 
then? (Or is it perhaps the fact the naming of classes and interfaces is it not 
the best?)



{quote}
* AbstractTimedAggregator<T> implements TimedAggregator<T>. But in 
TimedAggregator#onTimer, the first parameter is an AbstractTimedAggregator. I 
think this could be replaced by a TimedAggregator. What do you think? 
Otherwise, it would be like the interface had a dependency to a class that 
implements it. And if another class implemented TimedAggregator, it would have 
a dependency to AbstractTimedAggregator too. Or if AbstractTimedAggregator was 
deprecated/removed, it would cause problems in TimedAggregator interface.
{quote}
There is no interface {{TimedAggregator}} -- it's just {{Aggregator}} and 
{{AbstractTimedAggregator}} implements it. The {{TimedAggregatorListener}} 
indeed has a dependency on the {{AbstractTimedAggregator}} class which I agree 
is not the best -- perhaps, a reason for introducing a {{TimedAggregator}} 
interface and work at interface level in {{TimedAggregatorListener}}?
Though at the same time, the only reason why {{AbstractTimedAggregator}} is 
mentioned in {{TimedAggregatorListener}} is because the base interface 
{{Aggregator}} has no concept or mention of timer, so while it is possible to 
have {{TimedAggregatorListener}} declare a method like this:
{code:java}
void onTimer(Aggregator<T> aggregator, T evaluation);
{code}
It just didn't make sense to me, since like I said, {{Aggregator}} doesn't have 
a concept of timer.
So the more I think about it, I could introduce a {{TimedAggregator}} interface 
which would extend {{Aggregator}} and offer 3 other methods:
* addListener
* removeListener
* timer
What do you think? Or should we use {{Aggregator}} in 
{{TimedAggregatorListener}}?



{quote}
The docs, examples and coverage are great. No issues in PMD, FindBugs or 
Checkstyle found so far. All licenses in place, as well as package javadoc. 
Next time I submit a patch I hope it can as neat as yours :-)
{quote}
Thank you!



{quote}
There is an issue in Google Guava for similar feature, have you seen it 
(http://code.google.com/p/guava-libraries/issues/detail?id=546)?
{quote}
I wasn't aware of it to be honest. I'll try to have a look at it but to be 
honest, I fear timing is not my most abundant resource and I'd rather spend 
most of the time coding functor :) I'll have a look though, just to see what 
they came up with.


I'm attaching an update to the patch (labelled 
{{functor.22-may-2012.patch.bz2}}) with the above changes -- would appreciate 
your comments on this!
                
> [functor] New components: summarize and aggregate
> -------------------------------------------------
>
>                 Key: FUNCTOR-1
>                 URL: https://issues.apache.org/jira/browse/FUNCTOR-1
>             Project: Commons Functor
>          Issue Type: New Feature
>         Environment: JDK 1.6.0_25 but should work with any JDK 5+ (possibly 
> 1.4 though I haven't tested).
>            Reporter: Liviu Tudor
>            Assignee: Simone Tripodi
>            Priority: Minor
>              Labels: features
>         Attachments: commons-functor-aggregate+summarizer.zip, 
> commons-functor.patch.bz2, functor.22-may-2012.patch.bz2, functor.patch.bz2, 
> functor.patch.bz2
>
>
> This is the next step from https://issues.apache.org/jira/browse/SANDBOX-340 
> -- as instructed I'm finally hoping to get the code in the right place and 
> hopefully this is something that the functor component could do with.
> Whereas initially I just started with the summarizer component, I have added 
> now the second one, the "aggregator" as they are somehow related. If this 
> code proves to be useful to functor in any way, it would actually be good to 
> get some feedback on these 2 to see if the class hierarchy can in fact be 
> changed to share some common functionality as I feel (probably due to the 
> similar needs that lead to writing/using these components) that somehow they 
> should share a common base.
> In brief, the 2 components:
> * aggregator: this just allows for data to be aggregated in a user defined 
> way (e.g. stored in a list for the purpose of averaging, computing the 
> arithmetic median etc). The classes provided actually offer the 
> implementation for storing data in a list and computing the above-mentioned 
> values or summing up everything.
> * timed summarizer: this is another variation of the aggreator, however, it 
> adds the idea of regular "flushes", so based on a timer it will reset the 
> value and start summing/aggregating the data again. Rather than using an 
> aggregator which would store the whole data series (possibly for applying 
> more complex formulas), this component just computes on the fly on each 
> request the formula and stores the result of it. (Which does mean things like 
> computing arithmetic mean, median etc would be difficult to compute without 
> knowing upfront how many calls will be received -- i.e. how many elements we 
> will be required to summarize/aggregate.) So the memory footprint of running 
> this is much smaller -- even though, as I said, it achieves similar results. 
> I have only provided a summarizer which operates on integers, but obviously 
> others for float, double etc can be created if we go ahead with this design.
> Hopefully the above make sense; this code has resulted from finding myself 
> writing similar components to these a few times and because it's always been 
> either one type (e.g. aggregator) or another (summarizer) I haven't given 
> quite possibly enough thought to the class design to join these 2. Also, 
> unfortunately, the time I sat down to make these components a bit more 
> general and submitted issue 340 was nearly 3 months ago so I'm trying to 
> remember myself all the ideas I had at a time so bear with me please if these 
> are still  a bit fuzzy :) However, if you can make use of these I'm quite 
> happy to elaborate on areas that are unclear and obviously put some effort 
> into getting these components to the standards required to put these into a 
> release.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (FUNCTOR-1) [functor] New components: summarize and aggregate

Reply via email to