[ 
https://issues.apache.org/jira/browse/PIG-2364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13150251#comment-13150251
 ] 

Dmitriy V. Ryaboy commented on PIG-2364:
----------------------------------------

Thanks Jonathan!

A few notes:

* The name of this function is non-intuitive. I am not sure what "window" 
means here (I would expect some kind of sliding window). What this really 
computes is something like a discrete overlap, right? That is, for every 
discrete value in a range, count how many of the argument ranges contain it.
* It's unclear what callers should pass in when you talk about 'dates'. State 
explicitly in the javadoc that the values are longs.
* Tuple / Bag factories -- make them final?
* boolean init -- convention is to start booleans with "is" (isInitialized?)
* document that the input set is a DataBag (not just set)
* use @OutputSchema
* spaces around operators ("nextEnd=pq.poll();nextEnd!=null;nextEnd=pq.poll()")
* This function generates output significantly larger than its input, and 
requires holding the priority queue in memory (nice job on making this 
accumulative from the get-go, by the way). Memory will be an issue. Warn 
people not to do crazy things?
* Bag<Tuple<Long, Long>> begs for PIG-2359... hopefully that will be a real 
thing soon. #todo?
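
To make the "discrete overlap" reading concrete, here is a minimal sketch in 
plain Java. It is not the code from window.patch: it uses a sorted difference 
map rather than the patch's priority queue, it ignores the optional lag 
parameter, and the class and method names (DiscreteOverlap, countOverlaps) are 
made up for illustration. It assumes inclusive ranges of longs with 
start <= end, and an end below Long.MAX_VALUE.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not the UDF from the patch.
public class DiscreteOverlap {

    /**
     * For every discrete time covered by at least one inclusive
     * [start, end] range, emits a {time, count} pair where count is the
     * number of ranges that contain that time. Pairs are sorted by time.
     */
    public static List<long[]> countOverlaps(long[][] ranges) {
        // Difference map: +1 where a range starts, -1 one past where it ends.
        TreeMap<Long, Integer> delta = new TreeMap<>();
        for (long[] r : ranges) {
            delta.merge(r[0], 1, Integer::sum);
            delta.merge(r[1] + 1, -1, Integer::sum);
        }

        List<long[]> out = new ArrayList<>();
        int active = 0;   // number of ranges covering the current stretch
        Long prev = null; // previous boundary seen in the sweep
        for (Map.Entry<Long, Integer> e : delta.entrySet()) {
            if (prev != null && active > 0) {
                // Every time in [prev, boundary) has the same count.
                for (long t = prev; t < e.getKey(); t++) {
                    out.add(new long[] {t, active});
                }
            }
            active += e.getValue();
            prev = e.getKey();
        }
        return out;
    }
}
```

Calling countOverlaps(new long[][] {{1, 3}, {2, 2}, {3, 4}}) yields the pairs 
(1,1), (2,2), (3,2), (4,1), matching the example in the issue description. The 
output has one entry per covered time value, so a handful of wide ranges can 
produce a very large result, which is the memory concern raised above.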

                
> Piggybank function which turns ranges of times into a time series of 
> time,occurrences pairs
> -------------------------------------------------------------------------------------------
>
>                 Key: PIG-2364
>                 URL: https://issues.apache.org/jira/browse/PIG-2364
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Jonathan Coveney
>         Attachments: window.patch
>
>
> This came up on the listserv. Someone wanted a way to turn (start,end) ranges 
> like {(1,3),(2,2),(3,4)} into a time series giving, for each time, the number 
> of ranges which include it, i.e. {(1,1),(2,2),(3,2),(4,1)}, with an optional 
> lag parameter. This patch does that, and I included tests. Maybe there is a 
> better name, or this should be in a different package?


        
