[ 
https://issues.apache.org/jira/browse/AVRO-513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Cutting updated AVRO-513:
------------------------------

    Attachment: AVRO-513.patch

> I was thinking that you only need to copy at the beginning of the group, 
> since you can compare subsequent values to the copy, until they differ, at 
> which point you make a new copy.

But they might differ in fields that are not compared, e.g., count.  So all 
objects in the queue must be unique.
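
To make the point about uncompared fields concrete, here is a minimal sketch. The WordCount type, its fields and the comparator are hypothetical and not taken from the patch; the only assumption is that ordering looks at 'word' and ignores 'count'.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical record: only 'word' takes part in ordering, 'count' does not.
final class WordCount {
  final String word;
  final long count;
  WordCount(String word, long count) { this.word = word; this.count = count; }
  WordCount copy() { return new WordCount(word, count); }
}

final class GroupCopyDemo {
  // Stand-in for a sort order that ignores some fields.
  static final Comparator<WordCount> BY_WORD = (a, b) -> a.word.compareTo(b.word);

  // Three values that compare equal but carry different counts.
  static List<WordCount> sample() {
    return Arrays.asList(new WordCount("a", 1), new WordCount("a", 2), new WordCount("a", 3));
  }

  // Every value gets its own copy before it is queued.
  static long sumWithCopyPerValue(List<WordCount> input) {
    List<WordCount> queue = new ArrayList<>();
    for (WordCount in : input) {
      queue.add(in.copy());
    }
    long sum = 0;
    for (WordCount wc : queue) sum += wc.count;
    return sum;
  }

  // Only the first value of a group is copied; later values that compare
  // equal reuse that copy, so their own counts are lost.
  static long sumWithCopyPerGroup(List<WordCount> input) {
    List<WordCount> queue = new ArrayList<>();
    WordCount groupCopy = null;
    for (WordCount in : input) {
      if (groupCopy == null || BY_WORD.compare(groupCopy, in) != 0) {
        groupCopy = in.copy();
      }
      queue.add(groupCopy);
    }
    long sum = 0;
    for (WordCount wc : queue) sum += wc.count;
    return sum;
  }
}
```

With the sample input, the per-value version sums all three counts, while the per-group version adds the first count three times.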

> I think it's possible that the interrupt occurs between the check on "done" 
> and the call to take(), so the call to take() would go ahead and cause a 
> deadlock.

interrupt() sets the thread's interrupt flag, and the flag remains set until a 
method that throws InterruptedException is called.  So whether interrupt() is 
called just before take() or while take() is blocked, that's fine, since take() 
will throw either way when the queue is empty.
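
A small sketch of that behaviour, using only java.util.concurrent and not code from the patch (the class and method names are made up for illustration): the interrupt is delivered before take() is even called, and take() on an empty LinkedBlockingQueue still throws instead of blocking.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class InterruptFlagDemo {
  // Returns true if take() on an empty queue throws InterruptedException
  // even though the interrupt arrived before take() was called.
  static boolean takeThrowsAfterEarlierInterrupt() {
    BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    Thread.currentThread().interrupt();   // flag is set; nothing is blocked yet
    try {
      queue.take();                       // empty queue: would block without the flag
      return false;
    } catch (InterruptedException e) {
      // Throwing InterruptedException clears the flag again.
      return !Thread.currentThread().isInterrupted();
    }
  }

  public static void main(String[] args) {
    System.out.println(takeThrowsAfterEarlierInterrupt());
  }
}
```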

The risk that user code swallows the InterruptedException can be fixed by 
setting 'done=true' before calling interrupt().  Then if user code swallows the 
interrupt and the queue is empty, we'd never call queue.take(), since, by 
definition, the thread wasn't between checking 'done' and calling 'take()' when 
it got the interrupt.  Does that sound right?
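
Roughly what I have in mind, as a sketch rather than the code in the patch. The class, the consume()/finish() methods and the sleeping userCode() are all hypothetical stand-ins; the only point is the ordering of 'done = true' and interrupt(), and the sketch does not try to drain values still in the queue.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class DoneBeforeInterruptDemo {
  private final BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
  private volatile boolean done = false;

  // Consumer loop: 'done' is re-checked before every take().
  void consume() {
    while (!done) {
      try {
        userCode(queue.take());
      } catch (InterruptedException e) {
        // Interrupted in or just before take(); the loop re-checks 'done'.
      }
    }
  }

  // Stand-in for user code that swallows an interrupt.
  void userCode(Integer value) {
    try {
      Thread.sleep(200);
    } catch (InterruptedException swallowed) {
      // Deliberately ignored, as badly behaved user code might do.
    }
  }

  // Producer side: set the flag first, then interrupt.
  void finish(Thread consumer) {
    done = true;
    consumer.interrupt();
  }

  // Returns true if the consumer thread exits after finish() is called.
  static boolean consumerStops() {
    try {
      DoneBeforeInterruptDemo demo = new DoneBeforeInterruptDemo();
      Thread consumer = new Thread(demo::consume);
      consumer.start();
      demo.queue.put(1);
      Thread.sleep(50);          // let the consumer get into userCode()
      demo.finish(consumer);
      consumer.join(2000);
      return !consumer.isAlive();
    } catch (InterruptedException e) {
      return false;
    }
  }
}
```

If the interrupt lands inside userCode() and is swallowed, the next check of 'done' ends the loop.  If it lands between the check and take(), the flag is still set and take() throws, after which the check ends the loop.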

> java mapreduce api should pass iterator of matching objects to reduce
> ---------------------------------------------------------------------
>
>                 Key: AVRO-513
>                 URL: https://issues.apache.org/jira/browse/AVRO-513
>             Project: Avro
>          Issue Type: Improvement
>          Components: java
>            Reporter: Doug Cutting
>            Assignee: Doug Cutting
>             Fix For: 1.4.0
>
>         Attachments: AVRO-513.patch, AVRO-513.patch, AVRO-513.patch
>
>
> The Java mapreduce API added in AVRO-493 requires reducer implementations to 
> explicitly detect sequences of matching data.
> Rather the reduce method might better look something like:
>    void reduce(Iterator<IN>, Collector<OUT>);
> Where all equal values are passed in a single call.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
