[
https://issues.apache.org/jira/browse/AVRO-513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Doug Cutting updated AVRO-513:
------------------------------
Attachment: AVRO-513.patch
> This could be improved to be a copy per reduce group, although it's more
> work.
I suppose once a value's been consumed from the queue it could be returned to a
pool used by the deserializer. We could limit the size of the pool to be the
same size as the queue. Is that what you had in mind?
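
To make the pooling idea concrete, here is a rough sketch of a bounded pool. ValuePool and its methods are hypothetical names, not code from the patch. It assumes the deserializer accepts a possibly-null instance to reuse, so an empty pool simply means a fresh value gets allocated.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical helper, not part of the attached patch: a bounded pool of
// value instances that the deserializer can reuse.
class ValuePool<T> {
  private final BlockingQueue<T> free;

  // capacity would be chosen to match the size of the value queue.
  ValuePool(int capacity) {
    this.free = new ArrayBlockingQueue<>(capacity);
  }

  // Returns a previously released instance, or null if none is available.
  // Assumption: the deserializer allocates a new instance when given null.
  T acquire() {
    return free.poll();
  }

  // Hands a consumed value back for reuse; it is dropped if the pool is full.
  void release(T value) {
    free.offer(value);
  }

  int size() {
    return free.size();
  }
}
```

Because offer() never blocks, a full pool just lets the extra instance be garbage collected, which keeps the number of pooled values bounded by the queue size.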
> The next() method should check to see if there is a next and throw
> NoSuchElementException if not.
Fixed.
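
For reference, the check amounts to something like the following. ArrayValues is a hypothetical array-backed stand-in used only for illustration, not the iterator in the patch.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical stand-in for the patch's iterator, backed by an array.
class ArrayValues<T> implements Iterator<T> {
  private final T[] values;
  private int position = 0;

  ArrayValues(T[] values) {
    this.values = values;
  }

  @Override
  public boolean hasNext() {
    return position < values.length;
  }

  @Override
  public T next() {
    // Honor the Iterator contract: fail loudly when nothing is left.
    if (!hasNext()) {
      throw new NoSuchElementException();
    }
    return values[position++];
  }

  @Override
  public void remove() {
    throw new UnsupportedOperationException();
  }
}
```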
> Rather than polling the queue, you could use the blocking take() method and
> interrupt the thread from close() to signal that there are no more values.
Here's a version that does this. I worry a bit that something else could
interrupt the thread or intercept the InterruptedException, e.g., in the user's
reducer. Is that a well-founded worry? A better approach might be to enqueue a
sentinel value that marks the end of input. Unfortunately the sentinel has to
be of type T, and we don't know how to construct a T.
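
A stripped-down sketch of the take()/interrupt approach, to show the shape of it. InterruptibleConsumer is a hypothetical class, not the code in the patch. The drain after the interrupt is my own assumption about how to avoid losing values that are still queued when close() is called.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical illustration: a consumer thread blocks in take() and treats
// an interrupt sent from close() as "no more values".
class InterruptibleConsumer<T> {
  private final BlockingQueue<T> queue = new ArrayBlockingQueue<>(4);
  private final List<T> seen = Collections.synchronizedList(new ArrayList<T>());
  private Thread worker;

  // Started explicitly rather than from a constructor.
  void start() {
    worker = new Thread(() -> {
      try {
        while (true) {
          seen.add(queue.take()); // blocks until a value arrives
        }
      } catch (InterruptedException endOfInput) {
        // close() interrupted us. take() may throw even when values remain,
        // so drain whatever is still queued before finishing.
        T rest;
        while ((rest = queue.poll()) != null) {
          seen.add(rest);
        }
      }
    });
    worker.start();
  }

  void put(T value) {
    try {
      queue.put(value); // blocks while the queue is full
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while enqueuing", e);
    }
  }

  // Signals end of input and waits for the consumer to finish.
  List<T> close() {
    worker.interrupt();
    try {
      worker.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while closing", e);
    }
    return seen;
  }
}
```

This only works if nothing else interrupts the worker and nothing between take() and the catch block swallows the InterruptedException, which is exactly the concern above.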
> Starting a thread from within a subclass constructor is unsafe.
Fixed.
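
The general pattern, sketched with hypothetical class names (BackgroundTask and Greeter are not from the patch): the base class creates the thread but only starts it from an explicit start() call, after the subclass constructor has finished initializing its fields.

```java
// Hypothetical base class: the thread is created during construction but
// never started there, so run() cannot observe a half-built subclass.
abstract class BackgroundTask implements Runnable {
  private final Thread thread = new Thread(this);

  // Called by the owner once the object is fully constructed.
  final void start() {
    thread.start();
  }

  final void await() {
    try {
      thread.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting", e);
    }
  }
}

// Hypothetical subclass whose run() depends on a field set in its constructor.
class Greeter extends BackgroundTask {
  private final String name;
  volatile String result;

  Greeter(String name) {
    this.name = name; // would still be null if the thread had started in super()
  }

  @Override
  public void run() {
    result = "hello " + name;
  }
}
```

Usage is then `Greeter g = new Greeter("avro"); g.start();`, with the start happening strictly after construction.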
> java mapreduce api should pass iterator of matching objects to reduce
> ---------------------------------------------------------------------
>
> Key: AVRO-513
> URL: https://issues.apache.org/jira/browse/AVRO-513
> Project: Avro
> Issue Type: Improvement
> Components: java
> Reporter: Doug Cutting
> Assignee: Doug Cutting
> Fix For: 1.4.0
>
> Attachments: AVRO-513.patch, AVRO-513.patch
>
>
> The Java mapreduce API added in AVRO-493 requires reducer implementations to
> explicitly detect sequences of matching data.
> Instead, the reduce method might better look something like:
> void reduce(Iterator<IN>, Collector<OUT>);
> Where all equal values are passed in a single call.