[
https://issues.apache.org/jira/browse/CRUNCH-330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Josh Wills updated CRUNCH-330:
------------------------------
Attachment: CRUNCH-330b.patch
I did a riff on this that simplifies things somewhat, based on the assumption
that the TaskIOContext already caches counters for us, so we don't need to
cache them ourselves. I also checked the disable config parameter earlier in
the flow and marked the field as final, so the branch shouldn't add much
overhead.
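
For anyone following along without the patch open, here is a rough sketch of
the pattern described above. It is not the code in CRUNCH-330b.patch. The
property name, the CounterStore class, and the Outputs class are all
hypothetical stand-ins, and a plain Map takes the place of the Hadoop
Configuration. The only point is that the flag is read once in the constructor
into a final field, so the per-record cost is a single boolean check.

```java
import java.util.HashMap;
import java.util.Map;

public class OutputCounterSketch {
    // Hypothetical property name, used only for this illustration.
    static final String DISABLE_COUNTERS_KEY = "example.output.counters.disable";

    // Stand-in for whatever already holds and caches the counters
    // (the role the comment above assumes the task context plays).
    static class CounterStore {
        private final Map<String, Long> counts = new HashMap<>();

        void increment(String name, long delta) {
            counts.merge(name, delta, Long::sum);
        }

        long get(String name) {
            return counts.getOrDefault(name, 0L);
        }

        int size() {
            return counts.size();
        }
    }

    // Stand-in for the class that writes to many named outputs.
    static class Outputs {
        private final CounterStore store;
        // Read once at construction time; final, so it never changes afterwards.
        private final boolean countersDisabled;

        Outputs(Map<String, String> conf, CounterStore store) {
            this.store = store;
            this.countersDisabled =
                Boolean.parseBoolean(conf.getOrDefault(DISABLE_COUNTERS_KEY, "false"));
        }

        void write(String outputName, String record) {
            if (!countersDisabled) {
                // No local cache here: the store is assumed to do its own lookup caching.
                store.increment(outputName, 1L);
            }
            // Writing the record itself is omitted from this sketch.
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(DISABLE_COUNTERS_KEY, "true");

        CounterStore store = new CounterStore();
        Outputs outputs = new Outputs(conf, store);
        for (int i = 0; i < 120; i++) {
            outputs.write("output-" + i, "record");
        }
        System.out.println("counters created with flag set: " + store.size());
    }
}
```

With the flag left unset, the same loop would create one counter per output
name, which is the behaviour the issue below describes.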
> Use of multiple output counters can be disabled in configuration.
> -----------------------------------------------------------------
>
> Key: CRUNCH-330
> URL: https://issues.apache.org/jira/browse/CRUNCH-330
> Project: Crunch
> Issue Type: New Feature
> Components: Core, IO
> Reporter: Dominique Dierickx
> Assignee: Josh Wills
> Priority: Minor
> Attachments: 0001-working-version.patch, CRUNCH-330b.patch
>
>
> We're having some trouble with the number of counters that Crunch creates
> when writing to a lot of different output files (slightly more than 120).
> This wouldn't be an issue if we were able to configure the maximum number
> of allowed counters, but unfortunately, because we are running an older
> version of Hadoop, doing this is not an option, and we have to patch Crunch
> locally to leave out the counters each time we pick up a new release.
> I'm not saying the counters should be removed, but maybe it is an option to
> make them configurable without paying too much of a performance penalty?
> I will implement this functionality and submit a patch.
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)