[
https://issues.apache.org/jira/browse/CRUNCH-400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010343#comment-14010343
]
Josh Wills commented on CRUNCH-400:
-----------------------------------
Okay, I understand now. Calling:
    Sets.newHashSet(...)
immediately materializes the results, and there is no way to get the
StageResults associated with that run. The client-side fix would be to alter
the code to run like this:
    Iterable<String> preMaterialized = dataToBeMaterialized.materialize();
    PipelineResult res = pipeline.run();
    Set<String> materializedData = Sets.newHashSet(preMaterialized);
I could also make the MaterializableIterable that gets returned by the
materialize() call aware of its StageResults (I think).
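To make the ordering issue concrete, here is a minimal, self-contained sketch. It does not use Crunch at all: ToyPipeline, runJob(), and the stage name "materialize-stage" are hypothetical stand-ins, and the assumption being modeled is only what is described above, namely that iterating a lazy materialized Iterable before run() triggers a job whose stage information the caller never receives.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class MaterializeOrderSketch {

    // Toy stand-in for a pipeline. Hypothetical; this is NOT the Crunch API.
    static class ToyPipeline {
        private final List<String> source;
        private List<String> output; // null until the job has run

        ToyPipeline(List<String> source) {
            this.source = source;
        }

        // Returns a lazy view. No job runs until the view is iterated.
        Iterable<String> materialize() {
            return () -> {
                if (output == null) {
                    runJob(); // implicit run: the stage name is discarded
                }
                return output.iterator();
            };
        }

        // Explicit run. Reports only the stages executed by this call.
        List<String> run() {
            List<String> stages = new ArrayList<>();
            if (output == null) {
                stages.add(runJob());
            }
            return stages;
        }

        private String runJob() {
            output = new ArrayList<>(source);
            return "materialize-stage";
        }
    }

    static Set<String> toSet(Iterable<String> items) {
        Set<String> result = new HashSet<>();
        for (String item : items) {
            result.add(item);
        }
        return result;
    }

    // Problem case: iterating first runs the job implicitly,
    // so the later explicit run() has nothing left to report.
    static int stagesWhenIteratingFirst() {
        ToyPipeline pipeline = new ToyPipeline(Arrays.asList("a", "b", "a"));
        Set<String> data = toSet(pipeline.materialize());
        return pipeline.run().size();
    }

    // Client-side fix: take the lazy handle, run explicitly, then iterate.
    static int stagesWhenRunningFirst() {
        ToyPipeline pipeline = new ToyPipeline(Arrays.asList("a", "b", "a"));
        Iterable<String> preMaterialized = pipeline.materialize();
        List<String> stages = pipeline.run();
        Set<String> data = toSet(preMaterialized);
        return stages.size();
    }

    public static void main(String[] args) {
        System.out.println("iterate first: " + stagesWhenIteratingFirst() + " stage(s) reported");
        System.out.println("run first: " + stagesWhenRunningFirst() + " stage(s) reported");
    }
}
```

In this toy model the first ordering reports no stages and the second reports one, which mirrors why the reordered client code above lets the caller see the stage in the result of run().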
> Materialized jobs should have stage in PipelineResult
> -----------------------------------------------------
>
> Key: CRUNCH-400
> URL: https://issues.apache.org/jira/browse/CRUNCH-400
> Project: Crunch
> Issue Type: Improvement
> Components: Core
> Affects Versions: 0.9.0, 0.8.2
> Reporter: Micah Whitacre
>
> As brought up in the proposed fix for CRUNCH-272 and on the mailing
> list [1], jobs kicked off by a materialize() call are not tracked in
> the Pipeline's stage results returned by the PipelineResult.
> [1] -
> http://mail-archives.apache.org/mod_mbox/crunch-dev/201405.mbox/%3CCANFazTUAffvTctK5%3DWvW4KyBLSqLCNcke7ZMWwgASu%2BEtkDmyQ%40mail.gmail.com%3E