[ 
https://issues.apache.org/jira/browse/CRUNCH-449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076655#comment-14076655
 ] 

Micah Whitacre commented on CRUNCH-449:
---------------------------------------

* SeqDoFn should probably have access to a Configuration object for the 
pipeline/target in its execute method.  In the case you give, where someone 
wants to bulk load an HFile Target into HBase, a Configuration for accessing 
the FileSystem would be useful.
* +1 to Javadoc.  Specifically, it should cover when getOutput and execute 
are called relative to each other and whether any execution order is 
guaranteed, plus thread safety/concurrent execution guarantees and whether 
the calls may block.
* Is calling it a DoFn really appropriate?  Currently in Crunch a DoFn operates 
on each element of a PCollection.  This instead essentially forks/joins 
pipeline stages.  I don't have a better name, unfortunately.
* Should SeqDoFn expose access to the collection of labels for targets and 
PCollections, rather than only allowing them to be looked up by name?
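
A rough sketch of what the first and last points could look like.  Every name 
below (Conf, Context, SequentialFn, PLAN_BULK_LOAD) is hypothetical and not 
taken from the attached patches, and Conf is only a plain stand-in for Hadoop's 
Configuration so the example runs without Hadoop or Crunch on the classpath.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch only: none of these types come from the CRUNCH-449 patch.
public class SeqDoFnSketch {

    /** Stand-in for org.apache.hadoop.conf.Configuration (assumption). */
    static final class Conf {
        private final Map<String, String> values = new LinkedHashMap<>();

        Conf set(String key, String value) {
            values.put(key, value);
            return this;
        }

        String get(String key, String defaultValue) {
            return values.getOrDefault(key, defaultValue);
        }
    }

    /** Hypothetical context passed to execute(): configuration plus labelled targets. */
    static final class Context {
        private final Conf conf;
        private final Map<String, String> targetPathsByLabel;

        Context(Conf conf, Map<String, String> targetPathsByLabel) {
            this.conf = conf;
            this.targetPathsByLabel = new LinkedHashMap<>(targetPathsByLabel);
        }

        Conf getConfiguration() {
            return conf;
        }

        /** Exposes every label, so callers need not know the names up front. */
        Set<String> getTargetLabels() {
            return Collections.unmodifiableSet(targetPathsByLabel.keySet());
        }

        String getTargetPath(String label) {
            return targetPathsByLabel.get(label);
        }
    }

    /** Hypothetical shape of the sequential function. */
    interface SequentialFn<T> {
        T execute(Context context);
    }

    /** Example function: describes a bulk load step for each labelled target. */
    static final SequentialFn<List<String>> PLAN_BULK_LOAD = context -> {
        String fs = context.getConfiguration().get("fs.defaultFS", "file:///");
        List<String> steps = new ArrayList<>();
        for (String label : context.getTargetLabels()) {
            steps.add(label + " -> " + fs + context.getTargetPath(label));
        }
        return steps;
    };

    public static void main(String[] args) {
        Conf conf = new Conf().set("fs.defaultFS", "hdfs://namenode:8020");
        Map<String, String> targets = new LinkedHashMap<>();
        targets.put("hfiles", "/tmp/hfiles");
        for (String step : PLAN_BULK_LOAD.execute(new Context(conf, targets))) {
            System.out.println(step);
        }
    }
}
```

The point of the Context type is that execute gets both the configuration 
(needed to reach the FileSystem) and an enumerable set of labels in one place, 
instead of the function having to know each label by name.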


> Add sequentialDo function for injecting arbitrary non-parallel code
> -------------------------------------------------------------------
>
>                 Key: CRUNCH-449
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-449
>             Project: Crunch
>          Issue Type: Bug
>          Components: Core
>            Reporter: Josh Wills
>            Assignee: Josh Wills
>         Attachments: CRUNCH-449.patch, CRUNCH-449b.patch
>
>
> I've been noodling on this one for a while: how to add the ability to execute 
> some code if and only if one or more targets are created, and have that 
> executed code (optionally) return one or more new PCollections as a result. I 
> was thinking that this functionality could be wired into libraries to do 
> things like bulk loading HBase tables or running Sqoop jobs as part of Crunch 
> pipelines automatically.



--
This message was sent by Atlassian JIRA
(v6.2#6252)