[ https://issues.apache.org/jira/browse/CRUNCH-449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14082822#comment-14082822 ]
Gabriel Reid commented on CRUNCH-449:
-------------------------------------

Looks good, all the added javadoc makes it way more clear how things work.

I was playing around with this a bit, and found that setting a dependency on an input collection doesn't seem to work, because it tries to cast it to a SourceTarget in PipelineCallable.dependsOn. The actual stack trace I got was
{code}
java.lang.ClassCastException: org.apache.crunch.io.text.TextFileSource cannot be cast to org.apache.crunch.SourceTarget
	at org.apache.crunch.PipelineCallable.dependsOn(PipelineCallable.java:152)
	at org.apache.crunch.PipelineCallableIT.testWithTargetDependencies(PipelineCallableIT.java:126)
{code}

I guess this is kind of a weird case (depending on an input collection), but it would be good to deal with it somehow instead of throwing the ClassCastException.

Another thing I noticed is that there is no default name or message on PipelineCallable, so when I returned {{Status.FAILURE}} from the {{call()}} method, the following was logged:
{code}
1 callable failure(s) occurred: null: null
{code}

Maybe using the toString() of the PipelineCallable as the default name would be good, and a message like "No message available, please implement PipelineCallable.getMessage()" could be used as the default message.

Apart from those couple of things, this is good to go as far as I'm concerned.

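To make the two suggestions above concrete, here is a rough, self-contained sketch. It does not use the real Crunch classes: {{Source}}, {{Target}}, {{SourceTarget}}, {{SketchCallable}}, {{dependencyCount()}} and {{describeFailures()}} are hypothetical stand-ins invented for illustration, and the actual patch may well handle this differently. The sketch assumes a read-only input can simply be ignored as a dependency, since it already exists; throwing an IllegalArgumentException with a descriptive message would be another reasonable choice.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class CallableDefaultsSketch {

    // Hypothetical stand-ins for the Crunch types; NOT the real interfaces.
    interface Source {}
    interface Target {}
    interface SourceTarget extends Source, Target {}

    /** A read-only input, analogous to the TextFileSource in the stack trace. */
    static final class ReadOnlySource implements Source {}

    /** An input that is also a Target, so it is something that can be waited on. */
    static final class ReadWriteSource implements SourceTarget {}

    /** Hypothetical simplification of a callable with default name and message. */
    abstract static class SketchCallable {
        private final List<Target> dependencies = new ArrayList<>();

        /** Default name: fall back to toString() instead of null. */
        String getName() {
            return toString();
        }

        /** Default message: a hint instead of null. */
        String getMessage() {
            return "No message available, please implement PipelineCallable.getMessage()";
        }

        /** Checks the type before casting, so a read-only input cannot trigger a ClassCastException. */
        SketchCallable dependsOn(Source source) {
            if (source instanceof Target) {
                dependencies.add((Target) source);
            }
            // Assumption: a read-only input already exists, so there is nothing to wait for.
            return this;
        }

        int dependencyCount() {
            return dependencies.size();
        }
    }

    /** Formats failures in the same shape as the log line quoted in the comment. */
    static String describeFailures(List<? extends SketchCallable> failed) {
        StringBuilder sb = new StringBuilder();
        sb.append(failed.size()).append(" callable failure(s) occurred:");
        for (SketchCallable c : failed) {
            sb.append(' ').append(c.getName()).append(": ").append(c.getMessage());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        SketchCallable callable = new SketchCallable() {
            @Override
            public String toString() {
                return "bulk-load";
            }
        };
        callable.dependsOn(new ReadOnlySource()).dependsOn(new ReadWriteSource());
        System.out.println(callable.dependencyCount());
        System.out.println(describeFailures(Collections.singletonList(callable)));
    }
}
```

With defaults like these, a callable that overrides neither method would be logged with its toString() and the hint message rather than {{null: null}}.
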
> Add sequentialDo function for injecting arbitrary non-parallel code
> -------------------------------------------------------------------
>
>                 Key: CRUNCH-449
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-449
>             Project: Crunch
>          Issue Type: Bug
>          Components: Core
>            Reporter: Josh Wills
>            Assignee: Josh Wills
>         Attachments: CRUNCH-449.patch, CRUNCH-449b.patch, CRUNCH-449c.patch, CRUNCH-449d.patch
>
>
> I've been noodling on this one for awhile: how to add the ability to execute
> some code if and only if one or more targets are created, and have that
> executed code (optionally) return one or more new PCollections as a result. I
> was thinking that this functionality could be wired in to libraries to do
> things like bulk loading HBase tables or running Sqoop jobs as part of Crunch
> pipelines automatically.

--
This message was sent by Atlassian JIRA
(v6.2#6252)