cross-posting to dev; I believe the discussion should continue there.

as I understand it, any Cascading user who packages a Hadoop job jar with a 
lib folder holding a custom InputFormat (pulled from Maven, etc.) will hit 
this issue.

this also extends to projects built on top of Cascading, including Scalding 
and Cascalog.
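
to make the jar layout concrete, here is a minimal sketch that lists the jars 
nested under lib/ in a Hadoop-style job jar, i.e. the entries that would need 
to be visible to the ApplicationMaster. it uses only java.util.jar; the class 
name, method name, and sample entry names are made up for illustration, and 
nothing here is Tez or Cascading API.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Tez or Cascading.
class JobJarLibs {

    // Returns the names of all jars nested under lib/ in the given job jar.
    static List<String> nestedLibJars(Path jobJar) throws IOException {
        try (JarFile jar = new JarFile(jobJar.toFile())) {
            return jar.stream()
                    .filter(e -> !e.isDirectory())
                    .map(JarEntry::getName)
                    .filter(n -> n.startsWith("lib/") && n.endsWith(".jar"))
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Build a tiny stand-in job jar with one nested library (assumed names).
        Path jobJar = Files.createTempFile("job", ".jar");
        try (JarOutputStream out = new JarOutputStream(Files.newOutputStream(jobJar))) {
            out.putNextEntry(new JarEntry("lib/custom-input-format.jar"));
            out.closeEntry();
            out.putNextEntry(new JarEntry("com/example/Main.class"));
            out.closeEntry();
        }
        System.out.println(nestedLibJars(jobJar));
        Files.delete(jobJar);
    }
}
```

the classes inside those nested jars are the ones the ApplicationMaster cannot 
load when it computes the splits.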

oddly, org.apache.tez.client.AMConfiguration has a field

private Map<String, String> env;

but it is unused.

> On Jun 17, 2015, at 4:32 PM, Andre Kelpe <[email protected]> wrote:
> 
> Hi,
> 
> we are currently running into a problem when a user of Cascading uses a 
> custom InputFormat with Tez. The ApplicationMaster is running into a 
> ClassNotFoundException when calculating the splits, since we are unable to 
> control the environment/classpath visible to the ApplicationMaster. We have 
> a work-around, where the users have to supply a fat-jar to make it work, but 
> we need to be able to support other ways as well. 
> 
> When interacting with the DAG, we are able to pass along a custom 
> environment/classpath, but that API is missing on the TezClient, causing the 
> AppMaster to fail when the user is using classic Hadoop-style jars (embedded 
> lib directory).
> 
> In order to get Lingual, our SQL layer on top of Cascading, to work 
> correctly, we need to supply the environment more dynamically than with a 
> single fat jar, so it would be great if the API could be extended to do that.
> 
> I have opened https://issues.apache.org/jira/browse/TEZ-2563
> 
> Thanks!
> 
> - André
> 
> -- 
> André Kelpe
> [email protected]
> http://concurrentinc.com
--
Chris K Wensel
[email protected]



