Hey Sid,

thanks for the response!
So i think i got it now! :)

Johannes


On 15 Sep 2015, at 21:44, Siddharth Seth <[email protected]> wrote:
> 
> I'd skip the second step. "For DAG, VERTEX i use the #setConf() method to 
> forward all properties with the corresponding scope from my main conf 
> object". This won't help anything at the moment.
> Other than that, this should work.
> 
> InputInitializers and OutputCommitters (as well as Processors, Inputs, 
> Outputs) have a user payload field. If using FileInputFormat / 
> FileOutputFormat based Inputs and Outputs - a payload is set up for the 
> initializer / committer. That will contain a Configuration instance (and 
> some more information) serialized to bytes. This Configuration instance would 
> require some of the properties as well.
> Regarding the TezRuntimeConfiguration values - these are used when 
> configuring the standard Edges, and setAdditionalConfiguration will take care 
> of propagating the appropriate config parameters for a specific edge.
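
To make the payload point above concrete, here is a minimal sketch of a key/value configuration being serialized to bytes on the client and rebuilt on the other side. It uses only the JDK: `java.util.Properties` stands in for the Hadoop Configuration, and the class and method names (`PayloadSketch`, `toPayload`, `fromPayload`) are hypothetical, not Tez API. Tez has its own helpers for this (I believe `TezUtils.createUserPayloadFromConf`, but please check against your version), and its byte format is not the one shown here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.util.Properties;

// Hypothetical stand-in for a "Configuration serialized to bytes" payload.
// Plain JDK only; no Tez or Hadoop classes are used.
final class PayloadSketch {

    // Client side: turn the key/value pairs into bytes for a payload.
    static ByteBuffer toPayload(Properties conf) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            conf.store(out, null);
            return ByteBuffer.wrap(out.toByteArray());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Initializer/committer side: rebuild the key/value pairs from the bytes.
    static Properties fromPayload(ByteBuffer payload) {
        try {
            byte[] bytes = new byte[payload.remaining()];
            payload.duplicate().get(bytes); // duplicate() leaves the caller's position alone
            Properties conf = new Properties();
            conf.load(new ByteArrayInputStream(bytes));
            return conf;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty("tez.runtime.compress", "true");
        conf.setProperty("my.app.input.path", "/data/in"); // hypothetical application property
        Properties restored = fromPayload(toPayload(conf));
        System.out.println(restored.getProperty("tez.runtime.compress"));
    }
}
```

The point of the sketch is only that whatever the initializer or committer needs has to travel inside the payload, since there is no separate setter for properties on those components.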
> 
> On Tue, Sep 15, 2015 at 3:52 AM, Johannes Zillmann <[email protected]> wrote:
> Alright… once again…
> 
> So i saw that all the TezConfiguration fields are annotated with a Scope like 
> AM, DAG, VERTEX, etc…
> So here is what i intend to do:
> - The TezConfiguration for TezClient.create() will simply contain all 
> properties from my main conf object
> - For DAG, VERTEX i use the #setConf() method to forward all properties with 
> the corresponding scope from my main conf object
> - For the edgeBuilder i use the #setAdditionalConfiguration() method to 
> forward all properties from my main conf object
> 
> So does this strategy make sense to you or am i missing something or getting 
> it wrong ?
> 
> A couple more questions:
> - Regarding your comment on InputInitializers and OutputCommitters… I don’t 
> see any possibility to set properties on those. I’m using the user payload to 
> transfer the conf values which are needed. Am i missing something here ?
> - What about the TezRuntimeConfiguration values, do i need to do anything 
> special with that ?
> 
> 
> best
> Johannes
>  
> 
> 
>> On 14 Sep 2015, at 20:42, Siddharth Seth <[email protected]> wrote:
>> 
>> For Edges, the approach that you took with 
>> edgeBuilder.setAdditionalConfiguration will work to set relevant Tez 
>> properties for an edge. You should be able to iterate through properties and 
>> set the config on the edge - and the relevant ones will be set. (Compression 
>> has a specific API which you could use, but using setAdditionalConfiguration 
>> will also work).
>> Typically, additional Hadoop properties are also required for Edges - things 
>> like the list of compression codecs. edgeConfigs.setAdditionalConfiguration 
>> does take care of allowing these properties through.
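
A minimal sketch of the "iterate through properties and set the config on the edge" idea described above. To keep it runnable without Tez on the classpath, the edge builder is replaced by a plain `BiConsumer<String, String>`; the class and method names (`EdgeConfigSketch`, `forwardAll`) are hypothetical. In real code the setter would presumably be the `edgeBuilder.setAdditionalConfiguration(key, value)` call mentioned in this thread, which, as described above, keeps the relevant keys.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiConsumer;

// Hypothetical helper: forwards every property of the main conf to an edge setter.
final class EdgeConfigSketch {

    // Calls edgeSetter once per property and returns how many were forwarded.
    // No filtering happens here; the assumption (taken from the thread) is that
    // the real edge builder decides which keys are relevant.
    static int forwardAll(Map<String, String> mainConf,
                          BiConsumer<String, String> edgeSetter) {
        int forwarded = 0;
        for (Map.Entry<String, String> entry : mainConf.entrySet()) {
            edgeSetter.accept(entry.getKey(), entry.getValue());
            forwarded++;
        }
        return forwarded;
    }

    public static void main(String[] args) {
        Map<String, String> mainConf = new LinkedHashMap<>();
        mainConf.put("tez.runtime.compress", "true");
        mainConf.put("io.compression.codecs",
                "org.apache.hadoop.io.compress.DefaultCodec");

        // Stand-in for the edge builder: just records what it was given.
        Map<String, String> recorded = new LinkedHashMap<>();
        forwardAll(mainConf, recorded::put);
        System.out.println(recorded);
    }
}
```

Passing the setter as a function keeps the client-side conf object independent of any Tez type, which fits the single Configuration-like object described further down the thread.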
>> 
>> The TezClient needs to be provided a config - which is then made available 
>> to the AM. There's not much filtering involved here, and you could set tez.* 
>> for this configuration instance. An attempt will be made to pick up 
>> YarnConfiguration to connect to the cluster.
>> 
>> The same applies for InputInitializers and OutputCommitters. Typically (and 
>> unfortunately), you'll end up setting all configs.
>> 
>> dag.setConf, and vertex.setConf should not be used - I've opened a jira to 
>> add docs for these.
>> 
>> How do you get the Hadoop configs in this case ? Is that part of the 
>> Configuration like object ?
>> 
>> 
>> 
>> On Mon, Sep 14, 2015 at 9:47 AM, Johannes Zillmann <[email protected]> wrote:
>> Ok, 
>> 
>> found it. The 
>>   edgeBuilder.setAdditionalConfiguration(
>>       TezRuntimeConfiguration.TEZ_RUNTIME_COMPRESS, "true");
>> does work for me!
>> 
>> So let me describe my use case a little bit...
>> Basically i have one Configuration-like object on the client side. This is 
>> assembled from multiple sources and is the only way a user can set custom Tez 
>> properties (we do not use tez-site.xml at all). 
>> Then i’m building my DAG with its vertices and edges programmatically. 
>> Now, do you have any recommendation for me on how to route the right Tez 
>> properties effectively to the corresponding Tez components ? (by Tez 
>> components i mean things like vertex properties, dag properties, AM 
>> properties, edge properties, etc.)
>> 
>> Should i simply set all tez.* properties on every component or is there a 
>> smarter way ?
>> And what components/properties might i be missing ?
>> 
>> Any help appreciated!
>> Johannes
>> 
>> 
>>> On 14 Sep 2015, at 16:57, Johannes Zillmann <[email protected]> wrote:
>>> 
>>> Hey guys,
>>> 
>>> question. How do i enable tez.runtime.compress programmatically ?
>>> When i set this property in the tez-site.xml it is picked up correctly.
>>> But all other options i tried:
>>> - dag.setConf(TezRuntimeConfiguration.TEZ_RUNTIME_COMPRESS, "true");
>>> - mapVertex.setConf(TezRuntimeConfiguration.TEZ_RUNTIME_COMPRESS, "true");
>>> - reduceVertex.setConf(TezRuntimeConfiguration.TEZ_RUNTIME_COMPRESS, 
>>> "true");
>>> 
>>> do not have any effect! (Checking the log output of the Shuffle class)
>>> 
>>> Johannes
>> 
>> 
> 
> 
