Hey Sid, thanks for the response! So I think I've got it now! :)
Johannes

On 15 Sep 2015, at 21:44, Siddharth Seth <[email protected]> wrote:
>
> I'd skip the second step. "For DAG, VERTEX I use the #setConf() method to
> forward all properties with the corresponding scope from my main conf
> object". This won't help anything at the moment.
> Other than that, this should work.
>
> InputInitializers and OutputCommitters (as well as Processors, Inputs,
> Outputs) have a user payload field. If using FileInputFormat /
> FileOutputFormat based Inputs and Outputs - a payload is set up for the
> initializer / committer. That will contain a Configuration instance (and
> some more information) serialized to bytes. This Configuration instance would
> require some of the properties as well.
> Regarding the TezRuntimeConfiguration values - these are used when
> configuring the standard Edges, and setAdditionalConfiguration will take care
> of propagating the appropriate config parameters for a specific edge.
>
> On Tue, Sep 15, 2015 at 3:52 AM, Johannes Zillmann <[email protected]> wrote:
> Alright… once again…
>
> So I saw that all the TezConfiguration fields are annotated with a Scope like
> AM, DAG, VERTEX, etc…
> So here is what I intend to do:
> - The TezConfiguration for TezClient.create() will simply contain all
> properties from my main conf object
> - For DAG, VERTEX I use the #setConf() method to forward all properties with
> the corresponding scope from my main conf object
> - For the edgeBuilder I use the #setAdditionalConfiguration() method to
> forward all properties from my main conf object
>
> So does this strategy make sense to you, or am I missing something or getting
> it wrong?
>
> A couple more questions:
> - Regarding your comment on InputInitializers and OutputCommitters… I don't
> see any possibility to set properties on those. I'm using the user payload to
> transfer the conf values which are needed. Am I missing something here?
> - What about the TezRuntimeConfiguration values, do I need to do anything
> special with those?
>
>
> best
> Johannes
>
>
>
>> On 14 Sep 2015, at 20:42, Siddharth Seth <[email protected]> wrote:
>>
>> For Edges, the approach that you took with
>> edgeBuilder.setAdditionalConfiguration will work to set relevant Tez
>> properties for an edge. You should be able to iterate through properties and
>> set the config on the edge - and the relevant ones will be set. (Compression
>> has a specific API which you could use, but using setAdditionalConfiguration
>> will also work).
>> Typically, additional Hadoop properties are also required for Edges - things
>> like the list of compression codecs. edgeConfigs.setAdditionalConfiguration
>> does take care of allowing these properties through.
>>
>> The TezClient needs to be provided a config - which is then made available
>> to the AM. There's not much filtering involved here, and you could set tez.*
>> for this configuration instance. An attempt will be made to pick up
>> YarnConfiguration to connect to the cluster.
>>
>> The same applies for InputInitializers and OutputCommitters. Typically (and
>> unfortunately), you'll end up setting all configs.
>>
>> dag.setConf and vertex.setConf should not be used - I've opened a jira to
>> add docs for these.
>>
>> How do you get the Hadoop configs in this case? Is that part of the
>> Configuration like object?
>>
>>
>>
>> On Mon, Sep 14, 2015 at 9:47 AM, Johannes Zillmann <[email protected]> wrote:
>> Ok,
>>
>> found it. The
>>
>> edgeBuilder.setAdditionalConfiguration(TezRuntimeConfiguration.TEZ_RUNTIME_COMPRESS,
>> "true");
>> does work for me!
>>
>> So let me describe my use case a little bit...
>> Basically I have one Configuration like object on the client side. This is
>> assembled from multiple sources and is the only way a user can set custom Tez
>> properties (we do not use tez-site.xml at all).
>> Then I'm building my DAG with its vertices and edges programmatically.
>> Now, do you have any recommendation for me on how to route the right Tez
>> properties effectively to the corresponding Tez components? (By Tez
>> components I mean things like vertex properties, dag properties, AM
>> properties, edge properties, etc.)
>>
>> Should I simply set all tez.* properties on every component, or is there a
>> smarter way?
>> And what components/properties might I be missing?
>>
>> Any help appreciated!
>> Johannes
>>
>>
>>> On 14 Sep 2015, at 16:57, Johannes Zillmann <[email protected]> wrote:
>>>
>>> Hey guys,
>>>
>>> question. How do I enable tez.runtime.compress programmatically?
>>> When I set this property in the tez-site.xml it is picked up correctly.
>>> But all other options I tried:
>>> - dag.setConf(TezRuntimeConfiguration.TEZ_RUNTIME_COMPRESS, "true");
>>> - mapVertex.setConf(TezRuntimeConfiguration.TEZ_RUNTIME_COMPRESS, "true");
>>> - reduceVertex.setConf(TezRuntimeConfiguration.TEZ_RUNTIME_COMPRESS,
>>> "true");
>>>
>>> do not have any effect! (Checking the log output of the Shuffle class)
>>>
>>> Johannes
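
For anyone reading this thread later: the routing discussed above (iterate over the main conf object and hand each property to a two-argument setter such as the edge builder's setAdditionalConfiguration) might be sketched roughly as below. This is only an illustration under some assumptions, not Tez code: TezPropertyRouter, select and forward are hypothetical names, the main conf object is modeled as a plain Map, and the sample keys are just examples. The prefix filter is optional. According to the replies above, passing everything to the edge builder also works because it lets the relevant properties through, so an empty prefix would do there.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiConsumer;

/** Hypothetical helper for illustration; not part of the Tez API. */
final class TezPropertyRouter {
    private TezPropertyRouter() {}

    /** Returns the entries whose key starts with the prefix, keeping insertion order. */
    static Map<String, String> select(Map<String, String> mainConf, String prefix) {
        Map<String, String> selected = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : mainConf.entrySet()) {
            if (entry.getKey().startsWith(prefix)) {
                selected.put(entry.getKey(), entry.getValue());
            }
        }
        return selected;
    }

    /**
     * Hands every selected (key, value) pair to the setter and returns how many
     * pairs were forwarded. The setter stands in for any two-argument
     * configuration method.
     */
    static int forward(Map<String, String> mainConf, String prefix,
                       BiConsumer<String, String> setter) {
        Map<String, String> selected = select(mainConf, prefix);
        selected.forEach(setter);
        return selected.size();
    }

    public static void main(String[] args) {
        // Stand-in for the client-side "Configuration like object" (assumption).
        Map<String, String> mainConf = new LinkedHashMap<>();
        mainConf.put("tez.runtime.compress", "true");
        mainConf.put("tez.am.resource.memory.mb", "2048");
        mainConf.put("io.compression.codecs", "org.apache.hadoop.io.compress.SnappyCodec");

        // A plain map plays the role of the edge builder in this sketch.
        Map<String, String> edgeConf = new LinkedHashMap<>();
        int forwarded = forward(mainConf, "tez.runtime.", edgeConf::put);
        System.out.println(forwarded + " properties forwarded: " + edgeConf);
    }
}
```

In a real client the setter would presumably be edgeBuilder::setAdditionalConfiguration, assuming the (String, String) form used earlier in this thread. The same loop with an empty prefix could fill the TezConfiguration passed to TezClient.create().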
