Re: spark config params conventions

2014-03-14 Thread Chester Chen
Based on the Typesafe Config maintainer's response, with the latest version of 
Typesafe Config the double quotes are no longer needed for a key like 
spark.speculation, so you don't need code to strip the quotes.
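
For anyone who wants to verify that, here is a minimal sketch against a recent
Typesafe Config release (exactly which release changed the behavior is worth
confirming; the parse and the entrySet() call are standard API, the rest is
just a probe):

    import com.typesafe.config.ConfigFactory
    import scala.collection.JavaConverters._

    object QuotedKeyCheck {
      def main(args: Array[String]): Unit = {
        // Quoting the key makes HOCON treat spark.speculation as a single
        // path element instead of the nested path spark -> speculation.
        val conf = ConfigFactory.parseString(
          "\"spark.speculation\" = true\n" +
          "spark.speculation.interval = 0.5\n")

        // Print the keys exactly as entrySet() reports them; per the note
        // above, recent versions should not require stripping quotes.
        for (entry <- conf.entrySet().asScala) {
          println(entry.getKey + " = " + entry.getValue.unwrapped)
        }
      }
    }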



Chester
Alpine data labs

Sent from my iPhone

On Mar 12, 2014, at 2:50 PM, Aaron Davidson ilike...@gmail.com wrote:

 One solution for typesafe config is to use
 "spark.speculation" = true
 
 Typesafe will recognize the key as a string rather than a path, so the name 
 will actually be "spark.speculation", and you need to handle this 
 contingency when passing the config options to spark (stripping the quotes 
 from the key).
 
 Solving this in Spark itself is a little tricky because there are ~5 such 
 conflicts (spark.serializer, spark.speculation, spark.locality.wait, 
 spark.shuffle.spill, and spark.cleaner.ttl), some of which are used pretty 
 frequently. We could provide aliases for all of these in Spark, but actually 
 deprecating the old ones would affect many users, so we could only do that if 
 enough users would benefit from fully hierarchical config options.
 
 
 
 On Wed, Mar 12, 2014 at 9:24 AM, Mark Hamstra m...@clearstorydata.com wrote:
 That's the whole reason why some of the intended configuration changes were 
 backed out just before the 0.9.0 release.  It's a well-known issue, even if 
 a completely satisfactory solution isn't as well-known and is probably 
 something we should do another iteration on.
 
 
 On Wed, Mar 12, 2014 at 9:10 AM, Koert Kuipers ko...@tresata.com wrote:
 i am reading the spark configuration params from another configuration 
 object (typesafe config) before setting them as system properties.
 
 i noticed typesafe config has trouble with settings like:
 spark.speculation=true
 spark.speculation.interval=0.5
 
 the issue seems to be that if spark.speculation is a container that holds 
 more values inside, then it cannot also be a value itself, i think. so this 
 would work fine:
 spark.speculation.enabled=true
 spark.speculation.interval=0.5
 
 just a heads up. i would probably suggest we avoid this situation.
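
To make the conflict above concrete, here is a minimal sketch of what Typesafe
Config does with the two flat keys. It leans on the HOCON rule that a later
value for the same path replaces an earlier non-object value, so the boolean
is silently lost:

    import com.typesafe.config.{ConfigException, ConfigFactory}

    object ConflictDemo {
      def main(args: Array[String]): Unit = {
        // spark.speculation is first a boolean, then implicitly becomes an
        // object holding interval; the object wins and the boolean is gone.
        val conf = ConfigFactory.parseString(
          "spark.speculation = true\n" +
          "spark.speculation.interval = 0.5\n")

        println(conf.getDouble("spark.speculation.interval"))  // 0.5

        try {
          conf.getBoolean("spark.speculation")  // now an object, not a boolean
        } catch {
          case e: ConfigException.WrongType =>
            println("spark.speculation is no longer a boolean: " + e.getMessage)
        }
      }
    }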
 
 


Re: spark config params conventions

2014-03-12 Thread Aaron Davidson
Should we try to deprecate these types of configs for 1.0.0? We can start
by accepting both and giving a warning if you use the old one, and then
actually remove them in the next minor release. I think
spark.speculation.enabled=true is better than spark.speculation=true,
and if we decide to use typesafe configs again ourselves, this change is
necessary.

We actually don't have to ever complete the deprecation - we can always
accept both spark.speculation and spark.speculation.enabled, and people
just have to use the latter if they want to use typesafe config.
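
A rough sketch of what the accept-both-and-warn step could look like; the
alias table and the resolve helper here are hypothetical, not existing Spark
code:

    object ConfigAliases {
      // Hypothetical mapping from old flat names to the proposed
      // hierarchical ones.
      private val deprecatedAliases = Map(
        "spark.speculation" -> "spark.speculation.enabled")

      // Resolve a user-supplied key, warning when the old spelling is used.
      def resolve(key: String): String = deprecatedAliases.get(key) match {
        case Some(newKey) =>
          System.err.println(
            "WARNING: " + key + " is deprecated, use " + newKey + " instead")
          newKey
        case None => key
      }
    }

Every config lookup would then go through resolve(), so both spellings land
on the same internal key, and the old one can be removed (or kept forever,
per the above) without touching call sites.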


On Wed, Mar 12, 2014 at 9:24 AM, Mark Hamstra m...@clearstorydata.comwrote:

 That's the whole reason why some of the intended configuration changes
 were backed out just before the 0.9.0 release.  It's a well-known issue,
 even if a completely satisfactory solution isn't as well-known and is
 probably something we should do another iteration on.


 On Wed, Mar 12, 2014 at 9:10 AM, Koert Kuipers ko...@tresata.com wrote:

 i am reading the spark configuration params from another configuration
 object (typesafe config) before setting them as system properties.

 i noticed typesafe config has trouble with settings like:
 spark.speculation=true
 spark.speculation.interval=0.5

 the issue seems to be that if spark.speculation is a container that holds
 more values inside, then it cannot also be a value itself, i think. so this
 would work fine:
 spark.speculation.enabled=true
 spark.speculation.interval=0.5

 just a heads up. i would probably suggest we avoid this situation.





Re: spark config params conventions

2014-03-12 Thread yao
+1. I agree to keep the old ones only for backward compatibility purposes.


On Wed, Mar 12, 2014 at 12:38 PM, Evan Chan e...@ooyala.com wrote:

 +1.

 Not just for Typesafe Config: if we want to consider hierarchical
 configs like JSON rather than flat key mappings, the change is
 necessary.  It is also clearer, as sketched below.
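
For illustration, the same settings in a fully hierarchical layout; this is
just a sketch of the shape, not a format Spark accepts today:

    spark {
      speculation {
        enabled  = true
        interval = 0.5
      }
    }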

 On Wed, Mar 12, 2014 at 9:58 AM, Aaron Davidson ilike...@gmail.com
 wrote:
  Should we try to deprecate these types of configs for 1.0.0? We can start
  by accepting both and giving a warning if you use the old one, and then
  actually remove them in the next minor release. I think
  spark.speculation.enabled=true is better than spark.speculation=true,
  and if we decide to use typesafe configs again ourselves, this change is
  necessary.
 
  We actually don't have to ever complete the deprecation - we can always
  accept both spark.speculation and spark.speculation.enabled, and people
  just have to use the latter if they want to use typesafe config.
 
 
  On Wed, Mar 12, 2014 at 9:24 AM, Mark Hamstra m...@clearstorydata.com
 wrote:
 
  That's the whole reason why some of the intended configuration changes
  were backed out just before the 0.9.0 release.  It's a well-known issue,
  even if a completely satisfactory solution isn't as well-known and is
  probably something we should do another iteration on.
 
 
  On Wed, Mar 12, 2014 at 9:10 AM, Koert Kuipers ko...@tresata.com
 wrote:
 
  i am reading the spark configuration params from another configuration
  object (typesafe config) before setting them as system properties.
 
  i noticed typesafe config has trouble with settings like:
  spark.speculation=true
  spark.speculation.interval=0.5
 
  the issue seems to be that if spark.speculation is a container that holds
  more values inside, then it cannot also be a value itself, i think. so this
  would work fine:
  spark.speculation.enabled=true
  spark.speculation.interval=0.5
 
  just a heads up. i would probably suggest we avoid this situation.
 
 
 



 --
 Evan Chan
 Staff Engineer
 e...@ooyala.com  |



Re: spark config params conventions

2014-03-12 Thread Aaron Davidson
One solution for typesafe config is to use
"spark.speculation" = true

Typesafe will recognize the key as a string rather than a path, so the name
will actually be "spark.speculation", and you need to handle this
contingency when passing the config options to spark (stripping the
quotes from the key).
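
A minimal sketch of that stripping step when shoveling entries into system
properties, assuming the entrySet() keys come back wrapped in literal quotes
as described above (the stripping logic is illustrative, not Spark code):

    import com.typesafe.config.ConfigFactory
    import scala.collection.JavaConverters._

    object StripQuotedKeys {
      def main(args: Array[String]): Unit = {
        val conf = ConfigFactory.parseString("\"spark.speculation\" = true")

        for (entry <- conf.entrySet().asScala) {
          // Drop the literal quotes wrapping a key that was quoted in HOCON
          // before handing the setting to Spark via a system property.
          val key = entry.getKey.stripPrefix("\"").stripSuffix("\"")
          System.setProperty(key, entry.getValue.unwrapped.toString)
        }

        println(System.getProperty("spark.speculation"))  // true
      }
    }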

Solving this in Spark itself is a little tricky because there are ~5 such
conflicts (spark.serializer, spark.speculation, spark.locality.wait,
spark.shuffle.spill, and spark.cleaner.ttl), some of which are used pretty
frequently. We could provide aliases for all of these in Spark, but
actually deprecating the old ones would affect many users, so we could only
do that if enough users would benefit from fully hierarchical config
options.



On Wed, Mar 12, 2014 at 9:24 AM, Mark Hamstra m...@clearstorydata.comwrote:

 That's the whole reason why some of the intended configuration changes
 were backed out just before the 0.9.0 release.  It's a well-known issue,
 even if a completely satisfactory solution isn't as well-known and is
 probably something we should do another iteration on.


 On Wed, Mar 12, 2014 at 9:10 AM, Koert Kuipers ko...@tresata.com wrote:

 i am reading the spark configuration params from another configuration
 object (typesafe config) before setting them as system properties.

 i noticed typesafe config has trouble with settings like:
 spark.speculation=true
 spark.speculation.interval=0.5

 the issue seems to be that if spark.speculation is a container that holds
 more values inside, then it cannot also be a value itself, i think. so this
 would work fine:
 spark.speculation.enabled=true
 spark.speculation.interval=0.5

 just a heads up. i would probably suggest we avoid this situation.