Github user xuanyuanking commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21648#discussion_r198375472
  
    --- Diff: python/pyspark/sql/conf.py ---
    @@ -64,6 +64,97 @@ def _checkType(self, obj, identifier):
                                 (identifier, obj, type(obj).__name__))
     
     
    +class ConfigEntry(object):
    +    """An entry contains all meta information for a configuration"""
    +
    +    def __init__(self, confKey):
    +        """Create a new ConfigEntry with config key"""
    +        self.confKey = confKey
    +        self.converter = None
    +        self.default = _NoValue
    +
    +    def boolConf(self):
    +        """Designate current config entry is boolean config"""
    +        self.converter = lambda x: str(x).lower() == "true"
    +        return self
    +
    +    def intConf(self):
    +        """Designate current config entry is integer config"""
    +        self.converter = lambda x: int(x)
    +        return self
    +
    +    def stringConf(self):
    +        """Designate current config entry is string config"""
    +        self.converter = lambda x: str(x)
    +        return self
    +
    +    def withDefault(self, default):
    +        """Give a default value for current config entry, the default 
value will be set
    +        to _NoValue when its absent"""
    +        self.default = default
    +        return self
    +
    +    def read(self, ctx):
    +        """Read value from this config entry through sql context"""
    +        return self.converter(ctx.getConf(self.confKey, self.default))
    +
    +
    +class SQLConf(object):
    +    """A class that enables the getting of SQL config parameters in 
pyspark"""
    +
    +    REPL_EAGER_EVAL_ENABLED = ConfigEntry("spark.sql.repl.eagerEval.enabled")\
    +        .boolConf()\
    +        .withDefault("false")
    +
    +    REPL_EAGER_EVAL_MAX_NUM_ROWS = ConfigEntry("spark.sql.repl.eagerEval.maxNumRows")\
    +        .intConf()\
    +        .withDefault("20")
    +
    +    REPL_EAGER_EVAL_TRUNCATE = ConfigEntry("spark.sql.repl.eagerEval.truncate")\
    +        .intConf()\
    +        .withDefault("20")
    +
    +    PANDAS_RESPECT_SESSION_LOCAL_TIMEZONE = \
    +        ConfigEntry("spark.sql.execution.pandas.respectSessionTimeZone")\
    +        .boolConf()
    +
    +    SESSION_LOCAL_TIMEZONE = ConfigEntry("spark.sql.session.timeZone")\
    +        .stringConf()
    +
    +    ARROW_EXECUTION_ENABLED = ConfigEntry("spark.sql.execution.arrow.enabled")\
    +        .boolConf()\
    +        .withDefault("false")
    +
    +    ARROW_FALLBACK_ENABLED = ConfigEntry("spark.sql.execution.arrow.fallback.enabled")\
    +        .boolConf()\
    +        .withDefault("true")
    --- End diff ---
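
    A minimal usage sketch (not part of the diff; `ctx` stands for an existing
    SQLContext exposing getConf(key, default)) of how these entries would be read:

        # Hypothetical usage of the entries defined above.
        eager = SQLConf.REPL_EAGER_EVAL_ENABLED.read(ctx)          # -> bool
        max_rows = SQLConf.REPL_EAGER_EVAL_MAX_NUM_ROWS.read(ctx)  # -> int
        tz = SQLConf.SESSION_LOCAL_TIMEZONE.read(ctx)              # -> str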
    
    I just want to remove the hard-coding, as we discussed in
    https://github.com/apache/spark/pull/21370#discussion_r194276735. As for the
    duplication with the Scala code, my current idea is to keep calling buildConf
    and doc on the Scala side to register the config and keep its doc there,
    while managing the name and default value in the Python SQLConf. May I ask
    for your suggestion? :) Thx.
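
    A rough sketch of how that split would behave (my hypothetical reading, not a
    confirmed design; `ctx` again stands for a live SQLContext): an entry that
    omits withDefault passes _NoValue to getConf, so the lookup is expected to
    fall back to whatever default buildConf registered on the Scala side:

        # Python-side default: falls back to "false" -> False when unset.
        SQLConf.ARROW_EXECUTION_ENABLED.read(ctx)
        # No Python-side default: getConf(key, _NoValue) should resolve the
        # Scala-registered default for spark.sql.session.timeZone.
        SQLConf.SESSION_LOCAL_TIMEZONE.read(ctx)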

