[incubator-superset] branch master updated: [docs] Add SSL config options for postgres (#9767)

dpgaspar Sun, 10 May 2020 11:38:09 -0700

This is an automated email from the ASF dual-hosted git repository.

dpgaspar pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-superset.git



The following commit(s) were added to refs/heads/master by this push:
     new 24db9ab  [docs] Add SSL config options for postgres (#9767)
24db9ab is described below

commit 24db9ab088d0140309cc0d57ce87e52a912fa93f
Author: ʈᵃᵢ <[email protected]>
AuthorDate: Sun May 10 11:37:13 2020 -0700

    [docs] Add SSL config options for postgres (#9767)
    
    * [docs] add postgres SSL documentation
    
    * move caching section to where it makes more sense
---
 docs/installation.rst | 310 ++++++++++++++++++++++++++++----------------------
 1 file changed, 171 insertions(+), 139 deletions(-)

diff --git a/docs/installation.rst b/docs/installation.rst
index c6a7c37..881d2e6 100644
--- a/docs/installation.rst
+++ b/docs/installation.rst
@@ -333,6 +333,144 @@ auth postback endpoint, you can add them to 
*WTF_CSRF_EXEMPT_LIST*
 
 .. _ref_database_deps:
 
+Caching
+-------
+
+Superset uses `Flask-Cache <https://pythonhosted.org/Flask-Cache/>`_ for
+caching purpose. Configuring your caching backend is as easy as providing
+a ``CACHE_CONFIG``, constant in your ``superset_config.py`` that
+complies with the Flask-Cache specifications.
+
+Flask-Cache supports multiple caching backends (Redis, Memcached,
+SimpleCache (in-memory), or the local filesystem). If you are going to use
+Memcached please use the `pylibmc` client library as `python-memcached` does
+not handle storing binary data correctly. If you use Redis, please install
+the `redis <https://pypi.python.org/pypi/redis>`_ Python package: ::
+
+    pip install redis
+
+For setting your timeouts, this is done in the Superset metadata and goes
+up the "timeout searchpath", from your slice configuration, to your
+data source's configuration, to your database's and ultimately falls back
+into your global default defined in ``CACHE_CONFIG``.
+
+.. code-block:: python
+
+    CACHE_CONFIG = {
+        'CACHE_TYPE': 'redis',
+        'CACHE_DEFAULT_TIMEOUT': 60 * 60 * 24, # 1 day default (in secs)
+        'CACHE_KEY_PREFIX': 'superset_results',
+        'CACHE_REDIS_URL': 'redis://localhost:6379/0',
+    }
+
+It is also possible to pass a custom cache initialization function in the
+config to handle additional caching use cases. The function must return an
+object that is compatible with the `Flask-Cache 
<https://pythonhosted.org/Flask-Cache/>`_ API.
+
+.. code-block:: python
+
+    from custom_caching import CustomCache
+
+    def init_cache(app):
+        """Takes an app instance and returns a custom cache backend"""
+        config = {
+            'CACHE_DEFAULT_TIMEOUT': 60 * 60 * 24, # 1 day default (in secs)
+            'CACHE_KEY_PREFIX': 'superset_results',
+        }
+        return CustomCache(app, config)
+
+    CACHE_CONFIG = init_cache
+
+Superset has a Celery task that will periodically warm up the cache based on
+different strategies. To use it, add the following to the `CELERYBEAT_SCHEDULE`
+section in `config.py`:
+
+.. code-block:: python
+
+    CELERYBEAT_SCHEDULE = {
+        'cache-warmup-hourly': {
+            'task': 'cache-warmup',
+            'schedule': crontab(minute=0, hour='*'),  # hourly
+            'kwargs': {
+                'strategy_name': 'top_n_dashboards',
+                'top_n': 5,
+                'since': '7 days ago',
+            },
+        },
+    }
+
+This will cache all the charts in the top 5 most popular dashboards every hour.
+For other strategies, check the `superset/tasks/cache.py` file.
+
+Caching Thumbnails
+------------------
+
+This is an optional feature that can be turned on by activating it's feature 
flag on config:
+
+.. code-block:: python
+
+    FEATURE_FLAGS = {
+        "THUMBNAILS": True,
+        "THUMBNAILS_SQLA_LISTENERS": True,
+    }
+
+
+For this feature you will need a cache system and celery workers. All 
thumbnails are store on cache and are processed
+asynchronously by the workers.
+
+An example config where images are stored on S3 could be:
+
+.. code-block:: python
+
+    from flask import Flask
+    from s3cache.s3cache import S3Cache
+
+    ...
+
+    class CeleryConfig(object):
+        BROKER_URL = "redis://localhost:6379/0"
+        CELERY_IMPORTS = ("superset.sql_lab", "superset.tasks", 
"superset.tasks.thumbnails")
+        CELERY_RESULT_BACKEND = "redis://localhost:6379/0"
+        CELERYD_PREFETCH_MULTIPLIER = 10
+        CELERY_ACKS_LATE = True
+
+
+    CELERY_CONFIG = CeleryConfig
+
+    def init_thumbnail_cache(app: Flask) -> S3Cache:
+        return S3Cache("bucket_name", 'thumbs_cache/')
+
+
+    THUMBNAIL_CACHE_CONFIG = init_thumbnail_cache
+    # Async selenium thumbnail task will use the following user
+    THUMBNAIL_SELENIUM_USER = "Admin"
+
+Using the above example cache keys for dashboards will be 
`superset_thumb__dashboard__{ID}`
+
+You can override the base URL for selenium using:
+
+.. code-block:: python
+
+    WEBDRIVER_BASEURL = "https://superset.company.com";
+
+
+Additional selenium web drive config can be set using `WEBDRIVER_CONFIGURATION`
+
+You can implement a custom function to authenticate selenium, the default uses 
flask-login session cookie.
+An example of a custom function signature:
+
+.. code-block:: python
+
+    def auth_driver(driver: WebDriver, user: "User") -> WebDriver:
+        pass
+
+
+Then on config:
+
+.. code-block:: python
+
+    WEBDRIVER_AUTH_FUNC = auth_driver
+
 Database dependencies
 ---------------------
 
@@ -424,8 +562,40 @@ The connection string for PostgreSQL looks like this ::
 
     postgresql+psycopg2://{username}:{password}@{host}:{port}/{database}
 
-See `psycopg2 SQLAlchemy 
<https://docs.sqlalchemy.org/en/13/dialects/postgresql.html#module-sqlalchemy.dialects.postgresql.psycopg2>`_.
+Additional  may be configured via the ``extra`` field under ``engine_params``. 
+If you would like to enable mutual SSL here is a sample configuration:
+
+.. code-block:: json
+
+    {
+        "metadata_params": {},
+        "engine_params": {
+              "connect_args":{
+                    "sslmode": "require",
+                    "sslrootcert": "/path/to/root_cert"
+            }
+         }
+    }
+
+If the key ``sslrootcert`` is present the server's certificate will be 
verified to be signed by the same Certificate Authority (CA). 
 
+If you would like to enable mutual SSL here is a sample configuration:
+
+.. code-block:: json
+
+    {
+        "metadata_params": {},
+        "engine_params": {
+              "connect_args":{
+                    "sslmode": "require",
+                    "sslcert": "/path/to/client_cert",
+                    "sslkey": "/path/to/client_key",
+                    "sslrootcert": "/path/to/root_cert"
+            }
+         }
+    }
+
+See `psycopg2 SQLAlchemy 
<https://docs.sqlalchemy.org/en/13/dialects/postgresql.html#module-sqlalchemy.dialects.postgresql.psycopg2>`_.
 
 Hana
 ------------
@@ -588,144 +758,6 @@ If you are using JDBC to connect to Drill, the connection 
string looks like this
 For a complete tutorial about how to use Apache Drill with Superset, see this 
tutorial:
 `Visualize Anything with Superset and Drill 
<http://thedataist.com/visualize-anything-with-superset-and-drill/>`_
 
-Caching
--------
-
-Superset uses `Flask-Cache <https://pythonhosted.org/Flask-Cache/>`_ for
-caching purpose. Configuring your caching backend is as easy as providing
-a ``CACHE_CONFIG``, constant in your ``superset_config.py`` that
-complies with the Flask-Cache specifications.
-
-Flask-Cache supports multiple caching backends (Redis, Memcached,
-SimpleCache (in-memory), or the local filesystem). If you are going to use
-Memcached please use the `pylibmc` client library as `python-memcached` does
-not handle storing binary data correctly. If you use Redis, please install
-the `redis <https://pypi.python.org/pypi/redis>`_ Python package: ::
-
-    pip install redis
-
-For setting your timeouts, this is done in the Superset metadata and goes
-up the "timeout searchpath", from your slice configuration, to your
-data source's configuration, to your database's and ultimately falls back
-into your global default defined in ``CACHE_CONFIG``.
-
-.. code-block:: python
-
-    CACHE_CONFIG = {
-        'CACHE_TYPE': 'redis',
-        'CACHE_DEFAULT_TIMEOUT': 60 * 60 * 24, # 1 day default (in secs)
-        'CACHE_KEY_PREFIX': 'superset_results',
-        'CACHE_REDIS_URL': 'redis://localhost:6379/0',
-    }
-
-It is also possible to pass a custom cache initialization function in the
-config to handle additional caching use cases. The function must return an
-object that is compatible with the `Flask-Cache 
<https://pythonhosted.org/Flask-Cache/>`_ API.
-
-.. code-block:: python
-
-    from custom_caching import CustomCache
-
-    def init_cache(app):
-        """Takes an app instance and returns a custom cache backend"""
-        config = {
-            'CACHE_DEFAULT_TIMEOUT': 60 * 60 * 24, # 1 day default (in secs)
-            'CACHE_KEY_PREFIX': 'superset_results',
-        }
-        return CustomCache(app, config)
-
-    CACHE_CONFIG = init_cache
-
-Superset has a Celery task that will periodically warm up the cache based on
-different strategies. To use it, add the following to the `CELERYBEAT_SCHEDULE`
-section in `config.py`:
-
-.. code-block:: python
-
-    CELERYBEAT_SCHEDULE = {
-        'cache-warmup-hourly': {
-            'task': 'cache-warmup',
-            'schedule': crontab(minute=0, hour='*'),  # hourly
-            'kwargs': {
-                'strategy_name': 'top_n_dashboards',
-                'top_n': 5,
-                'since': '7 days ago',
-            },
-        },
-    }
-
-This will cache all the charts in the top 5 most popular dashboards every hour.
-For other strategies, check the `superset/tasks/cache.py` file.
-
-Caching Thumbnails
-------------------
-
-This is an optional feature that can be turned on by activating it's feature 
flag on config:
-
-.. code-block:: python
-
-    FEATURE_FLAGS = {
-        "THUMBNAILS": True,
-        "THUMBNAILS_SQLA_LISTENERS": True,
-    }
-
-
-For this feature you will need a cache system and celery workers. All 
thumbnails are store on cache and are processed
-asynchronously by the workers.
-
-An example config where images are stored on S3 could be:
-
-.. code-block:: python
-
-    from flask import Flask
-    from s3cache.s3cache import S3Cache
-
-    ...
-
-    class CeleryConfig(object):
-        BROKER_URL = "redis://localhost:6379/0"
-        CELERY_IMPORTS = ("superset.sql_lab", "superset.tasks", 
"superset.tasks.thumbnails")
-        CELERY_RESULT_BACKEND = "redis://localhost:6379/0"
-        CELERYD_PREFETCH_MULTIPLIER = 10
-        CELERY_ACKS_LATE = True
-
-
-    CELERY_CONFIG = CeleryConfig
-
-    def init_thumbnail_cache(app: Flask) -> S3Cache:
-        return S3Cache("bucket_name", 'thumbs_cache/')
-
-
-    THUMBNAIL_CACHE_CONFIG = init_thumbnail_cache
-    # Async selenium thumbnail task will use the following user
-    THUMBNAIL_SELENIUM_USER = "Admin"
-
-Using the above example cache keys for dashboards will be 
`superset_thumb__dashboard__{ID}`
-
-You can override the base URL for selenium using:
-
-.. code-block:: python
-
-    WEBDRIVER_BASEURL = "https://superset.company.com";
-
-
-Additional selenium web drive config can be set using `WEBDRIVER_CONFIGURATION`
-
-You can implement a custom function to authenticate selenium, the default uses 
flask-login session cookie.
-An example of a custom function signature:
-
-.. code-block:: python
-
-    def auth_driver(driver: WebDriver, user: "User") -> WebDriver:
-        pass
-
-
-Then on config:
-
-.. code-block:: python
-
-    WEBDRIVER_AUTH_FUNC = auth_driver
-
 Deeper SQLAlchemy integration
 -----------------------------

[incubator-superset] branch master updated: [docs] Add SSL config options for postgres (#9767)

Reply via email to