This is an automated email from the ASF dual-hosted git repository.

lfrolov pushed a commit to branch DATALAB-2998
in repository https://gitbox.apache.org/repos/asf/incubator-datalab.git


The following commit(s) were added to refs/heads/DATALAB-2998 by this push:
     new 420a87a03 [DATALAB-2998]: added new files and changes for zeppelin 
dataengine-service connection
420a87a03 is described below

commit 420a87a03eeb0c1c8890c0e3ca06323a63d3cddf
Author: leonidfrolov <[email protected]>
AuthorDate: Wed Sep 21 14:44:41 2022 +0300

    [DATALAB-2998]: added new files and changes for zeppelin dataengine-service 
connection
---
 .../files/azure/dataengine-service_Dockerfile      |   1 +
 .../src/general/lib/azure/actions_lib.py           | 139 ++++++++++++++++++
 ...common_notebook_configure_dataengine-service.py |  12 +-
 .../zeppelin_dataengine-service_create_configs.py  |  50 +++----
 .../zeppelin_install_dataengine-service_kernels.py |  29 ++--
 .../azure/dataengine-service_interpreter_livy.json | 159 +++++++++++++++++++++
 6 files changed, 340 insertions(+), 50 deletions(-)

diff --git 
a/infrastructure-provisioning/src/general/files/azure/dataengine-service_Dockerfile
 
b/infrastructure-provisioning/src/general/files/azure/dataengine-service_Dockerfile
index 2b443239b..966aaa5c2 100644
--- 
a/infrastructure-provisioning/src/general/files/azure/dataengine-service_Dockerfile
+++ 
b/infrastructure-provisioning/src/general/files/azure/dataengine-service_Dockerfile
@@ -33,6 +33,7 @@ COPY general/scripts/os/get_list_available_pkgs.py 
/root/scripts/get_list_availa
 COPY general/templates/os/inactive.sh /root/templates/
 COPY general/templates/os/inactive.service /root/templates/
 COPY general/templates/os/inactive.timer /root/templates/
+COPY general/templates/azure/dataengine-service_interpreter_livy.json 
/root/templates/dataengine-service_interpreter_livy.json
 
 RUN chmod a+x /root/fabfile.py; \
     chmod a+x /root/scripts/*
diff --git a/infrastructure-provisioning/src/general/lib/azure/actions_lib.py 
b/infrastructure-provisioning/src/general/lib/azure/actions_lib.py
index 6cb86ded6..63bdc793a 100644
--- a/infrastructure-provisioning/src/general/lib/azure/actions_lib.py
+++ b/infrastructure-provisioning/src/general/lib/azure/actions_lib.py
@@ -1209,6 +1209,145 @@ class AzureActions:
                                    file=sys.stdout)}))
             traceback.print_exc(file=sys.stdout)
 
+    def configure_zeppelin_hdinsight_interpreter(self, cluster_name, os_user, 
headnode_ip):
+        try:
+            # (self, emr_version, cluster_name, region, spark_dir, os_user, 
yarn_dir, bucket,
+            #                                            user_name, 
endpoint_url, multiple_emrs)
+            # port_number_found = False
+            # zeppelin_restarted = False
+            default_port = 8998
+            # get_cluster_python_version(region, bucket, user_name, 
cluster_name)
+            # with open('/tmp/python_version') as f:
+            #     python_version = f.read()
+            # python_version = python_version[0:5]
+            # livy_port = ''
+            # livy_path = '/opt/{0}/{1}/livy/'.format(emr_version, 
cluster_name)
+            # spark_libs = 
"/opt/{0}/jars/usr/share/aws/aws-java-sdk/aws-java-sdk-core*.jar " \
+            #              "/opt/{0}/jars/usr/lib/hadoop/hadoop-aws*.jar " \
+            #              
"/opt/{0}/jars/usr/share/aws/aws-java-sdk/aws-java-sdk-s3-*.jar " \
+            #              
"/opt/{0}/jars/usr/lib/hadoop-lzo/lib/hadoop-lzo-*.jar".format(emr_version)
+            # # fix due to: Multiple py4j files found under 
..../spark/python/lib
+            # # py4j-0.10.7-src.zip still in folder. Versions may varies.
+            # subprocess.run('rm 
/opt/{0}/{1}/spark/python/lib/py4j-src.zip'.format(emr_version, cluster_name),
+            #                shell=True, check=True)
+            #
+            # subprocess.run('echo \"Configuring emr path for Zeppelin\"', 
shell=True, check=True)
+            # subprocess.run('sed -i \"s/^export SPARK_HOME.*/export 
SPARK_HOME=\/opt\/{0}\/{1}\/spark/\" '
+            #                
'/opt/zeppelin/conf/zeppelin-env.sh'.format(emr_version, cluster_name), 
shell=True,
+            #                check=True)
+            # subprocess.run('sed -i "s/^export HADOOP_CONF_DIR.*/export 
HADOOP_CONF_DIR=' + \
+            #                '\/opt\/{0}\/{1}\/conf/" 
/opt/{0}/{1}/spark/conf/spark-env.sh'.format(emr_version,
+            #                                                                  
                    cluster_name),
+            #                shell=True, check=True)
+            # subprocess.run(
+            #     'echo \"spark.jars $(ls {0} | tr \'\\n\' \',\')\" >> 
/opt/{1}/{2}/spark/conf/spark-defaults.conf'
+            #     .format(spark_libs, emr_version, cluster_name), shell=True, 
check=True)
+            # subprocess.run('sed -i "/spark.executorEnv.PYTHONPATH/d" 
/opt/{0}/{1}/spark/conf/spark-defaults.conf'
+            #                .format(emr_version, cluster_name), shell=True, 
check=True)
+            # subprocess.run('sed -i "/spark.yarn.dist.files/d" 
/opt/{0}/{1}/spark/conf/spark-defaults.conf'
+            #                .format(emr_version, cluster_name), shell=True, 
check=True)
+            # subprocess.run('sudo chown {0}:{0} -R 
/opt/zeppelin/'.format(os_user), shell=True, check=True)
+            # subprocess.run('sudo systemctl daemon-reload', shell=True, 
check=True)
+            # subprocess.run('sudo service zeppelin-notebook stop', 
shell=True, check=True)
+            # subprocess.run('sudo service zeppelin-notebook start', 
shell=True, check=True)
+            # while not zeppelin_restarted:
+            #     subprocess.run('sleep 5', shell=True, check=True)
+            #     result = subprocess.run('sudo bash -c "nmap -p 8080 
localhost | grep closed > /dev/null" ; echo $?',
+            #                             capture_output=True, shell=True, 
check=True).stdout.decode('UTF-8').rstrip(
+            #         "\n\r")
+            #     result = result[:1]
+            #     if result == '1':
+            #         zeppelin_restarted = True
+            # subprocess.run('sleep 5', shell=True, check=True)
+            subprocess.run('echo \"Configuring emr spark interpreter for 
Zeppelin\"', shell=True, check=True)
+            if False: #multiple_emrs == 'true':
+                pass
+                # while not port_number_found:
+                #     port_free = subprocess.run('sudo bash -c "nmap -p ' + 
str(default_port) +
+                #                                ' localhost | grep closed > 
/dev/null" ; echo $?', capture_output=True,
+                #                                shell=True, 
check=True).stdout.decode('UTF-8').rstrip("\n\r")
+                #     port_free = port_free[:1]
+                #     if port_free == '0':
+                #         livy_port = default_port
+                #         port_number_found = True
+                #     else:
+                #         default_port += 1
+                # subprocess.run(
+                #     'sudo echo "livy.server.port = {0}" >> 
{1}conf/livy.conf'.format(str(livy_port), livy_path),
+                #     shell=True, check=True)
+                # subprocess.run('sudo echo "livy.spark.master = yarn" >> 
{}conf/livy.conf'.format(livy_path), shell=True,
+                #                check=True)
+                # if 
os.path.exists('{}conf/spark-blacklist.conf'.format(livy_path)):
+                #     subprocess.run('sudo sed -i "s/^/#/g" 
{}conf/spark-blacklist.conf'.format(livy_path), shell=True,
+                #                    check=True)
+                # subprocess.run(
+                #     ''' sudo echo "export SPARK_HOME={0}" >> 
{1}conf/livy-env.sh'''.format(spark_dir, livy_path),
+                #     shell=True, check=True)
+                # subprocess.run(
+                #     ''' sudo echo "export HADOOP_CONF_DIR={0}" >> 
{1}conf/livy-env.sh'''.format(yarn_dir, livy_path),
+                #     shell=True, check=True)
+                # subprocess.run(''' sudo echo "export 
PYSPARK3_PYTHON=python{0}" >> {1}conf/livy-env.sh'''.format(
+                #     python_version[0:3],
+                #     livy_path), shell=True, check=True)
+                # template_file = "/tmp/dataengine-service_interpreter.json"
+                # fr = open(template_file, 'r+')
+                # text = fr.read()
+                # text = text.replace('CLUSTER_NAME', cluster_name)
+                # text = text.replace('SPARK_HOME', spark_dir)
+                # text = text.replace('ENDPOINTURL', endpoint_url)
+                # text = text.replace('LIVY_PORT', str(livy_port))
+                # fw = open(template_file, 'w')
+                # fw.write(text)
+                # fw.close()
+                # for _ in range(5):
+                #     try:
+                #         subprocess.run("curl --noproxy localhost -H 
'Content-Type: application/json' -X POST -d " +
+                #                        
"@/tmp/dataengine-service_interpreter.json 
http://localhost:8080/api/interpreter/setting",
+                #                        shell=True, check=True)
+                #         break
+                #     except:
+                #         subprocess.run('sleep 5', shell=True, check=True)
+                # subprocess.run('sudo cp /opt/livy-server-cluster.service 
/etc/systemd/system/livy-server-{}.service'
+                #                .format(str(livy_port)), shell=True, 
check=True)
+                # subprocess.run("sudo sed -i 's|OS_USER|{0}|' 
/etc/systemd/system/livy-server-{1}.service"
+                #                .format(os_user, str(livy_port)), shell=True, 
check=True)
+                # subprocess.run("sudo sed -i 's|LIVY_PATH|{0}|' 
/etc/systemd/system/livy-server-{1}.service"
+                #                .format(livy_path, str(livy_port)), 
shell=True, check=True)
+                # subprocess.run('sudo chmod 644 
/etc/systemd/system/livy-server-{}.service'.format(str(livy_port)),
+                #                shell=True, check=True)
+                # subprocess.run("sudo systemctl daemon-reload", shell=True, 
check=True)
+                # subprocess.run("sudo systemctl enable 
livy-server-{}".format(str(livy_port)), shell=True, check=True)
+                # subprocess.run('sudo systemctl start 
livy-server-{}'.format(str(livy_port)), shell=True, check=True)
+            else:
+                template_file = "/tmp/dataengine-service_interpreter.json"
+                fr = open(template_file, 'r+')
+                text = fr.read()
+                text = text.replace('CLUSTERNAME', cluster_name)
+                text = text.replace('HEADNODEIP', headnode_ip)
+                text = text.replace('PORT', default_port)
+                    # text = text.replace('PYTHONVERSION', p_version)
+                    # text = text.replace('SPARK_HOME', spark_dir)
+                    # text = text.replace('PYTHONVER_SHORT', p_version[:1])
+                    # text = text.replace('ENDPOINTURL', endpoint_url)
+                    # text = text.replace('DATAENGINE-SERVICE_VERSION', 
emr_version)
+                tmp_file = "/tmp/hdinsight_interpreter_livy.json"
+                fw = open(tmp_file, 'w')
+                fw.write(text)
+                fw.close()
+                for _ in range(5):
+                    try:
+                        subprocess.run("curl --noproxy localhost -H 
'Content-Type: application/json' -X POST "
+                                       "-d 
@/tmp/hdinsight_interpreter_livy.json "
+                                       
"http://localhost:8080/api/interpreter/setting";,
+                                       shell=True, check=True)
+                        break
+                    except:
+                        subprocess.run('sleep 5', shell=True, check=True)
+            subprocess.run(
+                'touch /home/' + os_user + '/.ensure_dir/dataengine-service_' 
+ cluster_name + '_interpreter_ensured',
+                shell=True, check=True)
+        except:
+            sys.exit(1)
 
 def ensure_local_jars(os_user, jars_dir):
     if not 
exists(datalab.fab.conn,'/home/{}/.ensure_dir/local_jars_ensured'.format(os_user)):
diff --git 
a/infrastructure-provisioning/src/general/scripts/azure/common_notebook_configure_dataengine-service.py
 
b/infrastructure-provisioning/src/general/scripts/azure/common_notebook_configure_dataengine-service.py
index 92484657f..f198ba49f 100644
--- 
a/infrastructure-provisioning/src/general/scripts/azure/common_notebook_configure_dataengine-service.py
+++ 
b/infrastructure-provisioning/src/general/scripts/azure/common_notebook_configure_dataengine-service.py
@@ -72,6 +72,9 @@ if __name__ == "__main__":
                                                    
notebook_config['project_name'], notebook_config['endpoint_tag'])
     edge_instance_hostname = 
AzureMeta.get_private_ip_address(notebook_config['resource_group_name'],
                                                               
edge_instance_name)
+    notebook_config['headnode_ip'] = 
datalab.fab.get_hdinsight_headnode_private_ip(os.environ['conf_os_user'],
+                                                                               
    notebook_config['cluster_name'],
+                                                                               
    notebook_config['key_path'])
 
     if os.environ['application'] == 'deeplearning':
         application = 'jupyter'
@@ -80,13 +83,14 @@ if __name__ == "__main__":
 
     try:
         logging.info('[INSTALLING KERNELS INTO SPECIFIED NOTEBOOK]')
-        params = "--bucket {} --cluster_name {} --dataproc_version {} 
--keyfile {} --notebook_ip {} --region {} " \
+        params = "--bucket {} --cluster_name {} --hdinsight_version {} 
--keyfile {} --notebook_ip {} --region {} " \
                  "--edge_user_name {} --project_name {} --os_user {}  
--edge_hostname {} --proxy_port {} " \
-                 "--scala_version {} --application {}" \
-            .format(notebook_config['storage_account_name_tag'], 
notebook_config['cluster_name'], os.environ['dataproc_version'],
+                 "--scala_version {} --application {} --headnode_ip" \
+            .format(notebook_config['storage_account_name_tag'], 
notebook_config['cluster_name'], os.environ['hdinsight_version'],
                     notebook_config['key_path'], 
notebook_config['notebook_ip'], os.environ['gcp_region'],
                     notebook_config['edge_user_name'], 
notebook_config['project_name'], os.environ['conf_os_user'],
-                    edge_instance_hostname, '3128', 
os.environ['notebook_scala_version'], os.environ['application'])
+                    edge_instance_hostname, '3128', 
os.environ['notebook_scala_version'], os.environ['application'],
+                    notebook_config['headnode_ip'])
         try:
             subprocess.run("~/scripts/{}_{}.py {}".format(application, 
'install_dataengine-service_kernels', params), 
                            shell=True, check=True)
diff --git 
a/infrastructure-provisioning/src/general/scripts/azure/zeppelin_dataengine-service_create_configs.py
 
b/infrastructure-provisioning/src/general/scripts/azure/zeppelin_dataengine-service_create_configs.py
index f645b64b4..ad8cc6731 100644
--- 
a/infrastructure-provisioning/src/general/scripts/azure/zeppelin_dataengine-service_create_configs.py
+++ 
b/infrastructure-provisioning/src/general/scripts/azure/zeppelin_dataengine-service_create_configs.py
@@ -23,7 +23,7 @@
 
 import argparse
 import subprocess
-from datalab.actions_lib import jars, yarn, install_emr_spark, spark_defaults, 
installing_python, configure_zeppelin_emr_interpreter
+from datalab.actions_lib import jars, yarn, install_hdinsight_spark, 
spark_defaults, installing_python, configure_zeppelin_hdinsight_interpreter
 from datalab.common_lib import *
 from datalab.fab import configuring_notebook, update_zeppelin_interpreters
 from datalab.notebook_lib import *
@@ -33,7 +33,7 @@ parser = argparse.ArgumentParser()
 parser.add_argument('--bucket', type=str, default='')
 parser.add_argument('--cluster_name', type=str, default='')
 parser.add_argument('--dry_run', type=str, default='false')
-parser.add_argument('--emr_version', type=str, default='')
+parser.add_argument('--hdinsight_version', type=str, default='')
 parser.add_argument('--spark_version', type=str, default='')
 parser.add_argument('--scala_version', type=str, default='')
 parser.add_argument('--hadoop_version', type=str, default='')
@@ -49,25 +49,26 @@ parser.add_argument('--multiple_clusters', type=str, 
default='')
 parser.add_argument('--numpy_version', type=str, default='')
 parser.add_argument('--application', type=str, default='')
 parser.add_argument('--r_enabled', type=str, default='')
+parser.add_argument('--headnode_ip', type=str, default='')
 args = parser.parse_args()
 
-emr_dir = '/opt/' + args.emr_version + '/jars/'
+hdinsight_dir = '/opt/' + args.hdinsight_version + '/jars/'
 kernels_dir = '/home/' + args.os_user + '/.local/share/jupyter/kernels/'
-spark_dir = '/opt/' + args.emr_version + '/' + args.cluster_name + '/spark/'
-yarn_dir = '/opt/' + args.emr_version + '/' + args.cluster_name + '/conf/'
+spark_dir = '/opt/' + args.hdinsight_version + '/' + args.cluster_name + 
'/spark/'
+yarn_dir = '/opt/' + args.hdinsight_version + '/' + args.cluster_name + 
'/conf/'
 
 
 def install_remote_livy(args):
     subprocess.run('sudo chown ' + args.os_user + ':' + args.os_user + ' -R 
/opt/zeppelin/', shell=True, check=True)
     subprocess.run('sudo service zeppelin-notebook stop', shell=True, 
check=True)
     subprocess.run('sudo -i wget 
http://archive.cloudera.com/beta/livy/livy-server-' + args.livy_version + '.zip 
-O /opt/'
-          + args.emr_version + '/' + args.cluster_name + '/livy-server-' + 
args.livy_version + '.zip', shell=True, check=True)
+          + args.hdinsight_version + '/' + args.cluster_name + '/livy-server-' 
+ args.livy_version + '.zip', shell=True, check=True)
     subprocess.run('sudo unzip /opt/'
-          + args.emr_version + '/' + args.cluster_name + '/livy-server-' + 
args.livy_version + '.zip -d /opt/'
-          + args.emr_version + '/' + args.cluster_name + '/', shell=True, 
check=True)
-    subprocess.run('sudo mv /opt/' + args.emr_version + '/' + 
args.cluster_name + '/livy-server-' + args.livy_version +
-          '/ /opt/' + args.emr_version + '/' + args.cluster_name + '/livy/', 
shell=True, check=True)
-    livy_path = '/opt/' + args.emr_version + '/' + args.cluster_name + '/livy/'
+          + args.hdinsight_version + '/' + args.cluster_name + '/livy-server-' 
+ args.livy_version + '.zip -d /opt/'
+          + args.hdinsight_version + '/' + args.cluster_name + '/', 
shell=True, check=True)
+    subprocess.run('sudo mv /opt/' + args.hdinsight_version + '/' + 
args.cluster_name + '/livy-server-' + args.livy_version +
+          '/ /opt/' + args.hdinsight_version + '/' + args.cluster_name + 
'/livy/', shell=True, check=True)
+    livy_path = '/opt/' + args.hdinsight_version + '/' + args.cluster_name + 
'/livy/'
     subprocess.run('sudo mkdir -p ' + livy_path + '/logs', shell=True, 
check=True)
     subprocess.run('sudo mkdir -p /var/run/livy', shell=True, check=True)
     subprocess.run('sudo chown ' + args.os_user + ':' + args.os_user + ' -R 
/var/run/livy', shell=True, check=True)
@@ -78,17 +79,16 @@ if __name__ == "__main__":
     if args.dry_run == 'true':
         parser.print_help()
     else:
-        result = prepare(emr_dir, yarn_dir)
-        if result == False :
-            jars(args, emr_dir)
-        yarn(args, yarn_dir)
-        install_emr_spark(args)
-        spark_defaults(args)
-        configuring_notebook(args.emr_version)
-        if args.multiple_clusters == 'true':
-            install_remote_livy(args)
-        installing_python(args.region, args.bucket, args.project_name, 
args.cluster_name, args.application,
-                          args.numpy_version, args.matplotlib_version)
-        configure_zeppelin_emr_interpreter(args.emr_version, 
args.cluster_name, args.region, spark_dir, args.os_user,
-                                           yarn_dir, args.bucket, 
args.project_name, endpoint_url, args.multiple_clusters)
-        update_zeppelin_interpreters(args.multiple_clusters, args.r_enabled)
+        # result = prepare(hdinsight_dir, yarn_dir)
+        # if result == False :
+        #     jars(args, hdinsight_dir)
+        # yarn(args, yarn_dir)
+        # install_hdinsight_spark(args)
+        # spark_defaults(args)
+        # configuring_notebook(args.hdinsight_version)
+        # if args.multiple_clusters == 'true':
+        #     install_remote_livy(args)
+        # installing_python(args.region, args.bucket, args.project_name, 
args.cluster_name, args.application,
+        #                   args.numpy_version, args.matplotlib_version)
+        configure_zeppelin_hdinsight_interpreter(args.cluster_name, 
args.os_user, args.headnode_ip)
+        # update_zeppelin_interpreters(args.multiple_clusters, args.r_enabled)
diff --git 
a/infrastructure-provisioning/src/general/scripts/azure/zeppelin_install_dataengine-service_kernels.py
 
b/infrastructure-provisioning/src/general/scripts/azure/zeppelin_install_dataengine-service_kernels.py
index b8ea051d1..2ffbf2a7c 100644
--- 
a/infrastructure-provisioning/src/general/scripts/azure/zeppelin_install_dataengine-service_kernels.py
+++ 
b/infrastructure-provisioning/src/general/scripts/azure/zeppelin_install_dataengine-service_kernels.py
@@ -30,17 +30,18 @@ parser = argparse.ArgumentParser()
 parser.add_argument('--bucket', type=str, default='')
 parser.add_argument('--cluster_name', type=str, default='')
 parser.add_argument('--dry_run', type=str, default='false')
-parser.add_argument('--emr_version', type=str, default='')
+parser.add_argument('--hdinsight_version', type=str, default='')
 parser.add_argument('--keyfile', type=str, default='')
 parser.add_argument('--region', type=str, default='')
 parser.add_argument('--notebook_ip', type=str, default='')
 parser.add_argument('--scala_version', type=str, default='')
-parser.add_argument('--emr_excluded_spark_properties', type=str, default='')
+parser.add_argument('--hdinsight_excluded_spark_properties', type=str, 
default='')
 parser.add_argument('--project_name', type=str, default='')
 parser.add_argument('--os_user', type=str, default='')
 parser.add_argument('--edge_hostname', type=str, default='')
 parser.add_argument('--proxy_port', type=str, default='')
 parser.add_argument('--application', type=str, default='')
+parser.add_argument('--headnode_ip', type=str, default='')
 args = parser.parse_args()
 
 
@@ -77,31 +78,16 @@ if __name__ == "__main__":
     numpy_version = os.environ['notebook_numpy_version']
     matplotlib_version = os.environ['notebook_matplotlib_version']
     command = "/usr/bin/python3 
/usr/local/bin/zeppelin_dataengine-service_create_configs.py " \
-              "--bucket {0} " \
               "--cluster_name {1} " \
-              "--emr_version {2} " \
-              "--spark_version {3} " \
-              "--hadoop_version {4} " \
-              "--region {5} " \
-              "--excluded_lines '{6}' " \
-              "--project_name {7} " \
               "--os_user {8} " \
-              "--edge_hostname {9} " \
-              "--proxy_port {10} " \
-              "--scala_version {11} " \
-              "--livy_version {12} " \
-              "--multiple_clusters {13} " \
-              "--numpy_version {14} " \
-              "--matplotlib_version {15} " \
-              "--application {16} " \
-              "--r_enabled {17}" \
+              "--headnode_ip {18}" \
         .format(args.bucket,
                 args.cluster_name,
-                args.emr_version,
+                args.hdinsight_version,
                 spark_version,
                 hadoop_version,
                 args.region,
-                args.emr_excluded_spark_properties,
+                args.hdinsight_excluded_spark_properties,
                 args.project_name,
                 args.os_user,
                 args.edge_hostname,
@@ -112,5 +98,6 @@ if __name__ == "__main__":
                 numpy_version,
                 matplotlib_version,
                 args.application,
-                r_enabled)
+                r_enabled,
+                args.headnode_ip)
     conn.sudo(command)
diff --git 
a/infrastructure-provisioning/src/general/templates/azure/dataengine-service_interpreter_livy.json
 
b/infrastructure-provisioning/src/general/templates/azure/dataengine-service_interpreter_livy.json
new file mode 100644
index 000000000..5bd79823e
--- /dev/null
+++ 
b/infrastructure-provisioning/src/general/templates/azure/dataengine-service_interpreter_livy.json
@@ -0,0 +1,159 @@
+{
+   "id":"livy",
+   "name":"CLUSTERNAME",
+   "group":"livy",
+   "properties":{
+      "zeppelin.livy.url":{
+         "name":"zeppelin.livy.url",
+         "value":"https://HEADNODEIP:PORT";,
+         "type":"url",
+         "description":"The URL for Livy Server."
+      },
+      "zeppelin.livy.session.create_timeout":{
+         "name":"zeppelin.livy.session.create_timeout",
+         "value":"120",
+         "type":"number",
+         "description":"Livy Server create session timeout (seconds)."
+      },
+      "livy.spark.driver.memory":{
+         "name":"livy.spark.driver.memory",
+         "value":"1g",
+         "type":"string",
+         "description":"Driver memory. ex) 512m, 32g"
+      },
+      "zeppelin.livy.pull_status.interval.millis":{
+         "name":"zeppelin.livy.pull_status.interval.millis",
+         "value":"1000",
+         "type":"number",
+         "description":"The interval for checking paragraph execution status"
+      },
+      "zeppelin.livy.maxLogLines":{
+         "name":"zeppelin.livy.maxLogLines",
+         "value":"1000",
+         "type":"number",
+         "description":"Max number of lines of logs"
+      },
+      "livy.spark.jars.packages":{
+         "name":"livy.spark.jars.packages",
+         "value":"",
+         "type":"textarea",
+         "description":"Adding extra libraries to livy interpreter"
+      },
+      "zeppelin.livy.displayAppInfo":{
+         "name":"zeppelin.livy.displayAppInfo",
+         "value":true,
+         "type":"checkbox",
+         "description":"Whether display app info"
+      },
+      "zeppelin.livy.spark.sql.maxResult":{
+         "name":"zeppelin.livy.spark.sql.maxResult",
+         "value":"1000",
+         "type":"number",
+         "description":"Max number of Spark SQL result to display."
+      },
+      "zeppelin.livy.spark.sql.field.truncate":{
+         "name":"zeppelin.livy.spark.sql.field.truncate",
+         "value":true,
+         "type":"checkbox",
+         "description":"If true, truncate field values longer than 20 
characters."
+      },
+      "zeppelin.R.knitr":{
+         "name":"zeppelin.R.knitr",
+         "value":"true",
+         "type":"string"
+      },
+      "zeppelin.R.image.width":{
+         "name":"zeppelin.R.image.width",
+         "value":"100%",
+         "type":"string"
+      },
+      "zeppelin.R.cmd":{
+         "name":"zeppelin.R.cmd",
+         "value":"R",
+         "type":"string"
+      },
+      "zeppelin.R.render.options":{
+         "name":"zeppelin.R.render.options",
+         "value":"out.format \u003d \u0027html\u0027, comment \u003d NA, echo 
\u003d FALSE, results \u003d \u0027asis\u0027, message \u003d F, warning \u003d 
F",
+         "type":"string"
+      }
+   },
+   "status":"READY",
+   "interpreterGroup":[
+      {
+         "name":"spark",
+         "class":"org.apache.zeppelin.livy.LivySparkInterpreter",
+         "defaultInterpreter":true,
+         "editor":{
+            "language":"scala",
+            "editOnDblClick":false,
+            "completionKey":"TAB",
+            "completionSupport":true
+         }
+      },
+      {
+         "name":"sql",
+         "class":"org.apache.zeppelin.livy.LivySparkSQLInterpreter",
+         "defaultInterpreter":false,
+         "editor":{
+            "language":"sql",
+            "editOnDblClick":false,
+            "completionKey":"TAB",
+            "completionSupport":true
+         }
+      },
+      {
+         "name":"pyspark",
+         "class":"org.apache.zeppelin.livy.LivyPySparkInterpreter",
+         "defaultInterpreter":false,
+         "editor":{
+            "language":"python",
+            "editOnDblClick":false,
+            "completionKey":"TAB",
+            "completionSupport":true
+         }
+      },
+      {
+         "name":"pyspark3",
+         "class":"org.apache.zeppelin.livy.LivyPySpark3Interpreter",
+         "defaultInterpreter":false,
+         "editor":{
+            "language":"python",
+            "editOnDblClick":false,
+            "completionKey":"TAB",
+            "completionSupport":true
+         }
+      },
+      {
+         "name":"sparkr",
+         "class":"org.apache.zeppelin.livy.LivySparkRInterpreter",
+         "defaultInterpreter":false,
+         "editor":{
+            "language":"r",
+            "editOnDblClick":false,
+            "completionKey":"TAB",
+            "completionSupport":true
+         }
+      },
+      {
+         "name":"shared",
+         "class":"org.apache.zeppelin.livy.LivySharedInterpreter",
+         "defaultInterpreter":false
+      }
+   ],
+   "dependencies":[
+
+   ],
+   "option":{
+      "remote":true,
+      "port":-1,
+      "perNote":"shared",
+      "perUser":"scoped",
+      "isExistingProcess":false,
+      "setPermission":false,
+      "owners":[
+
+      ],
+      "isUserImpersonate":false
+   }
+}
\ No newline at end of file


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to