[ https://issues.apache.org/jira/browse/HAWQ-1252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15818996#comment-15818996 ]
ASF GitHub Bot commented on HAWQ-1252: -------------------------------------- Github user dyozie commented on a diff in the pull request: https://github.com/apache/incubator-hawq-docs/pull/83#discussion_r95659257 --- Diff: markdown/plext/using_pljava.html.md.erb --- @@ -2,205 +2,222 @@ title: Using PL/Java --- -This section contains an overview of the HAWQ PL/Java language. +This section provides an overview of the HAWQ PL/Java procedural language extension. ## <a id="aboutpljava"></a>About PL/Java -With the HAWQ PL/Java extension, you can write Java methods using your favorite Java IDE and install the JAR files that implement the methods in your HAWQ cluster. +With the PL/Java extension, you can write Java methods using your favorite Java IDE and invoke the methods from PostgreSQL user-defined functions (UDFs). -**Note**: If building HAWQ from source, you must specify PL/Java as a build option when compiling HAWQ. To use PL/Java in a HAWQ deployment, you must explicitly enable the PL/Java extension in all desired databases. +The HAWQ PL/Java package is based on the open source PL/Java 1.4.0 and provides the following features: -The HAWQ PL/Java package is based on the open source PL/Java 1.4.0. HAWQ PL/Java provides the following features. - -- Ability to execute PL/Java functions with Java 1.6 or 1.7. -- Standardized utilities (modeled after the SQL 2003 proposal) to install and maintain Java code in the database. -- Standardized mappings of parameters and result. Complex types as well as sets are supported. -- An embedded, high performance, JDBC driver utilizing the internal HAWQ Database SPI routines. +- PL/Java function execution with Java 1.6 or 1.7. +- Standardized mappings of Java and PostgreSQL parameters and results. Complex types as well as sets are supported. +- Two HAWQ Java languages: + - `java` - Tusted PL/Java language + - `javau` - Untrusted PL/Java language (superusers only) +- An embedded, high performance, JDBC driver utilizing the internal HAWQ Database Server Programming Interface (SPI). - Metadata support for the JDBC driver. Both `DatabaseMetaData` and `ResultSetMetaData` are included. -- The ability to return a `ResultSet` from a query as an alternative to building a ResultSet row by row. +- A standard way of passing parameters and return values. Complex types and sets are passed using the standard JDBC `ResultSet` class. +- The ability to return a `ResultSet` from a query as an alternative to building a `ResultSet` row by row. - Full support for savepoints and exception handling. -- The ability to use IN, INOUT, and OUT parameters. -- Two separate HAWQ languages: - - pljava, TRUSTED PL/Java language - - pljavau, UNTRUSTED PL/Java language - Transaction and Savepoint listeners enabling code execution when a transaction or savepoint is committed or rolled back. - Integration with GNU GCJ on selected platforms. -A function in SQL will appoint a static method in a Java class. In order for the function to execute, the appointed class must available on the class path specified by the HAWQ server configuration parameter `pljava_classpath`. The PL/Java extension adds a set of functions that helps to install and maintain the Java classes. Classes are stored in normal Java archives, JAR files. A JAR file can optionally contain a deployment descriptor that in turn contains SQL commands to be executed when the JAR is deployed or undeployed. The functions are modeled after the standards proposed for SQL 2003. - -PL/Java implements a standard way of passing parameters and return values. Complex types and sets are passed using the standard JDBC ResultSet class. - -A JDBC driver is included in PL/Java. This driver calls HAWQ internal SPI routines. The driver is essential since it is common for functions to make calls back to the database to fetch data. When PL/Java functions fetch data, they must use the same transactional boundaries that are used by the main function that entered PL/Java execution context. +PL/Java is optimized for performance. The Java virtual machine executes within the same process as the backend, minimizing call overhead. PL/Java brings the power of Java to the database itself, enabling the database-intensive business logic to execute as close to the actual data as possible. -PL/Java is optimized for performance. The Java virtual machine executes within the same process as the backend to minimize call overhead. PL/Java is designed with the objective to enable the power of Java to the database itself so that database intensive business logic can execute as close to the actual data as possible. - -The standard Java Native Interface (JNI) is used when bridging calls between the backend and the Java VM. +Calls between the HAWQ backend and the Java VM are bridged using the standard Java Native Interface (JNI). ## <a id="abouthawqpljava"></a>About HAWQ PL/Java +PL/Java is embedded in your HAWQ product distribution or within your HAWQ build if you chose to enable it as a build option. To use PL/Java in a HAWQ deployment, you must explicitly enable the PL/Java extension in all desired databases. + There are a few key differences between the implementation of PL/Java in standard PostgreSQL and HAWQ. -### <a id="pljavafunctions"></a>Functions +### <a id="pljavafunctions"></a>Functions Not Supported -The following functions are not supported in HAWQ. The classpath is handled differently in a distributed HAWQ environment than in the PostgreSQL environment. +The following functions are not supported in HAWQ. The PL/Java classpath is handled differently in a distributed HAWQ environment than that of the PostgreSQL environment. -- sqlj.install_jar -- sqlj.install_jar -- sqlj.replace_jar -- sqlj.remove_jar -- sqlj.get_classpath -- sqlj.set_classpath +- sqlj.install_jar() +- sqlj.install_jar() +- sqlj.replace_jar() +- sqlj.remove_jar() +- sqlj.get_classpath() +- sqlj.set_classpath() -HAWQ uses the `pljava_classpath` server configuration parameter in place of the `sqlj.set_classpath` function. +HAWQ uses the `pljava_classpath` server configuration parameter in place of the `sqlj.set_classpath()` function. ### <a id="serverconfigparams"></a>Server Configuration Parameters -PL/Java uses server configuration parameters to configure classpath, Java VM, and other options. Refer to the [Server Configuration Parameter Reference](../reference/HAWQSiteConfig.html) for general information about HAWQ server configuration parameters. +PL/Java uses server configuration parameters to configure the classpath, Java Virtual Machine (VM), and other options. Refer to the [Server Configuration Parameter Reference](../reference/HAWQSiteConfig.html) for general information about HAWQ server configuration parameters. -The following server configuration parameters are used by PL/Java in HAWQ. These parameters replace the `pljava.*` parameters that are used in the standard PostgreSQL PL/Java implementation. +The following server configuration parameters are used by HAWQ PL/Java. These parameters replace the `pljava.*` configuration parameters that are used in the standard PostgreSQL PL/Java implementation. #### pljava\_classpath -A colon (:) separated list of the jar files containing the Java classes used in any PL/Java functions. The jar files must be installed in the same locations on all HAWQ hosts. With the trusted PL/Java language handler, jar file paths must be relative to the `$GPHOME/lib/postgresql/java/` directory. With the untrusted language handler (javaU language tag), paths may be relative to `$GPHOME/lib/postgresql/java/` or absolute. +A PL/Java user-defined function will appoint a static method in a Java class. In order for the function to execute, the appointed class must be available on the class path specified by the [`pljava_classpath`](../reference/guc/parameter_definitions.html#pljava_classpath) HAWQ server configuration parameter. -#### pljava\_statement\_cache\_size - -Sets the size in KB of the Most Recently Used (MRU) cache for prepared statements. +`pljava_classpath` is a colon (:) separated list of the JAR files implementing the Java classes referenced in any PL/Java user-defined functions. The JAR files must be installed in the same locations on all HAWQ hosts. When using the trusted PL/Java language handler (`java` language tag), JAR file paths must be relative to the `$GPHOME/lib/postgresql/java/` directory. Relative or absolute JAR file paths may be used with the untrusted PL/Java language handler (`javau` language tag). #### pljava\_release\_lingering\_savepoints -If TRUE, lingering savepoints will be released on function exit. If FALSE, they will be rolled back. +[`pljava_release_lingering_savepoints`](../reference/guc/parameter_definitions.html#pljava_release_lingering_savepoints) identifies the action to perform when lingering savepoints exist on exit of a PL/Java user-defined function. If TRUE, lingering savepoints will be released on function exit. If FALSE, lingering savepoints will be rolled back. + +#### pljava\_statement\_cache\_size + +[`pljava_statement_cache_size`](../reference/guc/parameter_definitions.html#pljava_statement_cache_size) sets the size (in KB) of the Most Recently Used (MRU) cache for prepared statements. #### pljava\_vmoptions -Defines the start up options for the Java VM. +[`pljava_vmoptions`](../reference/guc/parameter_definitions.html#pljava_vmoptions) defines the start-up options for the Java VM. ### <a id="setting_serverconfigparams"></a>Setting PL/Java Configuration Parameters -You can set PL/Java server configuration parameters at the session level, or globally across your whole cluster. Your HAWQ cluster configuration must be reloaded after setting a server configuration value globally. +You can set PL/Java server configuration parameters at the session level, or globally across your whole cluster. #### <a id="setsrvrcfg_global"></a>Cluster Level You will perform different procedures to set a PL/Java server configuration parameter for your whole HAWQ cluster depending upon whether you manage your cluster from the command line or use Ambari. If you use Ambari to manage your HAWQ cluster, you must ensure that you update server configuration parameters only via the Ambari Web UI. If you manage your HAWQ cluster from the command line, you will use the `hawq config` command line utility to set PL/Java server configuration parameters. +Your HAWQ cluster configuration must be reloaded after setting a server configuration value globally. + The following examples add a JAR file named `myclasses.jar` to the `pljava_classpath` server configuration parameter for the entire HAWQ cluster. If you use Ambari to manage your HAWQ cluster: -1. Set the `pljava_classpath` configuration property to include `myclasses.jar` via the HAWQ service **Configs > Advanced > Custom hawq-site** drop down. +1. Set the `pljava_classpath` configuration property **Key** **Value** to include `myclasses.jar` via the HAWQ service **Configs > Advanced > Custom hawq-site** drop down. 2. Select **Service Actions > Restart All** to load the updated configuration. If you manage your HAWQ cluster from the command line: -1. Log in to the HAWQ master host as a HAWQ administrator and source the file `/usr/local/hawq/greenplum_path.sh`. +1. Log in to the HAWQ master node as a HAWQ administrator and set up your HAWQ environment. ``` shell - $ source /usr/local/hawq/greenplum_path.sh + $ ssh gpadmin@master + gpadmin@master$ source /usr/local/hawq/greenplum_path.sh ``` 1. Use the `hawq config` utility to set `pljava_classpath`: ``` shell - $ hawq config -c pljava_classpath -v \'myclasses.jar\' + gpadmin@master$ hawq config -c pljava_classpath -v 'myclasses.jar' ``` + + The `-c` option identifies the name of the server configuration parameter, `-v` identifies the new value. + 2. Reload the HAWQ configuration: ``` shell - $ hawq stop cluster -u + gpadmin@master$ hawq stop cluster -u ``` #### <a id="setsrvrcfg_session"></a>Session Level -To set a PL/Java server configuration parameter for only the *current* database session, set the parameter within the `psql` subsystem. For example, to set `pljava_classpath`: +To set a PL/Java server configuration parameter for only the *current* database session, set the parameter within the `psql` subsystem. For example: ``` sql => SET pljava_classpath='myclasses.jar'; ``` +## <a id="security"></a>Security -## <a id="enablepljava"></a>Enabling and Removing PL/Java Support +This section describes specific security considerations when using PL/Java. + +### <a id="installation"></a>Enable/Disable + +Only a database superuser can enable/disable PL/Java. The PL/Java utility functions are created using `SECURITY DEFINER` privileges; they execute with the access permissions that were granted to the creator of the functions. + +### <a id="trustedlang"></a>Trusted/Untrusted Language -The PL/Java extension must be explicitly enabled on each database in which it will be used. +PL/Java includes trusted and untrusted language variants. +The PL/Java trusted language is named `java`. The trusted PL/Java language has no access to the file system as stipulated by PostgreSQL definition of a trusted language. Any database user can create and execute functions in the trusted `java` language. + +The PL/Java untrusted language is named `javau`. `javau` offers no way of restricting what you can program in UDFs created with the language. Creating untrusted PL/Java UDFs is permitted only by database superusers. Executing untrusted PL/Java UDFs is permitted only by database superusers and other database users explicitly `GRANT`ed the permissions. +## <a id="enablepljava"></a>Enabling and Removing PL/Java Support + +The PL/Java extension must be explicitly enabled on each database in which it will be used. To remove PL/Java support, the extension must be explicitly removed from each database in which it was used. ### <a id="pljavaprereq"></a>Prerequisites Before you enable PL/Java: 1. Ensure that you have installed a supported Java runtime environment and that the `$JAVA_HOME` variable is set to the same path on the master and all segment nodes. -2. Perform the following step on all machines to set up `ldconfig` for the installed JDK: +2. Perform the following step on all nodes in your HAWQ cluster to set up the shared library cache for the installed JDK: ``` shell - $ echo "$JAVA_HOME/jre/lib/amd64/server" > /etc/ld.so.conf.d/libjdk.conf - $ ldconfig + root@hawq-node$ echo "$JAVA_HOME/jre/lib/amd64/server" > /etc/ld.so.conf.d/libjdk.conf + root@hawq-node$ ldconfig ``` 4. Make sure that your HAWQ cluster is running, you have sourced `greenplum_path.sh` and that your `$GPHOME` environment variable is set. + ``` shell + $ ssh gpadmin@master + gpadmin@master$ source /usr/local/hawq/greenplum_path.sh + gpadmin@master$ hawq state + ``` ### <a id="enablepljava"></a>Enable PL/Java and Install JAR Files -To use PL/Java: +To use PL/Java, you must: 1. Enable the language for each database. -1. Install user-created JAR files on all HAWQ hosts. -1. Add the names of the JAR files to the HAWQ `pljava_classpath` server configuration parameter. This parameter value should identify a list of the installed JAR files. +1. Install custom JAR files on all HAWQ hosts. +1. Add the names of the JAR files to the HAWQ `pljava_classpath` server configuration parameter. This parameter value should identify a list of all installed JAR files. #### <a id="enablepljava"></a>Enable PL/Java and Install JAR Files Perform the following steps as the `gpadmin` user: -1. Enable PL/Java by running the `$GPHOME/share/postgresql/pljava/install.sql` SQL script in the databases that will use PL/Java. The `install.sql` script registers both the trusted and untrusted PL/Java languages. For example, the following command enables PL/Java on a database named `testdb`: +1. Enable PL/Java by running the `$GPHOME/share/postgresql/pljava/install.sql` SQL script on the databases that will use PL/Java. The `install.sql` script registers both the trusted (`java`) and untrusted PL/Java (`javau`) languages. For example, the following command enables the PL/Java languages on a database named `testdb`: ``` shell - $ psql -d testdb -f $GPHOME/share/postgresql/pljava/install.sql + gpadmin@master$ psql -d testdb -f $GPHOME/share/postgresql/pljava/install.sql ``` To enable the PL/Java extension in all new HAWQ databases, run the script on the `template1` database: ``` shell - $ psql -d template1 -f $GPHOME/share/postgresql/pljava/install.sql + gpadmin@master$ psql -d template1 -f $GPHOME/share/postgresql/pljava/install.sql ``` - Use this option *only* if you are certain you want to enable PL/Java in all new databases. + Use this option *only* if you are certain you want to enable both the trusted and untrusted versions of PL/Java in all newly-created databases. This operation takes effect on all databases created after the command is invoked. -2. Copy your Java archives (JAR files) to `$GPHOME/lib/postgresql/java/` on all HAWQ hosts. This example uses the `hawq scp` utility to copy the `myclasses.jar` file located in the current directory: +2. Copy all custom Java JAR files to `$GPHOME/lib/postgresql/java/` on all HAWQ nodes. This example uses the `hawq scp` utility to copy the `myclasses.jar` file located in the current directory to the list of hosts identified in the `hawq_hosts` file, also located in the current directory: ``` shell - $ hawq scp -f hawq_hosts myclasses.jar =:$GPHOME/lib/postgresql/java/ + gpadmin@master$ hawq scp -f hawq_hosts myclasses.jar =:/usr/local/hawq/lib/postgresql/java/ ``` - The `hawq_hosts` file contains a list of the HAWQ hosts. -3. Add the JAR files to the `pljava_classpath` configuration parameter. Refer to [Setting PL/Java Configuration Parameters](#setting_serverconfigparams) for the specific procedure. +3. Add all JAR files to the `pljava_classpath` configuration parameter. Refer to [Setting PL/Java Configuration Parameters](#setting_serverconfigparams) for the specific procedure. -5. (Optional) Your HAWQ installation includes an `examples.sql` file. This script contains sample PL/Java functions that you can use for testing. Run the commands in this file to create and run test functions that use the Java classes in `examples.jar`: +5. (Optional) Your HAWQ installation includes an `examples.sql` file. This script includes sample PL/Java functions that you can use to test PL/Java installation. Run the commands in this file to create and execute PL/Java user-defined functions that use the Java classes implemented in `examples.jar`. The database on which you run the script must have PL/Java enabled: --- End diff -- "test PL/Java" -> "test the PL/Java" > PL/Java doc page - validate and enhance info > -------------------------------------------- > > Key: HAWQ-1252 > URL: https://issues.apache.org/jira/browse/HAWQ-1252 > Project: Apache HAWQ > Issue Type: Improvement > Components: Documentation > Reporter: Lisa Owen > Assignee: David Yozie > Priority: Minor > > PL/Java docs section: > - review and validate current content > - make sure examples run and are relevant > - develop new examples as appropriate -- This message was sent by Atlassian JIRA (v6.3.4#6332)