Github user dyozie commented on a diff in the pull request:
https://github.com/apache/incubator-hawq-docs/pull/83#discussion_r95659535
--- Diff: markdown/plext/using_pljava.html.md.erb ---
@@ -2,205 +2,222 @@
title: Using PL/Java
---
-This section contains an overview of the HAWQ PL/Java language.
+This section provides an overview of the HAWQ PL/Java procedural language
extension.
## <a id="aboutpljava"></a>About PL/Java
-With the HAWQ PL/Java extension, you can write Java methods using your
favorite Java IDE and install the JAR files that implement the methods in your
HAWQ cluster.
+With the PL/Java extension, you can write Java methods using your favorite
Java IDE and invoke the methods from PostgreSQL user-defined functions (UDFs).
-**Note**: If building HAWQ from source, you must specify PL/Java as a
build option when compiling HAWQ. To use PL/Java in a HAWQ deployment, you must
explicitly enable the PL/Java extension in all desired databases.
+The HAWQ PL/Java package is based on the open source PL/Java 1.4.0 and
provides the following features:
-The HAWQ PL/Java package is based on the open source PL/Java 1.4.0. HAWQ
PL/Java provides the following features.
-
-- Ability to execute PL/Java functions with Java 1.6 or 1.7.
-- Standardized utilities (modeled after the SQL 2003 proposal) to install
and maintain Java code in the database.
-- Standardized mappings of parameters and result. Complex types as well as
sets are supported.
-- An embedded, high performance, JDBC driver utilizing the internal HAWQ
Database SPI routines.
+- PL/Java function execution with Java 1.6 or 1.7.
+- Standardized mappings of Java and PostgreSQL parameters and results.
Complex types as well as sets are supported.
+- Two HAWQ Java languages:
+ - `java` - Tusted PL/Java language
+ - `javau` - Untrusted PL/Java language (superusers only)
+- An embedded, high performance, JDBC driver utilizing the internal HAWQ
Database Server Programming Interface (SPI).
- Metadata support for the JDBC driver. Both `DatabaseMetaData` and
`ResultSetMetaData` are included.
-- The ability to return a `ResultSet` from a query as an alternative to
building a ResultSet row by row.
+- A standard way of passing parameters and return values. Complex types
and sets are passed using the standard JDBC `ResultSet` class.
+- The ability to return a `ResultSet` from a query as an alternative to
building a `ResultSet` row by row.
- Full support for savepoints and exception handling.
-- The ability to use IN, INOUT, and OUT parameters.
-- Two separate HAWQ languages:
- - pljava, TRUSTED PL/Java language
- - pljavau, UNTRUSTED PL/Java language
- Transaction and Savepoint listeners enabling code execution when a
transaction or savepoint is committed or rolled back.
- Integration with GNU GCJ on selected platforms.
-A function in SQL will appoint a static method in a Java class. In order
for the function to execute, the appointed class must available on the class
path specified by the HAWQ server configuration parameter `pljava_classpath`.
The PL/Java extension adds a set of functions that helps to install and
maintain the Java classes. Classes are stored in normal Java archives, JAR
files. A JAR file can optionally contain a deployment descriptor that in turn
contains SQL commands to be executed when the JAR is deployed or undeployed.
The functions are modeled after the standards proposed for SQL 2003.
-
-PL/Java implements a standard way of passing parameters and return values.
Complex types and sets are passed using the standard JDBC ResultSet class.
-
-A JDBC driver is included in PL/Java. This driver calls HAWQ internal SPI
routines. The driver is essential since it is common for functions to make
calls back to the database to fetch data. When PL/Java functions fetch data,
they must use the same transactional boundaries that are used by the main
function that entered PL/Java execution context.
+PL/Java is optimized for performance. The Java virtual machine executes
within the same process as the backend, minimizing call overhead. PL/Java
brings the power of Java to the database itself, enabling the
database-intensive business logic to execute as close to the actual data as
possible.
-PL/Java is optimized for performance. The Java virtual machine executes
within the same process as the backend to minimize call overhead. PL/Java is
designed with the objective to enable the power of Java to the database itself
so that database intensive business logic can execute as close to the actual
data as possible.
-
-The standard Java Native Interface (JNI) is used when bridging calls
between the backend and the Java VM.
+Calls between the HAWQ backend and the Java VM are bridged using the
standard Java Native Interface (JNI).
## <a id="abouthawqpljava"></a>About HAWQ PL/Java
+PL/Java is embedded in your HAWQ product distribution or within your HAWQ
build if you chose to enable it as a build option. To use PL/Java in a HAWQ
deployment, you must explicitly enable the PL/Java extension in all desired
databases.
+
There are a few key differences between the implementation of PL/Java in
standard PostgreSQL and HAWQ.
-### <a id="pljavafunctions"></a>Functions
+### <a id="pljavafunctions"></a>Functions Not Supported
-The following functions are not supported in HAWQ. The classpath is
handled differently in a distributed HAWQ environment than in the PostgreSQL
environment.
+The following functions are not supported in HAWQ. The PL/Java classpath
is handled differently in a distributed HAWQ environment than that of the
PostgreSQL environment.
-- sqlj.install_jar
-- sqlj.install_jar
-- sqlj.replace_jar
-- sqlj.remove_jar
-- sqlj.get_classpath
-- sqlj.set_classpath
+- sqlj.install_jar()
+- sqlj.install_jar()
+- sqlj.replace_jar()
+- sqlj.remove_jar()
+- sqlj.get_classpath()
+- sqlj.set_classpath()
-HAWQ uses the `pljava_classpath` server configuration parameter in place
of the `sqlj.set_classpath` function.
+HAWQ uses the `pljava_classpath` server configuration parameter in place
of the `sqlj.set_classpath()` function.
### <a id="serverconfigparams"></a>Server Configuration Parameters
-PL/Java uses server configuration parameters to configure classpath, Java
VM, and other options. Refer to the [Server Configuration Parameter
Reference](../reference/HAWQSiteConfig.html) for general information about HAWQ
server configuration parameters.
+PL/Java uses server configuration parameters to configure the classpath,
Java Virtual Machine (VM), and other options. Refer to the [Server
Configuration Parameter Reference](../reference/HAWQSiteConfig.html) for
general information about HAWQ server configuration parameters.
-The following server configuration parameters are used by PL/Java in HAWQ.
These parameters replace the `pljava.*` parameters that are used in the
standard PostgreSQL PL/Java implementation.
+The following server configuration parameters are used by HAWQ PL/Java.
These parameters replace the `pljava.*` configuration parameters that are used
in the standard PostgreSQL PL/Java implementation.
#### pljava\_classpath
-A colon (:) separated list of the jar files containing the Java classes
used in any PL/Java functions. The jar files must be installed in the same
locations on all HAWQ hosts. With the trusted PL/Java language handler, jar
file paths must be relative to the `$GPHOME/lib/postgresql/java/` directory.
With the untrusted language handler (javaU language tag), paths may be relative
to `$GPHOME/lib/postgresql/java/` or absolute.
+A PL/Java user-defined function will appoint a static method in a Java
class. In order for the function to execute, the appointed class must be
available on the class path specified by the
[`pljava_classpath`](../reference/guc/parameter_definitions.html#pljava_classpath)
HAWQ server configuration parameter.
-#### pljava\_statement\_cache\_size
-
-Sets the size in KB of the Most Recently Used (MRU) cache for prepared
statements.
+`pljava_classpath` is a colon (:) separated list of the JAR files
implementing the Java classes referenced in any PL/Java user-defined functions.
The JAR files must be installed in the same locations on all HAWQ hosts. When
using the trusted PL/Java language handler (`java` language tag), JAR file
paths must be relative to the `$GPHOME/lib/postgresql/java/` directory.
Relative or absolute JAR file paths may be used with the untrusted PL/Java
language handler (`javau` language tag).
#### pljava\_release\_lingering\_savepoints
-If TRUE, lingering savepoints will be released on function exit. If FALSE,
they will be rolled back.
+[`pljava_release_lingering_savepoints`](../reference/guc/parameter_definitions.html#pljava_release_lingering_savepoints)
identifies the action to perform when lingering savepoints exist on exit of a
PL/Java user-defined function. If TRUE, lingering savepoints will be released
on function exit. If FALSE, lingering savepoints will be rolled back.
+
+#### pljava\_statement\_cache\_size
+
+[`pljava_statement_cache_size`](../reference/guc/parameter_definitions.html#pljava_statement_cache_size)
sets the size (in KB) of the Most Recently Used (MRU) cache for prepared
statements.
#### pljava\_vmoptions
-Defines the start up options for the Java VM.
+[`pljava_vmoptions`](../reference/guc/parameter_definitions.html#pljava_vmoptions)
defines the start-up options for the Java VM.
### <a id="setting_serverconfigparams"></a>Setting PL/Java Configuration
Parameters
-You can set PL/Java server configuration parameters at the session level,
or globally across your whole cluster. Your HAWQ cluster configuration must be
reloaded after setting a server configuration value globally.
+You can set PL/Java server configuration parameters at the session level,
or globally across your whole cluster.
#### <a id="setsrvrcfg_global"></a>Cluster Level
You will perform different procedures to set a PL/Java server
configuration parameter for your whole HAWQ cluster depending upon whether you
manage your cluster from the command line or use Ambari. If you use Ambari to
manage your HAWQ cluster, you must ensure that you update server configuration
parameters only via the Ambari Web UI. If you manage your HAWQ cluster from the
command line, you will use the `hawq config` command line utility to set
PL/Java server configuration parameters.
+Your HAWQ cluster configuration must be reloaded after setting a server
configuration value globally.
+
The following examples add a JAR file named `myclasses.jar` to the
`pljava_classpath` server configuration parameter for the entire HAWQ cluster.
If you use Ambari to manage your HAWQ cluster:
-1. Set the `pljava_classpath` configuration property to include
`myclasses.jar` via the HAWQ service **Configs > Advanced > Custom hawq-site**
drop down.
+1. Set the `pljava_classpath` configuration property **Key** **Value** to
include `myclasses.jar` via the HAWQ service **Configs > Advanced > Custom
hawq-site** drop down.
2. Select **Service Actions > Restart All** to load the updated
configuration.
If you manage your HAWQ cluster from the command line:
-1. Log in to the HAWQ master host as a HAWQ administrator and source the
file `/usr/local/hawq/greenplum_path.sh`.
+1. Log in to the HAWQ master node as a HAWQ administrator and set up your
HAWQ environment.
``` shell
- $ source /usr/local/hawq/greenplum_path.sh
+ $ ssh gpadmin@master
+ gpadmin@master$ source /usr/local/hawq/greenplum_path.sh
```
1. Use the `hawq config` utility to set `pljava_classpath`:
``` shell
- $ hawq config -c pljava_classpath -v \'myclasses.jar\'
+ gpadmin@master$ hawq config -c pljava_classpath -v 'myclasses.jar'
```
+
+ The `-c` option identifies the name of the server configuration
parameter, `-v` identifies the new value.
+
2. Reload the HAWQ configuration:
``` shell
- $ hawq stop cluster -u
+ gpadmin@master$ hawq stop cluster -u
```
#### <a id="setsrvrcfg_session"></a>Session Level
-To set a PL/Java server configuration parameter for only the *current*
database session, set the parameter within the `psql` subsystem. For example,
to set `pljava_classpath`:
+To set a PL/Java server configuration parameter for only the *current*
database session, set the parameter within the `psql` subsystem. For example:
``` sql
=> SET pljava_classpath='myclasses.jar';
```
+## <a id="security"></a>Security
-## <a id="enablepljava"></a>Enabling and Removing PL/Java Support
+This section describes specific security considerations when using PL/Java.
+
+### <a id="installation"></a>Enable/Disable
+
+Only a database superuser can enable/disable PL/Java. The PL/Java utility
functions are created using `SECURITY DEFINER` privileges; they execute with
the access permissions that were granted to the creator of the functions.
+
+### <a id="trustedlang"></a>Trusted/Untrusted Language
-The PL/Java extension must be explicitly enabled on each database in which
it will be used.
+PL/Java includes trusted and untrusted language variants.
+The PL/Java trusted language is named `java`. The trusted PL/Java language
has no access to the file system as stipulated by PostgreSQL definition of a
trusted language. Any database user can create and execute functions in the
trusted `java` language.
+
+The PL/Java untrusted language is named `javau`. `javau` offers no way of
restricting what you can program in UDFs created with the language. Creating
untrusted PL/Java UDFs is permitted only by database superusers. Executing
untrusted PL/Java UDFs is permitted only by database superusers and other
database users explicitly `GRANT`ed the permissions.
+## <a id="enablepljava"></a>Enabling and Removing PL/Java Support
+
+The PL/Java extension must be explicitly enabled on each database in which
it will be used. To remove PL/Java support, the extension must be explicitly
removed from each database in which it was used.
### <a id="pljavaprereq"></a>Prerequisites
Before you enable PL/Java:
1. Ensure that you have installed a supported Java runtime environment and
that the `$JAVA_HOME` variable is set to the same path on the master and all
segment nodes.
-2. Perform the following step on all machines to set up `ldconfig` for the
installed JDK:
+2. Perform the following step on all nodes in your HAWQ cluster to set up
the shared library cache for the installed JDK:
``` shell
- $ echo "$JAVA_HOME/jre/lib/amd64/server" > /etc/ld.so.conf.d/libjdk.conf
- $ ldconfig
+ root@hawq-node$ echo "$JAVA_HOME/jre/lib/amd64/server" >
/etc/ld.so.conf.d/libjdk.conf
+ root@hawq-node$ ldconfig
```
4. Make sure that your HAWQ cluster is running, you have sourced
`greenplum_path.sh` and that your `$GPHOME` environment variable is set.
+ ``` shell
+ $ ssh gpadmin@master
+ gpadmin@master$ source /usr/local/hawq/greenplum_path.sh
+ gpadmin@master$ hawq state
+ ```
### <a id="enablepljava"></a>Enable PL/Java and Install JAR Files
-To use PL/Java:
+To use PL/Java, you must:
1. Enable the language for each database.
-1. Install user-created JAR files on all HAWQ hosts.
-1. Add the names of the JAR files to the HAWQ `pljava_classpath` server
configuration parameter. This parameter value should identify a list of the
installed JAR files.
+1. Install custom JAR files on all HAWQ hosts.
+1. Add the names of the JAR files to the HAWQ `pljava_classpath` server
configuration parameter. This parameter value should identify a list of all
installed JAR files.
#### <a id="enablepljava"></a>Enable PL/Java and Install JAR Files
Perform the following steps as the `gpadmin` user:
-1. Enable PL/Java by running the
`$GPHOME/share/postgresql/pljava/install.sql` SQL script in the databases that
will use PL/Java. The `install.sql` script registers both the trusted and
untrusted PL/Java languages. For example, the following command enables PL/Java
on a database named `testdb`:
+1. Enable PL/Java by running the
`$GPHOME/share/postgresql/pljava/install.sql` SQL script on the databases that
will use PL/Java. The `install.sql` script registers both the trusted (`java`)
and untrusted PL/Java (`javau`) languages. For example, the following command
enables the PL/Java languages on a database named `testdb`:
``` shell
- $ psql -d testdb -f $GPHOME/share/postgresql/pljava/install.sql
+ gpadmin@master$ psql -d testdb -f
$GPHOME/share/postgresql/pljava/install.sql
```
To enable the PL/Java extension in all new HAWQ databases, run the
script on the `template1` database:
``` shell
- $ psql -d template1 -f $GPHOME/share/postgresql/pljava/install.sql
+ gpadmin@master$ psql -d template1 -f
$GPHOME/share/postgresql/pljava/install.sql
```
- Use this option *only* if you are certain you want to enable PL/Java
in all new databases.
+ Use this option *only* if you are certain you want to enable both the
trusted and untrusted versions of PL/Java in all newly-created databases. This
operation takes effect on all databases created after the command is invoked.
-2. Copy your Java archives (JAR files) to `$GPHOME/lib/postgresql/java/`
on all HAWQ hosts. This example uses the `hawq scp` utility to copy the
`myclasses.jar` file located in the current directory:
+2. Copy all custom Java JAR files to `$GPHOME/lib/postgresql/java/` on all
HAWQ nodes. This example uses the `hawq scp` utility to copy the
`myclasses.jar` file located in the current directory to the list of hosts
identified in the `hawq_hosts` file, also located in the current directory:
``` shell
- $ hawq scp -f hawq_hosts myclasses.jar =:$GPHOME/lib/postgresql/java/
+ gpadmin@master$ hawq scp -f hawq_hosts myclasses.jar
=:/usr/local/hawq/lib/postgresql/java/
```
- The `hawq_hosts` file contains a list of the HAWQ hosts.
-3. Add the JAR files to the `pljava_classpath` configuration parameter.
Refer to [Setting PL/Java Configuration
Parameters](#setting_serverconfigparams) for the specific procedure.
+3. Add all JAR files to the `pljava_classpath` configuration parameter.
Refer to [Setting PL/Java Configuration
Parameters](#setting_serverconfigparams) for the specific procedure.
-5. (Optional) Your HAWQ installation includes an `examples.sql` file.
This script contains sample PL/Java functions that you can use for testing. Run
the commands in this file to create and run test functions that use the Java
classes in `examples.jar`:
+5. (Optional) Your HAWQ installation includes an `examples.sql` file. This
script includes sample PL/Java functions that you can use to test PL/Java
installation. Run the commands in this file to create and execute PL/Java
user-defined functions that use the Java classes implemented in `examples.jar`.
The database on which you run the script must have PL/Java enabled:
``` shell
- $ psql -f $GPHOME/share/postgresql/pljava/examples.sql
+ gpadmin@master$ psql -d <dbname> -f
$GPHOME/share/postgresql/pljava/examples.sql
```
#### Configuring PL/Java VM Options
-PL/Java JVM options can be configured via the `pljava_vmoptions` server
configuration parameter. For example, `pljava_vmoptions=-Xmx512M` sets the
maximum heap size of the JVM. The default `-Xmx` value is `64M`.
+Configure PL/Java JVM options via the `pljava_vmoptions` server
configuration parameter. For example, `pljava_vmoptions=-Xmx512M` sets the
maximum heap size of the JVM. (The default `-Xmx` value is `64M`.)
Refer to [Setting PL/Java Configuration
Parameters](#setting_serverconfigparams) for the specific procedure to set
PL/Java server configuration parameters.
-
### <a id="uninstallpljava"></a>Disable PL/Java
To disable PL/Java, you should:
1. Remove PL/Java support from each database in which it was added.
-2. Uninstall the Java JAR files.
+2. Uninstall/remove the Java JAR files.
#### <a id="uninstallpljavasupport"></a>Remove PL/Java Support from
Databases
-For a database that no longer requires the PL/Java language, remove
support for PL/Java by running the `uninstall.sql` script as the `gpadmin`
user. For example, the following command disables the PL/Java language in the
specified database:
+For a database that no longer requires the PL/Java language, remove
support for PL/Java by running the `uninstall.sql` script as the `gpadmin`
user. For example, the following command disables the PL/Java languages in the
specified database:
--- End diff --
Edit: Seems ok to remove "For a database that no longer requires the
PL/Java language, " here.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---