epugh commented on a change in pull request #2215:
URL: https://github.com/apache/lucene-solr/pull/2215#discussion_r559696904



##########
File path: solr/solr-ref-guide/src/scripting-update-processor.adoc
##########
@@ -0,0 +1,295 @@
+= Scripting Update Processor
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+The {solr-javadocs}/contrib/scripting/org/apache/solr/scripting/update/ScriptUpdateProcessorFactory.html[ScriptUpdateProcessor] allows Java scripting engines to be used
+during Solr document update processing, allowing dramatic flexibility in
+expressing custom document processing logic before documents are indexed.  It has hooks to the
+commit, delete, rollback, and other indexing actions; however, add is the most common usage.
+It is implemented as an UpdateProcessor to be placed in an UpdateChain.
+
+TIP: This used to be known as the _StatelessScriptingUpdateProcessor_ and was renamed to make clear that the key aspect of this update processor is that it enables scripting.
+
+The script can be written in any scripting language supported by your JVM (such
+as JavaScript), and executed dynamically so no pre-compilation is necessary.
+
+WARNING: Being able to run a script of your choice as part of the indexing pipeline is a really powerful tool that I sometimes call the
+_Get out of jail free_ card, because you can solve some problems this way that you can't in any other way.  However, you are introducing some
+potential security vulnerabilities.
+
+== Installing the ScriptingUpdateProcessor and Scripting Engines
+
+The scripting update processor lives in the contrib module 
`/contrib/scripting`, and you need to explicitly add it to your Solr setup.
+
+Java 8 through 14 ship with a JavaScript engine called Nashorn (deprecated since Java 11); from Java 15 onward you must add your own JavaScript engine.  Other supported scripting engines, such as
+JRuby, Jython, and Groovy, all require you to add JAR files.
+
+
+You can either add the `dist/solr-scripting-*.jar` file into Solr’s resource 
loader in a core `lib/` directory, or via `<lib>` directives in 
`solrconfig.xml`:
+
+[source,xml]
+----
+<lib dir="${solr.install.dir:../../..}/dist/" regex="solr-scripting-\d.*\.jar" 
/>
+----
+
+Likewise, you will need to add some JAR files depending on which scripting engines you choose.
+
+
+== Configuration
+
+[source,xml]
+----
+<updateRequestProcessorChain name="script">
+   <processor 
class="org.apache.solr.scripting.update.ScriptUpdateProcessorFactory">
+     <str name="script">update-script.js</str>
+   </processor>
+   <!--  optional parameters passed to script
+     <lst name="params">
+       <str name="config_param">example config parameter</str>
+     </lst>
+   -->
+   <processor class="solr.LogUpdateProcessorFactory" />
+   <processor class="solr.RunUpdateProcessorFactory" />
+ </updateRequestProcessorChain>
+----
+
+NOTE: The processor supports the defaults/appends/invariants concept for its 
config.
+However, it is also possible to skip this level and configure the parameters 
directly underneath the `<processor>` tag.
+
+Below is a list of the configuration parameters and their meanings:
+
+`script`::
+The script file name. The script file must be placed in the `conf/` directory.
+There can be one or more "script" parameters specified; multiple scripts are executed in the order specified.
+
+`engine`::
+Optionally specifies the scripting engine to use. This is only needed if the extension
+of the script file is not a standard mapping to the scripting engine. For example, if your
+script file is written in JavaScript but is named `update-script.foo`,
+use "javascript" as the engine name.
+
+`params`::
+Optional parameters that are passed into the script execution context. This is
+specified as a named list (`<lst>`) structure with nested typed parameters. If
+specified, the script context will get a "params" object, otherwise there will 
be no "params" object available.
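
As a rough sketch of how a script might cope with the optional "params" object described above, the following guards against it being undefined before reading `config_param`. The `config_param_s` field name and the stub objects at the bottom are hypothetical and exist only so the snippet runs outside Solr; inside Solr, `params` and `cmd` come from the processor.

```javascript
// Sketch only: read an optional configuration parameter defensively.
function processAdd(cmd) {
  var doc = cmd.solrDoc;  // org.apache.solr.common.SolrInputDocument inside Solr
  // "params" only exists if the processor is configured with <lst name="params">
  var value = (typeof params !== "undefined") ? params.get("config_param") : null;
  if (value != null) {
    doc.setField("config_param_s", value);  // hypothetical field name
  }
}

// Stand-ins for the objects Solr would normally provide (assumptions for this sketch).
var stored = {};
var params = {
  get: function (name) {
    return name === "config_param" ? "example config parameter" : null;
  }
};
processAdd({ solrDoc: { setField: function (name, value) { stored[name] = value; } } });
```

With the stubs removed, the same `processAdd()` body should behave the same whether or not `<lst name="params">` is configured, since the `typeof` check avoids a reference error when `params` is absent.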
+
+
+== Script execution context
+
+Every script has some variables provided to it.
+
+`logger`::
+Logger (org.slf4j.Logger) instance. This is useful for logging information 
from the script.
+
+`req`::
{solr-javadocs}/core/org/apache/solr/request/SolrQueryRequest.html[SolrQueryRequest]
 instance.
+
+`rsp`::
+{solr-javadocs}/core/org/apache/solr/response/SolrQueryResponse.html[SolrQueryResponse]
 instance.
+
+`params`::
+The "params" object from the configuration, if one is specified.
+
+== Examples
+
+The `processAdd()` and the other script methods can return false to skip 
further
+processing of the document. All methods must be defined, though generally the
+`processAdd()` method is where the action is.
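
To illustrate the skip behavior described above, here is a minimal sketch in which `processAdd()` returns false for documents that lack a `description` field. The rule itself and the stub `logger` and `fakeCmd` helpers are hypothetical, added only so the snippet can run outside Solr; inside Solr, `logger` and `cmd` are supplied by the processor.

```javascript
// Sketch only: skip documents that have no "description" value.
function processAdd(cmd) {
  var doc = cmd.solrDoc;  // org.apache.solr.common.SolrInputDocument inside Solr
  var id = doc.getFieldValue("id");
  if (doc.getFieldValue("description") == null) {
    logger.info("update-script#processAdd: skipping id=" + id);
    return false;  // skip further processing of this document
  }
  logger.info("update-script#processAdd: id=" + id);
  // no explicit return: processing continues as in the stock example
}

// Stand-ins for the objects Solr would normally provide (assumptions for this sketch).
var messages = [];
var logger = { info: function (msg) { messages.push(msg); } };
function fakeCmd(fields) {
  return { solrDoc: { getFieldValue: function (name) { return fields[name]; } } };
}

var kept = processAdd(fakeCmd({ id: "1", description: "foo" }));
var skipped = processAdd(fakeCmd({ id: "2" }));
```

The remaining methods (`processDelete()`, `processCommit()`, and so on) would still need to be defined in the same script file, typically as no-ops.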
+
+Here's a URL that works with the techproducts example setup and demonstrates how to specify
+the "script" update chain: `http://localhost:8983/solr/techproducts/update?commit=true&stream.contentType=text/csv&fieldnames=id,description&stream.body=1,foo&update.chain=script`
+which logs the following:
+
+[source,text]
+----
+INFO: update-script#processAdd: id=1
+----
+
+You can see the message recorded in the Solr logging UI.
+
+=== JavaScript
+
+NOTE: There is a JavaScript example `update-script.js` as part of the `techproducts` configset.
+Check `solrconfig.xml` and uncomment the update request processor definition to enable this feature.
+
+[source,javascript]
+----
+function processAdd(cmd) {
+
+  doc = cmd.solrDoc;  // org.apache.solr.common.SolrInputDocument
+  id = doc.getFieldValue("id");
+  logger.info("update-script#processAdd: id=" + id);
+
+// Set a field value:
+//  doc.setField("foo_s", "whatever");
+
+// Get a configuration parameter:
+//  config_param = params.get('config_param');  // "params" only exists if 
processor configured with <lst name="params">
+
+// Get a request parameter:
+// some_param = req.getParams().get("some_param")
+
+// Add a field of field names that match a pattern:
+//   - Potentially useful to determine the fields/attributes represented in a 
result set, via faceting on field_name_ss
+//  field_names = doc.getFieldNames().toArray();
+//  for(i=0; i < field_names.length; i++) {
+//    field_name = field_names[i];
+//    if (/attr_.*/.test(field_name)) { doc.addField("attribute_ss", 
field_names[i]); }
+//  }
+
+}
+
+function processDelete(cmd) {
+  // no-op
+}
+
+function processMergeIndexes(cmd) {
+  // no-op
+}
+
+function processCommit(cmd) {
+  // no-op
+}
+
+function processRollback(cmd) {
+  // no-op
+}
+
+function finish() {
+  // no-op
+}
+----
+
+=== JRuby
+
+To use JRuby as the scripting engine, add `jruby.jar` to Solr's resource 
loader.

Review comment:
       Reworked it, and provided a link at the top to point to the resource
loader ref guide page.



