This is an automated email from the ASF dual-hosted git repository.
epugh pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/solr.git
The following commit(s) were added to refs/heads/main by this push:
new f14e0d0f4ff Fix typos and improve grammar in Query Guide (#2024)
f14e0d0f4ff is described below
commit f14e0d0f4ffe9eab4dd3822cdbe826e98fea078f
Author: Andrey Bozhko <[email protected]>
AuthorDate: Tue Dec 12 15:19:02 2023 -0600
Fix typos and improve grammar in Query Guide (#2024)
---------
Co-authored-by: Andrey Bozhko <[email protected]>
---
.../query-guide/pages/block-join-query-parser.adoc | 6 +++---
.../pages/collapse-and-expand-results.adoc | 4 ++--
.../query-guide/pages/common-query-parameters.adoc | 12 +++++------
.../query-guide/pages/computational-geometry.adoc | 8 ++++----
.../query-guide/pages/dense-vector-search.adoc | 4 ++--
.../query-guide/pages/dismax-query-parser.adoc | 6 +++---
.../query-guide/pages/document-transformers.adoc | 10 ++++-----
.../modules/query-guide/pages/dsp.adoc | 8 ++++----
.../query-guide/pages/exporting-result-sets.adoc | 2 +-
.../modules/query-guide/pages/faceting.adoc | 6 +++---
.../modules/query-guide/pages/graph-traversal.adoc | 8 ++++----
.../modules/query-guide/pages/graph.adoc | 4 ++--
.../modules/query-guide/pages/highlighting.adoc | 6 +++---
.../modules/query-guide/pages/jdbc-zeppelin.adoc | 2 +-
.../query-guide/pages/join-query-parser.adoc | 6 +++---
.../modules/query-guide/pages/json-facet-api.adoc | 12 +++++------
.../modules/query-guide/pages/json-query-dsl.adoc | 2 +-
.../query-guide/pages/json-request-api.adoc | 2 +-
.../modules/query-guide/pages/loading.adoc | 4 ++--
.../modules/query-guide/pages/logs.adoc | 14 ++++++-------
.../query-guide/pages/machine-learning.adoc | 24 +++++++++++-----------
.../modules/query-guide/pages/math-start.adoc | 2 +-
.../query-guide/pages/numerical-analysis.adoc | 4 ++--
.../modules/query-guide/pages/other-parsers.adoc | 24 +++++++++++-----------
.../query-guide/pages/pagination-of-results.adoc | 14 ++++++-------
.../pages/probability-distributions.adoc | 4 ++--
.../query-guide/pages/query-re-ranking.adoc | 6 +++---
.../modules/query-guide/pages/regression.adoc | 2 +-
.../query-guide/pages/response-writers.adoc | 6 +++---
.../modules/query-guide/pages/search-sample.adoc | 10 ++++-----
.../pages/searching-nested-documents.adoc | 4 ++--
.../modules/query-guide/pages/simulations.adoc | 4 ++--
.../modules/query-guide/pages/spatial-search.adoc | 6 +++---
.../modules/query-guide/pages/spell-checking.adoc | 2 +-
.../modules/query-guide/pages/sql-query.adoc | 16 +++++++--------
.../query-guide/pages/standard-query-parser.adoc | 2 +-
.../modules/query-guide/pages/statistics.adoc | 10 ++++-----
.../modules/query-guide/pages/stats-component.adoc | 2 +-
.../pages/stream-decorator-reference.adoc | 22 ++++++++++----------
.../pages/stream-evaluator-reference.adoc | 16 +++++++--------
.../query-guide/pages/stream-source-reference.adoc | 14 ++++++-------
.../query-guide/pages/streaming-expressions.adoc | 4 ++--
.../modules/query-guide/pages/suggester.adoc | 2 +-
.../modules/query-guide/pages/tagger-handler.adoc | 4 ++--
.../query-guide/pages/term-vector-component.adoc | 2 +-
.../modules/query-guide/pages/transform.adoc | 4 ++--
.../modules/query-guide/pages/variables.adoc | 2 +-
47 files changed, 169 insertions(+), 169 deletions(-)
diff --git a/solr/solr-ref-guide/modules/query-guide/pages/block-join-query-parser.adoc b/solr/solr-ref-guide/modules/query-guide/pages/block-join-query-parser.adoc
index 95ad96d0a53..2eed9721639 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/block-join-query-parser.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/block-join-query-parser.adoc
@@ -178,13 +178,13 @@ When subordinate clause (`<someChildren>`) is omitted, it's parsed as a _segment
[#block-mask]
== Block Masks: The `of` and `which` local params
-The purpose of the "Block Mask" query specified as either an `of` or `which`
param (depending on the parser used) is to identy the set of all documents in
the index which should be treated as "parents" _(or their ancestors)_ and which
documents should be treated as "children".
+The purpose of the "Block Mask" query specified as either an `of` or `which`
param (depending on the parser used) is to identify the set of all documents in
the index which should be treated as "parents" _(or their ancestors)_ and which
documents should be treated as "children".
This is important because in the "on disk" index, the relationships are
flattened into "blocks" of documents, so the `of` / `which` params are needed
to serve as a "mask" against the flat document blocks to identify the
boundaries of every hierarchical relationship.
In the example queries above, we were able to use a very simple Block Mask of
`doc_type:parent` because our data is very simple: every document is either a
`parent` or a `child`.
So this query string easily distinguishes _all_ of our documents.
-A common mistake is to try and use a `which` parameter that is more
restrictive then the set of all parent documents, in order to filter the
parents that are matched, as in this bad example:
+A common mistake is to try and use a `which` parameter that is more
restrictive than the set of all parent documents, in order to filter the
parents that are matched, as in this bad example:
----
// BAD! DO NOT USE!
@@ -210,4 +210,4 @@ A similar problematic situation can arise when mixing parent/child documents wit
...then our simple `doc_type:parent` Block Mask would no longer be adequate.
We would instead need to use `\*:* -doc_type:child` or `doc_type:(simple
parent)` to prevent our "simple" document from mistakenly being treated as a
"child" of an adjacent "parent" document.
-The xref:query-guide:searching-nested-documents.adoc[] section contains more
detailed examples of specifying Block Mask queries with non trivial hierarchies
of documents.
+The xref:query-guide:searching-nested-documents.adoc[] section contains more
detailed examples of specifying Block Mask queries with nontrivial hierarchies
of documents.
diff --git a/solr/solr-ref-guide/modules/query-guide/pages/collapse-and-expand-results.adoc b/solr/solr-ref-guide/modules/query-guide/pages/collapse-and-expand-results.adoc
index 2d23b9d2144..cfce616a453 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/collapse-and-expand-results.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/collapse-and-expand-results.adoc
@@ -199,7 +199,7 @@ For example: searching for "grand child" documents and collapsing on a field tha
[CAUTION]
====
-Specifing `hint=block` when collapsing on a field that is not unique per
contiguous block of documents is not supported and may fail in unexpected ways;
including the possibility of silently returning incorrect results.
+Specifying `hint=block` when collapsing on a field that is not unique per
contiguous block of documents is not supported and may fail in unexpected ways;
including the possibility of silently returning incorrect results.
The implementation does not offer any safeguards against misuse on an
unsupported field, since doing so would require the same group level tracking
as the non-Block collapsing implementation -- defeating the purpose of this
optimization.
====
@@ -228,7 +228,7 @@ q=foo&fq={!collapse field=ISBN}&expand=true
[IMPORTANT]
====
-When used with CollapsingQParserPlugin and there are multiple collapse groups,
the field is chosen from the group with least cost.
+When used with CollapsingQParserPlugin and there are multiple collapse groups,
the field is chosen from the group with the least cost.
If there are multiple collapse groups with same cost then the first specified
one is chosen.
====
diff --git a/solr/solr-ref-guide/modules/query-guide/pages/common-query-parameters.adoc b/solr/solr-ref-guide/modules/query-guide/pages/common-query-parameters.adoc
index 29fdf0f17e0..82419ba9754 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/common-query-parameters.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/common-query-parameters.adoc
@@ -18,7 +18,7 @@
Several query parsers share supported query parameters.
-The following sections describe Solr's common query parameters, which are
supported by the
xref:configuration-guide:requesthandlers-searchcomponents#search-handlers[search
request handlers].
+The following sections describe Solr's common query parameters, which are
supported by the
xref:configuration-guide:requesthandlers-searchcomponents.adoc#search-handlers[search
request handlers].
== defType Parameter
@@ -40,7 +40,7 @@ Solr can sort query responses according to:
* Document scores
* xref:function-queries.adoc#sort-by-function[Function results]
-* The value of any primitive field (numerics, string, boolean, dates, etc.)
which has `docValues="true"` (or `multiValued="false"` and `indexed="true"`, in
which case the indexed terms will used to build DocValue like structures on the
fly at runtime)
+* The value of any primitive field (numerics, string, boolean, dates, etc.)
which has `docValues="true"` (or `multiValued="false"` and `indexed="true"`, in
which case the indexed terms will be used to build DocValue like structures on
the fly at runtime)
* A SortableTextField which implicitly uses `docValues="true"` by default to
allow sorting on the original input string regardless of the analyzers used for
Searching.
* A single-valued TextField that uses an analyzer (such as the
KeywordTokenizer) that produces only a single term per document.
TextField does not support `docValues="true"`, but a DocValue-like structure
will be built on the fly at runtime.
@@ -73,7 +73,7 @@ If there is a third entry, it will only be used if the first AND second entries
And so on.
** If documents tie in all of the explicit sort criteria, Solr uses each
document's Lucene document ID as the final tie-breaker.
This internal property is subject to change during segment merges and document
updates, which can lead to unexpected result ordering changes.
-Users looking to avoid this behavior can add an additional sort criteria on a
unique or rarely-shared field such as `id` to prevent ties from occurring
(e.g., `price desc,id asc`).
+Users looking to avoid this behavior can define an additional sort criteria on
a unique or rarely-shared field such as `id` to prevent ties from occurring
(e.g., `price desc,id asc`).
== start Parameter
@@ -122,7 +122,7 @@ When using the `fq` parameter, keep in mind the following:
* The `fq` parameter can be specified multiple times in a query.
Documents will only be included in the result if they are in the intersection
of the document sets resulting from each instance of the parameter.
-In the example below, only documents which have a popularity greater then 10
and have a section of 0 will match.
+In the example below, only documents which have a popularity greater than 10
and have a section of 0 will match.
+
[source,text]
----
@@ -142,7 +142,7 @@ Thus, concerning the previous examples: use a single `fq` containing two mandato
(To learn about tuning cache sizes and making sure a filter cache actually
exists, see xref:configuration-guide:caches-warming.adoc#caches[Caches].)
* It is also possible to use
xref:standard-query-parser.adoc#differences-between-lucenes-classic-query-parser-and-solrs-standard-query-parser[filter(condition)
syntax] inside the `fq` to cache clauses individually and - among other things
- to achieve union of cached filter queries.
-* As with all parameters: special characters in an URL need to be properly
escaped and encoded as hex values.
+* As with all parameters: special characters in a URL need to be properly
escaped and encoded as hex values.
Online tools are available to help you with URL-encoding.
For example: http://meyerweb.com/eric/tools/dencoder/.
@@ -277,7 +277,7 @@ For example:
[source,text]
----
-q=supervillians&debugQuery=on&explainOther=id:juggernaut
+q=supervillains&debugQuery=on&explainOther=id:juggernaut
----
The query above allows you to examine the scoring explain info of the top
matching documents, compare it to the explain info for documents matching
`id:juggernaut`, and determine why the rankings are not as you expect.
diff --git a/solr/solr-ref-guide/modules/query-guide/pages/computational-geometry.adoc b/solr/solr-ref-guide/modules/query-guide/pages/computational-geometry.adoc
index 02006bcc6b6..c40052b2b54 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/computational-geometry.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/computational-geometry.adoc
@@ -37,7 +37,7 @@ An investigation of the border around the rat sightings can be done to better un
==== Scatter Plot
-Before visualizing the convex hull its often useful to visualize the 2D points
as a scatter plot.
+Before visualizing the convex hull it's often useful to visualize the 2D
points as a scatter plot.
In this example the `random` function draws a sample of records from the
NYC311 (complaints database) collection where the complaint description matches
"rat sighting" and the zip code is 11238.
The latitude and longitude fields are then vectorized and plotted as a scatter
plot with longitude on x-axis and latitude on the y-axis.
@@ -59,7 +59,7 @@ The convex hull is set a variable called `hull`.
Once the convex hull has been created the `getVertices` function can be used to
retrieve the matrix of points in the scatter plot that comprise the convex
border around the scatter plot.
The `colAt` function can then be used to retrieve the latitude and longitude
vectors from the matrix
-so they can visualized by the `zplot` function.
+so they can be visualized by the `zplot` function.
In the example below the convex hull points are visualized as a scatter plot.
image::math-expressions/hullplot.png[]
@@ -68,8 +68,8 @@ Notice that the 15 points in the scatter plot describe that latitude and longitu
==== Projecting and Clustering
-The once a convex hull as been calculated the `projectToBorder` can then be
used to project points to the nearest point on the border.
-In the example below the `projectToBorder` function is used to project the
original scatter scatter plot points to the nearest border.
+Once a convex hull has been calculated the `projectToBorder` can then be used
to project points to the nearest point on the border.
+In the example below the `projectToBorder` function is used to project the
original scatter plot points to the nearest border.
The `projectToBorder` function returns a matrix of lat/lon points for the
border projections.
In the example the matrix of border points is then clustered into 7 clusters
using kmeans clustering.
diff --git a/solr/solr-ref-guide/modules/query-guide/pages/dense-vector-search.adoc b/solr/solr-ref-guide/modules/query-guide/pages/dense-vector-search.adoc
index 076b42fe410..24d7859bb39 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/dense-vector-search.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/dense-vector-search.adoc
@@ -100,9 +100,9 @@ this similarity is intended as an optimized way to perform cosine similarity. In
the preferred way to perform cosine similarity is to normalize all vectors to
unit length, and instead use DOT_PRODUCT. You should only use this function if
you need to preserve the original vectors and cannot normalize them in advance.
To use the following advanced parameters that customise the codec format
-and the hyper-parameter of the HNSW algorithm, make sure the
xref:configuration-guide:codec-factory.adoc[Schema Codec Factory], is in use.
+and the hyperparameter of the HNSW algorithm, make sure the
xref:configuration-guide:codec-factory.adoc[Schema Codec Factory], is in use.
-Here's how `DenseVectorField` can be configured with the advanced
hyper-parameters:
+Here's how `DenseVectorField` can be configured with the advanced
hyperparameters:
[source,xml]
<fieldType name="knn_vector" class="solr.DenseVectorField" vectorDimension="4"
similarityFunction="cosine" knnAlgorithm="hnsw" hnswMaxConnections="10"
hnswBeamWidth="40"/>
diff --git a/solr/solr-ref-guide/modules/query-guide/pages/dismax-query-parser.adoc b/solr/solr-ref-guide/modules/query-guide/pages/dismax-query-parser.adoc
index c9d0db263f7..5720c4c4212 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/dismax-query-parser.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/dismax-query-parser.adoc
@@ -92,7 +92,7 @@ The table below explains the various ways that mm values can be specified.
|Percentage |75% |Sets the minimum number of matching clauses to this
percentage of the total number of optional clauses. The number computed from
the percentage is rounded down and used as the minimum.
|Negative percentage |-25% |Indicates that this percent of the total number of
optional clauses can be missing. The number computed from the percentage is
rounded down, before being subtracted from the total to determine the minimum
number.
|An expression beginning with a positive integer followed by a > or < sign and
another value |3<90% |Defines a conditional expression indicating that if the
number of optional clauses is equal to (or less than) the integer, they are all
required, but if it's greater than the integer, the specification applies. In
this example: if there are 1 to 3 clauses they are all required, but for 4 or
more clauses only 90% are required.
-|Multiple conditional expressions involving > or < signs |2\<-25% 9\<-3
|Defines multiple conditions, each one being valid only for numbers greater
than the one before it. In the example at left, if there are 1 or 2 clauses,
then both are required. If there are 3-9 clauses all but 25% are required. If
there are more then 9 clauses, all but three are required.
+|Multiple conditional expressions involving > or < signs |2\<-25% 9\<-3
|Defines multiple conditions, each one being valid only for numbers greater
than the one before it. In the example at left, if there are 1 or 2 clauses,
then both are required. If there are 3-9 clauses all but 25% are required. If
there are more than 9 clauses, all but three are required.
|===
When specifying `mm` values, keep in mind the following:
@@ -163,7 +163,7 @@ bq=category:food^10
bq=category:deli^5
----
-Using the `bq` parameter in this way is functionally equivilent to combining
your `q` and `bq` parameters into a single larger boolean query, where the
(original) `q` parameter is "mandatory" and the other clauses are optional:
+Using the `bq` parameter in this way is functionally equivalent to combining
your `q` and `bq` parameters into a single larger boolean query, where the
(original) `q` parameter is "mandatory" and the other clauses are optional:
[source,text]
----
@@ -249,7 +249,7 @@ Another request handler is registered at "/instock" and has slightly different c
`\http://localhost:8983/solr/techproducts/instock?defType=dismax&q=video&fl=name,score,inStock`
One of the other really cool features in this parser is robust support for
specifying the "BooleanQuery.minimumNumberShouldMatch" you want to be used
based on how many terms are in your user's query.
-These allows flexibility for typos and partial matches.
+This allows flexibility for typos and partial matches.
For the dismax parser, one and two word queries require that all of the
optional clauses match, but for three to five word queries one missing word is
allowed.
`\http://localhost:8983/solr/techproducts/select?defType=dismax&q=belkin+ipod`
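The `mm` rules whose wording the hunks above correct can be made concrete with a small re-implementation. This is an illustrative sketch of the semantics as the table describes them, not Solr's actual parsing code, and it covers only the forms shown in the table:

```python
import math

def min_should_match(spec: str, optional_clauses: int) -> int:
    """Effective minimum number of optional clauses that must match,
    per the mm table semantics (sketch only, not Solr's implementation)."""
    spec = spec.strip()
    # Conditional form, e.g. "2<-25% 9<-3": each "n<value" applies only
    # when there are MORE than n optional clauses.
    if "<" in spec:
        conditions = []
        for part in spec.split():
            threshold, value = part.split("<", 1)
            conditions.append((int(threshold), value))
        applicable = [c for c in conditions if optional_clauses > c[0]]
        if not applicable:
            # No condition applies: every optional clause is required.
            return optional_clauses
        # Use the condition with the largest threshold below the count.
        _, value = max(applicable)
        return min_should_match(value, optional_clauses)
    if spec.endswith("%"):
        pct = int(spec[:-1])
        computed = math.floor(optional_clauses * abs(pct) / 100)
        # Negative percentage: that share of the clauses may be missing.
        return optional_clauses - computed if pct < 0 else computed
    n = int(spec)
    # Negative integer: that many clauses may be missing.
    return optional_clauses + n if n < 0 else n

# "2<-25% 9<-3": 1-2 clauses -> all required; 3-9 -> all but 25%;
# more than 9 -> all but three.
print(min_should_match("2<-25% 9<-3", 2))   # 2
print(min_should_match("2<-25% 9<-3", 5))   # 4
print(min_should_match("2<-25% 9<-3", 12))  # 9
```

Running the conditional example against a few clause counts reproduces the table's own worked description of `2\<-25% 9\<-3`.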
diff --git a/solr/solr-ref-guide/modules/query-guide/pages/document-transformers.adoc b/solr/solr-ref-guide/modules/query-guide/pages/document-transformers.adoc
index addbefedb9d..e454ebafd97 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/document-transformers.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/document-transformers.adoc
@@ -137,7 +137,7 @@ Note that this transformer can be used even when the query used to match the res
q=book_title:Solr&fl=id,[child childFilter=doc_type:chapter limit=100]
----
-If the documents involved include a `\_nest_path_` field, then it is used to
re-create the hierarchical structure of the descendent documents using the
original pseudo-field names the documents were indexed with, otherwise the
descendent documents are returned as a flat list of
xref:indexing-guide:indexing-nested-documents#indexing-anonymous-children[anonymous
children].
+If the documents involved include a `\_nest_path_` field, then it is used to
re-create the hierarchical structure of the descendent documents using the
original pseudo-field names the documents were indexed with, otherwise the
descendent documents are returned as a flat list of
xref:indexing-guide:indexing-nested-documents.adoc#indexing-anonymous-children[anonymous
children].
`childFilter`::
+
@@ -281,7 +281,7 @@ fl=id,source_s:[json]&wt=json
This transformer executes a separate query per transforming document passing
document fields as an input for subquery parameters.
It's usually used with `{!join}` and `{!parent}` query parsers, and is
intended to be an improvement for `[child]`.
-* It must be given an unique name: `fl=*,children:[subquery]`
+* It must be given a unique name: `fl=*,children:[subquery]`
* There might be a few of them, e.g.,
`fl=*,sons:[subquery],daughters:[subquery]`.
* Every `[subquery]` occurrence adds a field into a result document with the
given name, the value of this field is a document list, which is a result of
executing subquery using document fields as an input.
* Subquery will use the `/select` search handler by default, and will return
an error if `/select` is not configured.
@@ -391,7 +391,7 @@ If subquery collection has a different unique key field name (such as `foo_id` i
[source,plain]
foo.fl=id:foo_id&foo.distrib.singlePass=true
-Otherwise you'll get `NullPointerException` from `QueryComponent.mergeIds`.
+Otherwise, you'll get `NullPointerException` from `QueryComponent.mergeIds`.
====
@@ -414,7 +414,7 @@ In a sense this double-storage between docValues and stored-value storage isn't
=== [features] - LTRFeatureLoggerTransformerFactory
The "LTR" prefix stands for xref:learning-to-rank.adoc[].
-This transformer returns the values of features and it can be used for feature
extraction and feature logging.
+This transformer returns the values of features, and it can be used for
feature extraction and feature logging.
[source,plain]
----
@@ -428,4 +428,4 @@ This will return the values of the features in the `yourFeatureStore` store.
fl=id,[features]&rq={!ltr model=yourModel}
----
-If you use `[features]` together with an Learning-To-Rank reranking query then
the values of the features in the reranking model (`yourModel`) will be
returned.
+If you use `[features]` together with a Learning-To-Rank reranking query then
the values of the features in the reranking model (`yourModel`) will be
returned.
diff --git a/solr/solr-ref-guide/modules/query-guide/pages/dsp.adoc b/solr/solr-ref-guide/modules/query-guide/pages/dsp.adoc
index e0662bc5416..846e118bed0 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/dsp.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/dsp.adoc
@@ -162,8 +162,8 @@ image::math-expressions/noise.png[]
In the next example the random noise is added to the sine wave using the
`ebeAdd` function.
The result of this is plotted in the image below.
Notice that the sine wave has been hidden somewhat within the noise.
-Its difficult to say for sure if there is structure.
-As plots becomes more dense it can become harder to see a pattern hidden
within noise.
+It's difficult to say for sure if there is structure.
+As plots become more dense it can become harder to see a pattern hidden within
noise.
image::math-expressions/hidden-signal.png[]
@@ -249,7 +249,7 @@ In the second example the `fft` function is called on a vector of random data si
The plot of the real values of the `fft` response is shown below.
Notice that in is this response there is no clear peak.
-Instead all frequencies have accumulated a random level of power.
+Instead, all frequencies have accumulated a random level of power.
This `fft` shows no clear sign of signal and appears to be noise.
image::math-expressions/noise-fft.png[]
@@ -259,6 +259,6 @@ The plot of the real values of the `fft` response is shown below.
Notice that there are two clear mirrored peaks, at the same locations as the
`fft` of the pure signal.
But there is also now considerable noise on the frequencies.
-The `fft` has found the signal and but also shows that there is considerable
noise along with the signal.
+The `fft` has found the signal and also shows that there is considerable noise
along with the signal.
image::math-expressions/hidden-signal-fft.png[]
diff --git a/solr/solr-ref-guide/modules/query-guide/pages/exporting-result-sets.adoc b/solr/solr-ref-guide/modules/query-guide/pages/exporting-result-sets.adoc
index 6719d378013..28c395daa46 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/exporting-result-sets.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/exporting-result-sets.adoc
@@ -16,7 +16,7 @@
// specific language governing permissions and limitations
// under the License.
-The `/export` request handler allows a fully sorted result set to be streamed
out of Solr using a special xref:query-re-ranking.adoc[rank query parser] and
xef:response-writers.adoc[response writer].
+The `/export` request handler allows a fully sorted result set to be streamed
out of Solr using a special xref:query-re-ranking.adoc[rank query parser] and
xref:response-writers.adoc[response writer].
These have been specifically designed to work together to handle scenarios
that involve sorting and exporting millions of records.
This feature uses a stream sorting technique that begins to send records
within milliseconds and continues to stream results until the entire result set
has been sorted and exported.
diff --git a/solr/solr-ref-guide/modules/query-guide/pages/faceting.adoc b/solr/solr-ref-guide/modules/query-guide/pages/faceting.adoc
index 8c1cacf9040..e5669dbc936 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/faceting.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/faceting.adoc
@@ -67,7 +67,7 @@ When using these parameters, it is important to remember that "term" is a very s
For text fields that include stemming, lowercasing, or word splitting, the
resulting terms may not be what you expect.
If you want Solr to perform both analysis (for searching) and faceting on the
full literal strings, use the `copyField` directive in your Schema to create
two versions of the field: one Text and one String.
-The Text field should have `indexed="true" docValues=“false"` if used for
searching but not faceting and the String field should have `indexed="false"
docValues="true"` if used for faceting but not searching.
+The Text field should have `indexed="true" docValues="false"` if used for
searching but not faceting and the String field should have `indexed="false"
docValues="true"` if used for faceting but not searching.
(For more information about the `copyField` directive, see
xref:indexing-guide:copy-fields.adoc[].)
Unless otherwise specified, all of the parameters below can be specified on a
per-field basis with the syntax of `f.<fieldname>.facet.<parameter>`
@@ -398,8 +398,8 @@ To ensure you avoid double-counting, do not choose both `lower` and `upper`, do
+
The `facet.range.other` parameter specifies that in addition to the counts for
each range constraint between `facet.range.start` and `facet.range.end`, counts
should also be computed for these options:
-* `before`: All records with field values lower then lower bound of the first
range.
-* `after`: All records with field values greater then the upper bound of the
last range.
+* `before`: All records with field values lower than lower bound of the first
range.
+* `after`: All records with field values greater than the upper bound of the
last range.
* `between`: All records with field values between the start and end bounds of
all ranges.
* `none`: Do not compute any counts.
* `all`: Compute counts for before, between, and after.
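The `before`/`between`/`after` semantics whose wording the hunk above fixes can be sketched with plain comparisons. The field values and bounds here are hypothetical, and the strict-inequality edge handling simply mirrors the prose (in Solr itself, `facet.range.include` governs the exact edge behavior):

```python
def range_other_counts(values, start, end):
    """Sketch of the facet.range.other buckets described above:
    'before' = values lower than the first range's lower bound,
    'after'  = values greater than the last range's upper bound,
    'between' = values between the start and end bounds.
    (Illustrative only; facet.range.include changes real edge cases.)"""
    return {
        "before": sum(1 for v in values if v < start),
        "between": sum(1 for v in values if start <= v <= end),
        "after": sum(1 for v in values if v > end),
    }

# Hypothetical numeric field values faceted over ranges spanning 5..20:
print(range_other_counts([1, 5, 10, 15, 25], start=5, end=20))
# {'before': 1, 'between': 3, 'after': 1}
```

With `all`, Solr would report all three of these buckets at once; with `none`, it would report none of them.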
diff --git a/solr/solr-ref-guide/modules/query-guide/pages/graph-traversal.adoc b/solr/solr-ref-guide/modules/query-guide/pages/graph-traversal.adoc
index 1b79156a130..1d034f5bfce 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/graph-traversal.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/graph-traversal.adoc
@@ -117,7 +117,7 @@ The `node` field contains the node IDs gathered by the function.
The `collection`, `field`, and `level` of the traversal are also included in
the output.
Notice that the level is "1" for each tuple in the example.
-The root nodes are level 0 (in the example above, the root nodes are
"[email protected], [email protected]") By default the `nodes` function
emits only the _*leaf nodes*_ of the traversal, which is the outer-most node
set.
+The root nodes are level 0 (in the example above, the root nodes are
"[email protected], [email protected]") By default the `nodes` function
emits only the _*leaf nodes*_ of the traversal, which is the outermost node set.
To emit the root nodes you can specify the `scatter` parameter:
[source,plain]
@@ -129,7 +129,7 @@ nodes(emails,
----
The `scatter` parameter controls whether to emit the _branches_ with the
_leaves_.
-The root nodes are considered "branches" because they are not the outer-most
level of the traversal.
+The root nodes are considered "branches" because they are not the outermost
level of the traversal.
When scattering both branches and leaves the output would like this:
@@ -321,7 +321,7 @@ The sample code below shows steps 1 and 2 of the recommendation:
In the example above, the inner search expression searches the `logs`
collection and returning all the articles viewed by "user1".
The outer `nodes` expression takes all the articles emitted from the inner
search expression and finds all the records in the logs collection for those
articles.
It then gathers and aggregates the users that have read the articles.
-The `maxDocFreq` parameter limits the articles returned to those that appear
in no more then 10,000 log records (per shard).
+The `maxDocFreq` parameter limits the articles returned to those that appear
in no more than 10,000 log records (per shard).
This guards against returning articles that have been viewed by millions of
users.
== Tracking the Traversal
@@ -351,7 +351,7 @@ nodes(emails,
== Cross-Collection Traversals
Nested `nodes` functions can operate on different SolrCloud collections.
-This allow traversals to "walk" from one collection to another to gather nodes.
+This allows traversals to "walk" from one collection to another to gather
nodes.
Cycle detection does not cross collection boundaries, so nodes collected in
one collection will be traversed in a different collection.
This was done deliberately to support cross-collection traversals.
Note that the output from a cross-collection traversal will likely contain
duplicate nodes with different collection attributes.
diff --git a/solr/solr-ref-guide/modules/query-guide/pages/graph.adoc b/solr/solr-ref-guide/modules/query-guide/pages/graph.adoc
index 5977e279673..edaab44f18f 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/graph.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/graph.adoc
@@ -635,7 +635,7 @@ In a lagged correlation an event occurs and following a *delay* another event oc
The window parameter doesn't capture the delay as we only know that an event
occurred somewhere within a prior window.
The `lag` parameter can be used to start calculating the window parameter a
number of ten second windows in the past.
-For example we could walk the graph in 20 second windows starting from 30
seconds prior to a set of root events.
+For example, we could walk the graph in 20 second windows starting from 30
seconds prior to a set of root events.
By adjusting the lag and re-running the query we can determine which lagged
window has the highest degree.
From this we can determine the delay.
@@ -733,7 +733,7 @@ This score give us a good indication of where to begin our
*root cause analysis*
To switch to *day* or *weekday* time windows we must first index day truncated
ISO 8601 timestamps in a string field with the log records.
In the example below the field `time_day_s` contains the day truncated time
stamps.
-Then its simply a matter of specifying -3DAYS in the window parameter. This
will switch from the default ten second
+Then it's simply a matter of specifying -3DAYS in the window parameter. This
will switch from the default ten second
time windows to daily windows.
[source,text]
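The window/lag arithmetic above can be sketched directly; timestamps are plain seconds and the values are illustrative, not tied to the query syntax:

```python
def window_counts(events, roots, window, lag):
    """Count events falling inside a `window`-second span that ends
    `lag` seconds before each root event's timestamp."""
    counts = {}
    for root in roots:
        end = root - lag
        start = end - window
        counts[root] = sum(1 for t in events if start <= t < end)
    return counts

events = [5, 12, 18, 25, 40]
# A 20-second window starting 30 seconds prior to a root event at t=60
# covers [10, 30) and captures three of the events:
print(window_counts(events, roots=[60], window=20, lag=30))  # {60: 3}
```

Re-running with different `lag` values and comparing the counts is the same procedure described for finding the lagged window with the highest degree.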
diff --git a/solr/solr-ref-guide/modules/query-guide/pages/highlighting.adoc
b/solr/solr-ref-guide/modules/query-guide/pages/highlighting.adoc
index 6077827b224..e857c1ff476 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/highlighting.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/highlighting.adoc
@@ -330,7 +330,7 @@ If you don't go out of your way to configure the other
options below, the highli
+
The benefit of this approach is that your index won't grow larger with any
extra data that isn't strictly necessary for highlighting.
+
-The down side is that highlighting speed is roughly linear with the amount of
text to process, with a large factor being the complexity of your analysis
chain.
+The downside is that highlighting speed is roughly linear with the amount of
text to process, with a large factor being the complexity of your analysis
chain.
+
For "short" text, this is a good choice.
Or maybe it's not short but you're prioritizing a smaller index and indexing
speed over highlighting performance.
@@ -366,7 +366,7 @@ The Unified Highlighter supports these following additional
parameters to the on
|===
+
By default, the Unified Highlighter will usually pick the right offset source
(see above).
-However it may be ambiguous such as during a migration from one offset source
to another that hasn't completed.
+However, it may be ambiguous such as during a migration from one offset source
to another that hasn't completed.
+
The offset source can be explicitly configured to one of: `ANALYSIS`,
`POSTINGS`, `POSTINGS_WITH_TERM_VECTORS`, or `TERM_VECTORS`.
@@ -587,7 +587,7 @@ If set to `false`, or if there is no match in the alternate
field either, the al
|===
+
Selects a formatter for the highlighted output.
-Currently the only legal value is `simple`, which surrounds a highlighted term
with a customizable pre- and post-text snippet.
+Currently, the only legal value is `simple`, which surrounds a highlighted
term with a customizable pre- and post-text snippet.
`hl.simple.pre`, `hl.simple.post`::
+
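What the `simple` formatter does can be sketched as a substitution; `<em>`/`</em>` are Solr's defaults for `hl.simple.pre`/`hl.simple.post`, and the matching here is a plain case-insensitive term match rather than real analysis:

```python
import re

def simple_format(text, terms, pre="<em>", post="</em>"):
    """Surround each highlighted term with the pre/post snippets,
    as the `simple` formatter does with hl.simple.pre/post."""
    pattern = re.compile("|".join(re.escape(t) for t in terms), re.IGNORECASE)
    return pattern.sub(lambda m: f"{pre}{m.group(0)}{post}", text)

print(simple_format("Solr highlighting demo", ["highlighting"]))
# Solr <em>highlighting</em> demo
```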
diff --git a/solr/solr-ref-guide/modules/query-guide/pages/jdbc-zeppelin.adoc
b/solr/solr-ref-guide/modules/query-guide/pages/jdbc-zeppelin.adoc
index a025637a15f..3fd357c0d09 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/jdbc-zeppelin.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/jdbc-zeppelin.adoc
@@ -81,7 +81,7 @@ Instructions on how to bind the JDBC interpreter to a
notebook are available htt
.Results of Solr query
image::jdbc-zeppelin/zeppelin_solrjdbc_6.png[image,width=481,height=400]
-The below code block assumes that the Apache Solr driver is setup as the
default JDBC interpreter driver.
+The below code block assumes that the Apache Solr driver is set up as the
default JDBC interpreter driver.
If that is not the case, instructions for using a different prefix are
available
https://zeppelin.apache.org/docs/latest/interpreter/jdbc.html#how-to-use[here].
[source,text]
diff --git
a/solr/solr-ref-guide/modules/query-guide/pages/join-query-parser.adoc
b/solr/solr-ref-guide/modules/query-guide/pages/join-query-parser.adoc
index 8c6e876c30d..1b116c8c5a8 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/join-query-parser.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/join-query-parser.adoc
@@ -43,7 +43,7 @@ WHERE id IN (
The join operation is done on a term basis, so the `from` and `to` fields must
use compatible field types.
For example: joining between a `StrField` and an `IntPointField` will not work.
-Likewise joining between a `StrField` and a `TextField` that uses
`LowerCaseFilterFactory` will only work for values that are already lower cased
in the string field.
+Likewise, joining between a `StrField` and a `TextField` that uses
`LowerCaseFilterFactory` will only work for values that are already lower cased
in the string field.
== Parameters
@@ -183,7 +183,7 @@ node 4: movie_directors_shard1_replica4
At query time, the `JoinQParser` will access the local replica of the
*movie_directors* collection to perform the join.
If a local replica is not available or active, then the query will fail.
At this point, it should be clear that since you're limited to a single shard
and the data must be replicated across all nodes where it is needed, this
approach works better with smaller data sets where there is a one-to-many
relationship between the from collection and the to collection.
-Moreover, if you add a replica to the to collection, then you also need to add
a replica for the from collection.
+Moreover, if you add a replica to the "to" collection, then you also need to
add a replica for the "from" collection.
For more information, Erick Erickson has written a blog post about join
performance titled https://lucidworks.com/post/solr-and-joins/[Solr and Joins].
@@ -315,7 +315,7 @@ If neither `zkHost` nor `solrUrl` are specified, the local
ZooKeeper cluster wil
|===
+
The URL of the external Solr node to be queried.
-Must be a character-for-character exact match of a allow-listed URL that is
listed in the `allowSolrUrls` parameter in `solrconfig.xml`.
+Must be a character-for-character exact match of an allow-listed URL that is
listed in the `allowSolrUrls` parameter in `solrconfig.xml`.
If the URL does not match, this parameter will be effectively disabled.
`from`::
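The term-basis join semantics quoted above can be sketched over plain dicts; the field names and sample documents are illustrative, not the parser's implementation:

```python
def term_join(from_docs, to_docs, from_field, to_field):
    """Return `to`-side documents whose `to_field` value appears among
    the `from_field` terms of the `from`-side documents.  Values must be
    term-compatible, e.g. both already lower-cased strings."""
    terms = {d[from_field] for d in from_docs if from_field in d}
    return [d for d in to_docs if d.get(to_field) in terms]

directors = [{"id": "d1", "has_oscar": True}]
movies = [{"title": "A", "director_id": "d1"},
          {"title": "B", "director_id": "d2"}]
joined = term_join(directors, movies, "id", "director_id")
# only movie "A" survives the join
```

The one-to-many shape (one director, many movies) is exactly the case the text recommends this approach for.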
diff --git a/solr/solr-ref-guide/modules/query-guide/pages/json-facet-api.adoc
b/solr/solr-ref-guide/modules/query-guide/pages/json-facet-api.adoc
index 16faa60c1fd..988e6a42f82 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/json-facet-api.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/json-facet-api.adoc
@@ -212,7 +212,7 @@ The default of `-1` causes a heuristic to be applied based
on other options spec
|`mincount` |Only return buckets with a count of at least this number.
Defaults to `1`.
|`missing` |A boolean that specifies if a special “missing” bucket should be
returned that is defined by documents without a value in the field. Defaults to
`false`.
|`numBuckets` |A boolean. If `true`, adds “numBuckets” to the response, an
integer representing the number of buckets for the facet (as opposed to the
number of buckets returned). Defaults to `false`.
-|`allBuckets` |A boolean. If `true`, adds an “allBuckets” bucket to the
response, representing the union of all of the buckets. For multi-valued
fields, this is different than a bucket for all of the documents in the domain
since a single document can belong to multiple buckets. Defaults to `false`.
+|`allBuckets` |A boolean. If `true`, adds an “allBuckets” bucket to the
response, representing the union of all of the buckets. For multi-valued
fields, this is different from a bucket for all of the documents in the domain
since a single document can belong to multiple buckets. Defaults to `false`.
|`prefix` |Only produce buckets for terms starting with the specified prefix.
|`facet` |Aggregations, metrics or nested facets that will be calculated for
every returned bucket
|`method` a|
@@ -388,8 +388,8 @@ For example "start" here corresponds to "facet.range.start"
in a facet.range com
|other a|
This parameter indicates that in addition to the counts for each range
constraint between `start` and `end`, counts should also be computed for…
-* "before" all records with field values lower then lower bound of the first
range
-* "after" all records with field values greater then the upper bound of the
last range
+* "before" all records with field values lower than lower bound of the first
range
+* "after" all records with field values greater than the upper bound of the
last range
* "between" all records with field values between the start and end bounds of
all ranges
* "none" compute none of this information
* "all" shortcut for before, between, and after
@@ -626,7 +626,7 @@
include::example$JsonRequestApiTest.java[tag=solrj-json-metrics-facet-simple]
--
An expanded form allows for xref:local-params.adoc[] to be specified.
-These may be used explicitly by some specialized aggregations such as
<<relatedness-options,`relatedness()`>>, but can also be used as parameter
references to make aggregation expressions more readable, with out needing to
use (global) request parameters:
+These may be used explicitly by some specialized aggregations such as
<<relatedness-options,`relatedness()`>>, but can also be used as parameter
references to make aggregation expressions more readable, without needing to
use (global) request parameters:
[.dynamic-tabs]
--
@@ -930,7 +930,7 @@ ____
The `relatedness(...)` function is used to "score" these relationships,
relative to "Foreground" and "Background" sets of documents, specified in the
function params as queries.
-Unlike most aggregation functions, the `relatedness(...)` function is aware of
whether and how it's used in <<nested-facets,Nested Facets>>. It evaluates the
query defining the current bucket _independently_ from its parent/ancestor
buckets, and intersects those documents with a "Foreground Set" defined by the
foreground query _combined with the ancestor buckets_. The result is then
compared to a similar intersection done against the "Background Set" (defined
exclusively by background [...]
+Unlike most aggregation functions, the `relatedness(...)` function is aware of
whether and how it's used in <<nested-facets,Nested Facets>>. It evaluates the
query defining the current bucket _independently_ of its parent/ancestor
buckets, and intersects those documents with a "Foreground Set" defined by the
foreground query _combined with the ancestor buckets_. The result is then
compared to a similar intersection done against the "Background Set" (defined
exclusively by background qu [...]
NOTE: The semantics of `relatedness(...)` in an `allBuckets` context are
currently undefined.
Accordingly, although the `relatedness(...)` stat may be specified for a facet
request that also specifies `allBuckets:true`, the `allBuckets` bucket itself
will not include a relatedness calculation.
@@ -1067,7 +1067,7 @@ curl -sS -X POST
http://localhost:8983/solr/gettingstarted/query -d 'rows=0&q=*:
<1> Even though `hobbies:golf` has a lower total facet `count` than
`hobbies:painting`, it has a higher `relatedness` score, indicating that
relative to the Background Set (the entire collection) Golf has a stronger
correlation to our Foreground Set (people age 35+) than Painting.
<2> The number of documents matching `age:[35 TO *]` _and_ `hobbies:golf` is
31.25% of the total number of documents in the Background Set
<3> 37.5% of the documents in the Background Set match `hobbies:golf`
-<4> The state of Arizona (AZ) has a _positive_ relatedness correlation with
the _nested_ Foreground Set (people ages 35+ who play Golf) compared to the
Background Set -- i.e., "People in Arizona are statistically more likely to be
'35+ year old Golfers' then the country as a whole."
+<4> The state of Arizona (AZ) has a _positive_ relatedness correlation with
the _nested_ Foreground Set (people ages 35+ who play Golf) compared to the
Background Set -- i.e., "People in Arizona are statistically more likely to be
'35+ year old Golfers' than the country as a whole."
<5> The state of Colorado (CO) has a _negative_ correlation with the nested
Foreground Set -- i.e., "People in Colorado are statistically less likely to be
'35+ year old Golfers' than the country as a whole."
<6> The number of documents matching `age:[35 TO *]` _and_ `hobbies:golf` _and_
`state:AZ` is 18.75% of the total number of documents in the Background Set
<7> 50% of the documents in the Background Set match `state:AZ`
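The foreground/background percentages annotated above are simple set-intersection ratios over the Background Set; the sketch below invents a 32-document collection sized to reproduce the 31.25%/37.5% figures (the actual `relatedness` score formula is more involved and not reproduced here):

```python
def fg_bg_ratios(fg_set, term_set, bg_set):
    """Foreground %: docs matching the foreground query AND the term,
    as a fraction of the Background Set.
    Background %: docs matching the term alone, same denominator."""
    fg_pct = 100.0 * len(fg_set & term_set) / len(bg_set)
    bg_pct = 100.0 * len(term_set & bg_set) / len(bg_set)
    return fg_pct, bg_pct

background = set(range(32))              # whole collection, 32 docs
age_35_plus = set(range(16))             # foreground query matches
golf = set(range(6, 16)) | {20, 21}      # hobbies:golf matches

fg, bg = fg_bg_ratios(age_35_plus, golf, background)
# fg == 31.25 (as in <2> above), bg == 37.5 (as in <3> above)
```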
diff --git a/solr/solr-ref-guide/modules/query-guide/pages/json-query-dsl.adoc
b/solr/solr-ref-guide/modules/query-guide/pages/json-query-dsl.adoc
index 85544cbd5c7..3f6b9f2572b 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/json-query-dsl.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/json-query-dsl.adoc
@@ -455,7 +455,7 @@ need to do this is if you are displaying filters for
products (parents) with SKU
options (children). Let's go on item by item:
* _Filter exclusion_ is usually necessary when multiple filter values can be
applied to each field.
-This is also also known as drill-sideways facets.
+This is also known as drill-sideways facets.
See also xref:faceting.adoc#tagging-and-excluding-filters[tagging and filter
exclusion].
* _Nested documents_, or child documents, are described in
xref:indexing-guide:indexing-nested-documents.adoc[].
In the example below, they are referred to as SKUs since this is a frequent use
case for this feature.
diff --git
a/solr/solr-ref-guide/modules/query-guide/pages/json-request-api.adoc
b/solr/solr-ref-guide/modules/query-guide/pages/json-request-api.adoc
index fe9227d3025..dbf45513351 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/json-request-api.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/json-request-api.adoc
@@ -53,7 +53,7 @@ curl http://localhost:8983/solr/techproducts/query -d
'json={"query":"memory"}'
=== JSON Parameter Merging
If multiple `json` parameters are provided in a single request, Solr attempts
to merge the parameter values together before processing the request.
-The JSON Request API has several properties (`filter`, `fields`, etc) which
accept multiple values. During the merging process, all values for these
"multivalued" properties are retained. Many properties though (`query`,
`limit`, etc.) can have only a single value. When multiple parameter values
conflict with one another a single value is chosen based on the following
precedence rules:
+The JSON Request API has several properties (`filter`, `fields`, etc.) which
accept multiple values. During the merging process, all values for these
"multivalued" properties are retained. Many properties though (`query`,
`limit`, etc.) can have only a single value. When multiple parameter values
conflict with one another a single value is chosen based on the following
precedence rules:
* Traditional query parameters (`q`, `rows`, etc.) take first precedence and
are used over any other specified values.
* `json`-prefixed query parameters are considered next.
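The precedence and multivalued-retention rules above can be sketched as a merge function; which properties count as multivalued, and the parameter names, are illustrative:

```python
MULTIVALUED = {"filter", "fields"}  # illustrative subset

def merge_params(traditional, json_prefixed, json_body):
    """Merge parameter sources lowest-precedence first, so that
    later (higher-precedence) sources override single-valued keys.
    Multivalued properties accumulate instead of overriding."""
    merged = {}
    for source in (json_body, json_prefixed, traditional):
        for key, value in source.items():
            if key in MULTIVALUED:
                merged.setdefault(key, []).extend(
                    value if isinstance(value, list) else [value])
            else:
                merged[key] = value
    return merged

merged = merge_params(
    traditional={"rows": 10},
    json_prefixed={"limit": 5},
    json_body={"query": "memory", "filter": "inStock:true"})
# rows (traditional) and limit (json-prefixed) both survive;
# the single filter value is retained in a list
```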
diff --git a/solr/solr-ref-guide/modules/query-guide/pages/loading.adoc
b/solr/solr-ref-guide/modules/query-guide/pages/loading.adoc
index f2d03b10516..e2d04080847 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/loading.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/loading.adoc
@@ -174,7 +174,7 @@ data to a SolrCloud collection for indexing.
The `update` function adds documents to Solr in batches and returns a tuple
for each batch with summary information about the batch and load.
In the example below the `update` expression is run using Zeppelin-Solr
because the data set is small.
-For larger loads it's best to run the load from a curl command where the
output of the `update` function can be spooled to disk.
+For larger loads, it's best to run the load from a curl command where the
output of the `update` function can be spooled to disk.
image::math-expressions/update.png[]
@@ -378,7 +378,7 @@ image::math-expressions/havingId.png[]
==== Skipping
The `gt` (greater than) function can be used on the `recNum` field to filter
the result set to
-records with a recNum greater then a specific value:
+records with a recNum greater than a specific value:
image::math-expressions/skipping.png[]
diff --git a/solr/solr-ref-guide/modules/query-guide/pages/logs.adoc
b/solr/solr-ref-guide/modules/query-guide/pages/logs.adoc
index 6246008ac7a..33adbdc3cf3 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/logs.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/logs.adoc
@@ -18,7 +18,7 @@
This section of the user guide provides an introduction to Solr log analytics.
-NOTE: This is an appendix of the xref::math-expressions.adoc[Visual Guide to
Streaming Expressions and Math Expressions].
+NOTE: This is an appendix of the xref:math-expressions.adoc[Visual Guide to
Streaming Expressions and Math Expressions].
All the functions described below are covered in detail in the guide.
See the xref:math-start.adoc[] chapter to learn how to get started with
visualizations and Apache Zeppelin.
@@ -94,7 +94,7 @@ By looking at this sample we can quickly learn about the
*fields* available in t
=== Time Period
Each log record contains a time stamp in the `date_dt` field.
-Its often useful to understand what time period the logs cover and how many
log records have been indexed.
+It's often useful to understand what time period the logs cover and how many
log records have been indexed.
The `stats` function can be run to display this information.
@@ -144,7 +144,7 @@ The example below breaks this down further by adding a
query on the `type_s` fie
image::math-expressions/logs-time-series2.png[]
-Notice the query activity accounts for more then half of the burst of log
records between 21:27 and 21:52.
+Notice the query activity accounts for more than half of the burst of log
records between 21:27 and 21:52.
But the query activity does not account for the large spike in log activity
that follows.
We can account for that spike by changing the search to include only *update*,
*commit*, and *deleteByQuery* records in the logs.
@@ -205,7 +205,7 @@ Scatter plots can be used to visualize random samples of
the `qtime_i`
field.
The example below demonstrates a scatter plot of 500 random samples from the
`ptest1` collection of log records.
-In this example, `qtime_i` is plotted on the y-axis and the x-axis is simply a
sequence to spread the query times out across the plot.
+In this example, `qtime_i` is plotted on the y-axis, and the x-axis is simply
a sequence to spread the query times out across the plot.
NOTE: The `x` field is included in the field list.
The `random` function automatically generates a sequence for the x-axis when
`x` is included in the field list.
@@ -296,18 +296,18 @@ image::math-expressions/qtime-series.png[]
== Performance Troubleshooting
-If query analysis determines that queries are not performing as expected then
log analysis can also be used to troubleshoot the cause of the slowness.
+If query analysis determines that queries are not performing as expected, then
log analysis can also be used to troubleshoot the cause of the slowness.
The section below demonstrates several approaches for locating the source of
query slowness.
=== Slow Nodes
In a distributed search the final search performance is only as fast as the
slowest responding shard in the cluster.
-Therefore one slow node can be responsible for slow overall search time.
+Therefore, one slow node can be responsible for slow overall search time.
The fields `core_s`, `replica_s` and `shard_s` are available in the log
records.
These fields allow average query time to be calculated by *core*, *replica* or
*shard*.
-The `core_s` field is particularly useful as its the most granular element and
the naming convention often includes the collection, shard and replica
information.
+The `core_s` field is particularly useful as it's the most granular element
and the naming convention often includes the collection, shard and replica
information.
The example below uses the `facet` function to calculate `avg(qtime_i)` by
core.
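The aggregation the `facet` function performs here amounts to a group-by average; the log records below are invented for illustration:

```python
from collections import defaultdict

def avg_qtime_by_core(records):
    """Average qtime_i grouped by core_s, as computed server-side
    by facet(..., avg(qtime_i)) over the log collection."""
    sums = defaultdict(lambda: [0, 0])
    for rec in records:
        entry = sums[rec["core_s"]]
        entry[0] += rec["qtime_i"]
        entry[1] += 1
    return {core: total / n for core, (total, n) in sums.items()}

logs = [{"core_s": "c1_shard1_replica1", "qtime_i": 10},
        {"core_s": "c1_shard1_replica1", "qtime_i": 30},
        {"core_s": "c1_shard2_replica1", "qtime_i": 100}]
print(avg_qtime_by_core(logs))
# {'c1_shard1_replica1': 20.0, 'c1_shard2_replica1': 100.0}
```

A core whose average is far above its peers is the "slow node" candidate the section describes.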
diff --git
a/solr/solr-ref-guide/modules/query-guide/pages/machine-learning.adoc
b/solr/solr-ref-guide/modules/query-guide/pages/machine-learning.adoc
index 55aaba56948..aaa763ea231 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/machine-learning.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/machine-learning.adoc
@@ -110,7 +110,7 @@ The `count(*)` field populates the values in the cells of
the matrix.
The `distance` function is then used to compute the distance matrix for the
columns of the matrix using `cosine` distance.
This produces a distance matrix that shows distance between complaint types
based on the zip codes they appear in.
-Finally the `zplot` function is used to plot the distance matrix as a heat map.
+Finally, the `zplot` function is used to plot the distance matrix as a heat
map.
Notice that the heat map has been configured so that the intensity of color
increases as the distance between vectors decreases.
@@ -182,7 +182,7 @@ These are the zip codes that are most similar to the 10280
zip code based on the
K-nearest neighbor regression is a non-linear, bivariate and multivariate
regression method.
KNN regression is a lazy learning technique which means it does not fit a
model to the training set in advance.
-Instead the entire training set of observations and outcomes are held in
memory and predictions are made by averaging the outcomes of the k-nearest
neighbors.
+Instead, the entire training set of observations and outcomes are held in
memory and predictions are made by averaging the outcomes of the k-nearest
neighbors.
The `knnRegress` function is used to perform nearest neighbor regression.
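The lazy-learning behavior described above reduces to the following sketch (Euclidean distance; `k` and the training data are illustrative, and this is not the `knnRegress` implementation):

```python
def knn_regress(train_obs, train_outcomes, point, k):
    """Predict by averaging the outcomes of the k nearest training
    observations; no model is fit in advance, the whole training
    set is held in memory."""
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    ranked = sorted(range(len(train_obs)),
                    key=lambda i: dist(train_obs[i], point))
    return sum(train_outcomes[i] for i in ranked[:k]) / k

obs = [(1.0,), (2.0,), (3.0,), (10.0,)]
outcomes = [10.0, 20.0, 30.0, 100.0]
print(knn_regress(obs, outcomes, (2.1,), k=3))
# the three nearest outcomes (10, 20, 30) average to 20.0
```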
@@ -229,7 +229,7 @@ These predictions will be used to determine how well the
KNN regression performe
The error, or *residuals*, for the regression are then calculated by
subtracting the *predicted* quality from the *observed* quality.
The `ebeSubtract` function is used to perform the element-by-element
subtraction between the two vectors.
-Finally the `zplot` function formats the predictions and errors for the
visualization of the *residual plot*.
+Finally, the `zplot` function formats the predictions and errors for the
visualization of the *residual plot*.
image::math-expressions/redwine1.png[]
@@ -262,7 +262,7 @@ From this plot we can see the probability of getting
prediction errors between -
The `knnRegression` function has three additional parameters that make it
suitable for many different regression scenarios.
. Any of the distance measures can be used for the regression simply by adding
the function to the call.
-This allows for regression analysis over sparse vectors (`cosine`), dense
vectors and geo-spatial lat/lon vectors (`haversineMeters`).
+This allows for regression analysis over sparse vectors (`cosine`), dense
vectors and geospatial lat/lon vectors (`haversineMeters`).
+
Sample syntax:
+
@@ -330,9 +330,9 @@ The matrix is transposed so each row contains a single
latitude, longitude point
The `dbscan` function is then used to cluster the latitude and longitude
points.
Notice that the `dbscan` function in the example has four parameters.
-* `obs` : The observation matrix of lat/lon points
+* `obs`: The observation matrix of lat/lon points
-* `eps` : The distance between points to be considered a cluster.
+* `eps`: The distance between points to be considered a cluster.
100 meters in the example.
* `min points`: The minimum points in a cluster for the cluster to be returned
by the function.
@@ -386,7 +386,7 @@ The plot is dense enough so the outlines of the different
boroughs are visible i
Each cluster is shown in a different color.
This plot provides interesting insight into the densities of rat sightings
throughout the five boroughs of New York City.
-For example it highlights a cluster of dense sightings in Brooklyn at cluster1
+For example, it highlights a cluster of dense sightings in Brooklyn at cluster1
surrounded by less dense but still high activity clusters.
=== Plotting the Centroids
@@ -412,7 +412,7 @@ K-means clustering produces centroids or *prototype*
vectors which can be used t
In this example the key features of the centroids are extracted to represent
the key phrases for clusters of TF-IDF term vectors.
NOTE: The example below works with TF-IDF _term vectors_.
-The section xref:term-vectors.adoc[] offers a full explanation of this
features.
+The section xref:term-vectors.adoc[] offers a full explanation of these
features.
In the example the `search` function returns documents where the `review_t`
field matches the phrase "star wars".
The `select` function is run over the result set and applies the `analyze`
function which uses the analyzer attached to the schema field `text_bigrams` to
re-analyze the `review_t` field.
@@ -420,7 +420,7 @@ This analyzer returns bigrams which are then annotated to
documents in a field c
The `termVectors` function then creates TF-IDF term vectors from the bigrams
stored in the `terms` field.
The `kmeans` function is then used to cluster the bigram term vectors into 5
clusters.
-Finally the top 5 features are extracted from the centroids and returned.
+Finally, the top 5 features are extracted from the centroids and returned.
Notice that the features are all bigram phrases with semantic significance.
[source,text]
@@ -565,7 +565,7 @@ This expression returns the following response:
== Fuzzy K-Means Clustering
-The `fuzzyKmeans` function is a soft clustering algorithm which allows vectors
to be assigned to more then one cluster.
+The `fuzzyKmeans` function is a soft clustering algorithm which allows vectors
to be assigned to more than one cluster.
The `fuzziness` parameter is a value between `1` and `2` that determines how
fuzzy to make the cluster assignment.
After the clustering has been performed the `getMembershipMatrix` function can
be called on the clustering result to return a matrix describing the
probabilities of cluster membership for each vector.
@@ -597,7 +597,7 @@ When operating on a matrix the rows of the matrix are
scaled.
=== Min/Max Scaling
The `minMaxScale` function scales a vector or matrix between a minimum and
maximum value.
-By default it will scale between `0` and `1` if min/max values are not
provided.
+By default, it will scale between `0` and `1` if min/max values are not
provided.
Below is a plot of a sine wave, with an amplitude of 1, before and after it
has been scaled between -5 and 5.
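The min/max scaling shown in the plot is a linear remap; a minimal sketch over a plain list (not the streaming-expression implementation):

```python
def min_max_scale(vec, lo=0.0, hi=1.0):
    """Scale a vector linearly so its minimum maps to lo and its
    maximum maps to hi; defaults mirror the 0..1 default range."""
    vmin, vmax = min(vec), max(vec)
    span = vmax - vmin
    return [lo + (v - vmin) / span * (hi - lo) for v in vec]

wave = [0.0, 1.0, 0.0, -1.0]          # amplitude-1 samples
print(min_max_scale(wave, -5, 5))     # [0.0, 5.0, 0.0, -5.0]
```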
@@ -654,7 +654,7 @@ Below is a plot of a sine wave, with an amplitude of 1,
before and after it has
image::math-expressions/standardize.png[]
-Below is a simple example of of a standardized matrix.
+Below is a simple example of a standardized matrix.
Notice that once brought into the same scale the vectors are the same.
[source,text]
diff --git a/solr/solr-ref-guide/modules/query-guide/pages/math-start.adoc
b/solr/solr-ref-guide/modules/query-guide/pages/math-start.adoc
index 8a3b6551825..eaf050b740a 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/math-start.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/math-start.adoc
@@ -96,7 +96,7 @@ The visualizations in this guide were performed with Apache
Zeppelin using the Z
=== Zeppelin-Solr Interpreter
An Apache Zeppelin interpreter for Solr allows streaming expressions and math
expressions to be executed and results visualized in Zeppelin.
-The instructions for installing and configuring Zeppelin-Solr can be found on
the Github repository for the project:
+The instructions for installing and configuring Zeppelin-Solr can be found on
the GitHub repository for the project:
https://github.com/lucidworks/zeppelin-solr
Once installed the Solr Interpreter can be configured to connect to your Solr
instance.
diff --git
a/solr/solr-ref-guide/modules/query-guide/pages/numerical-analysis.adoc
b/solr/solr-ref-guide/modules/query-guide/pages/numerical-analysis.adoc
index bf8b412099b..4800e055e52 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/numerical-analysis.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/numerical-analysis.adoc
@@ -55,7 +55,7 @@ These are the new zoomed in x-axis points, between 0 and 3.
Notice that we are sampling a specific area of the curve.
Then the `predict` function is used to predict y-axis points for the sampled
x-axis, for all three interpolation functions.
-Finally all three prediction vectors are plotted with the sampled x-axis
points.
+Finally, all three prediction vectors are plotted with the sampled x-axis
points.
The red line is the `lerp` interpolation, the blue line is the `akima` and the
purple line is the `spline` interpolation.
You can see they each produce different curves in between the control points.
@@ -64,7 +64,7 @@ You can see they each produce different curves in between the
control points.
=== Smoothing Interpolation
The `loess` function is a smoothing interpolator which means it doesn't derive
a function that passes through the original control points.
-Instead the `loess` function returns a function that smooths the original
control points.
+Instead, the `loess` function returns a function that smooths the original
control points.
A technique known as local regression is used to compute the smoothed curve.
The size of the neighborhood of the local regression can be adjusted to
control how close the new curve conforms to the original control points.
diff --git a/solr/solr-ref-guide/modules/query-guide/pages/other-parsers.adoc
b/solr/solr-ref-guide/modules/query-guide/pages/other-parsers.adoc
index 56023635026..0fc92b17268 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/other-parsers.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/other-parsers.adoc
@@ -73,7 +73,7 @@ For a BooleanQuery with no `must` queries, one or more
`should` queries must mat
A list of queries that *must* appear in matching documents.
However, unlike `must`, the score of filter queries is ignored.
Also, these queries are cached in the filter cache.
-To avoid caching add either `cache=false` as local parameter, or
`"cache":"false"` property to underneath Query DLS Object.
+To avoid caching add either `cache=false` as local parameter, or
`"cache":"false"` property to the underlying Query DSL Object.
`mm`::
+
@@ -140,7 +140,7 @@ q={!bool must=foo}
`BoostQParser` extends the `QParserPlugin` and creates a boosted query from
the input value.
The main value is any query to be "wrapped" and "boosted" -- only documents
which match that query will match the final query produced by this parser.
-Parameter `b` is a xref:function-queries.adoc#available-functions[function] to
be evaluated against each document that matches the original query, and the
result of the function will be multiplied into into the final score for that
document.
+Parameter `b` is a xref:function-queries.adoc#available-functions[function] to
be evaluated against each document that matches the original query, and the
result of the function will be multiplied into the final score for that
document.
=== Boost Query Parser Examples
@@ -151,7 +151,7 @@ Creates a query `name:foo` which is boosted (scores are
multiplied) by the funct
q={!boost b=log(popularity)}name:foo
----
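The boost arithmetic is a per-document multiply; a sketch using a base-10 log (matching Solr's `log` function query) with illustrative base scores and popularity values:

```python
import math

def boosted_score(base_score, popularity):
    """Multiply the wrapped query's score by log(popularity),
    as {!boost b=log(popularity)} does for each matching document."""
    return base_score * math.log10(popularity)

# Two docs with equal base scores; the more popular one ends up higher:
print(boosted_score(2.0, 100))  # 2.0 * log10(100) = 4.0
print(boosted_score(2.0, 10))   # 2.0 * log10(10)  = 2.0
```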
-Creates a query `name:foo` which has it's scores multiplied by the _inverse_
of the numeric `price` field -- effectively "demoting" documents which have a
high `price` by lowering their final score:
+Creates a query `name:foo` which has its scores multiplied by the _inverse_ of
the numeric `price` field -- effectively "demoting" documents which have a high
`price` by lowering their final score:
[source,text]
----
@@ -516,7 +516,7 @@
http://localhost:8983/solr/my_graph/query?fl=id&q={!graph+from=in_edge+to=out_ed
----
The examples shown so far have all used a query for a single document
(`"id:A"`) as the root node for the graph traversal, but any query can be used
to identify multiple documents to use as root nodes.
-The next example demonstrates using the `maxDepth` parameter to find all nodes
that are at most one edge away from an root node with a value in the `foo`
field less then or equal to 10:
+The next example demonstrates using the `maxDepth` parameter to find all nodes
that are at most one edge away from a root node with a value in the `foo` field
less than or equal to 10:
[source,text]
----
@@ -686,7 +686,7 @@ The queries measure Jaccard similarity between the query
string and MinHash fiel
The parser supports two modes of operation.
The first, when tokens are generated from text by normal analysis; and the
second, when explicit tokens are provided.
-Currently the score returned by the query reflects the number of top level
elements that match and is *not* normalised between 0 and 1.
+Currently, the score returned by the query reflects the number of top level
elements that match and is *not* normalised between 0 and 1.
`sim`::
+
@@ -866,7 +866,7 @@ In this case, generating a score of 1 if one hash matches
and a score of 512 if
A banded query mixes conjunctions and disjunctions.
We could have 256 bands each of two queries ANDed together, 128 with 4 hashes
ANDed together etc.
With fewer bands query performance increases but we may miss some matches.
-There is a trade off between speed and accuracy.
+There is a trade-off between speed and accuracy.
With 64 bands the score will range from 0 to 64 (the number of bands ORed
together)
Given the required similarity and an acceptable true positive rate, the query
parser computes the appropriate band size^[1]^.
@@ -886,7 +886,7 @@ Even a single match may indicate some kind of similarity
either in meaning, styl
For a general introduction see "Mining of Massive Datasets"^[1]^.
-For documents of ~1500 words expect an index size overhead of ~10%; your
milage will vary.
+For documents of ~1500 words expect an index size overhead of ~10%; your
mileage will vary.
512 hashes would be expected to represent ~2500 words well.
Using a set of MinHash values was proposed in the initial paper^[2]^ but
provides a biased estimate of Jaccard similarity.
@@ -1070,7 +1070,7 @@ Find all documents with the phrase "foo bar" where term
"foo" has a payload grea
== Prefix Query Parser
`PrefixQParser` extends the `QParserPlugin` by creating a prefix query from
the input value.
-Currently no analysis or value transformation is done to create this prefix
query.
+Currently, no analysis or value transformation is done to create this prefix
query.
The parameter is `f`, the field.
The string after the prefix declaration is treated as a wildcard query.
@@ -1231,7 +1231,7 @@ The non-unary operators (everything but `NOT`) support
both infix `(a AND b AND
`SwitchQParser` is a `QParserPlugin` that acts like a "switch" or "case"
statement.
-The primary input string is trimmed and then prefixed with `case.` for use as
a key to lookup a "switch case" in the parser's local params.
+The primary input string is trimmed and then prefixed with `case.` for use as
a key to look up a "switch case" in the parser's local params.
If a matching local param is found the resulting parameter value will then be
parsed as a subquery, and returned as the parse result.
The `case` local param can optionally be specified as a switch case to
match missing (or blank) input strings.
@@ -1251,7 +1251,7 @@ In the examples below, the result of each query is "XXX":
{!switch case.foo=qqq case.bar=XXX case.yak=zzz} bar
----
-.The result will fallback to the default.
+.The result will fall back to the default.
[source,text]
----
{!switch case.foo=qqq case.bar=zzz default=XXX}asdf
@@ -1289,7 +1289,7 @@ Using the example configuration below, clients can
optionally specify the custom
== Term Query Parser
`TermQParser` extends the `QParserPlugin` by creating a single term query from
the input value equivalent to `readableToIndexed()`.
-This is useful for generating filter queries from the external human readable
terms returned by the faceting or terms components.
+This is useful for generating filter queries from the external human-readable
terms returned by the faceting or terms components.
The only parameter is `f`, for the field.
Example:
@@ -1308,7 +1308,7 @@ If no analysis or transformation is desired for any type
of field, see the <<Raw
`TermsQParser` functions similarly to the <<Term Query Parser,Term Query
Parser>> but takes in multiple values separated by commas and returns documents
matching any of the specified values.
-This can be useful for generating filter queries from the external human
readable terms returned by the faceting or terms components, and may be more
efficient in some cases than using the xref:standard-query-parser.adoc[] to
generate a boolean query since the default implementation `method` avoids
scoring.
+This can be useful for generating filter queries from the external
human-readable terms returned by the faceting or terms components, and may be
more efficient in some cases than using the xref:standard-query-parser.adoc[]
to generate a boolean query since the default implementation `method` avoids
scoring.
This query parser takes the following parameters:
diff --git
a/solr/solr-ref-guide/modules/query-guide/pages/pagination-of-results.adoc
b/solr/solr-ref-guide/modules/query-guide/pages/pagination-of-results.adoc
index da9f34af047..d575bd6d412 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/pagination-of-results.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/pagination-of-results.adoc
@@ -26,7 +26,7 @@ In Solr, this basic paginated searching is supported using
the `start` and `rows
=== Basic Pagination Examples
-The easiest way to think about simple pagination, is to simply multiply the
page number you want (treating the "first" page number as "0") by the number of
rows per page; such as in the following pseudo-code:
+The easiest way to think about simple pagination is to simply multiply the
page number you want (treating the "first" page number as "0") by the number of
rows per page; such as in the following pseudocode:
[source,plain]
----
@@ -87,7 +87,7 @@ For a ten shard index, ten million entries must be retrieved
and sorted to figur
As an alternative to increasing the "start" parameter to request subsequent
pages of sorted results, Solr supports using a "Cursor" to scan through results.
Cursors in Solr are a logical concept that doesn't involve caching any state
information on the server.
-Instead the sort values of the last document returned to the client are used
to compute a "mark" representing a logical point in the ordered space of sort
values.
+Instead, the sort values of the last document returned to the client are used
to compute a "mark" representing a logical point in the ordered space of sort
values.
That "mark" can be specified in the parameters of subsequent requests to tell
Solr where to continue.
=== Using Cursors
@@ -127,7 +127,7 @@ Requiring that the uniqueKey field be used as a clause in
the sort criteria guar
==== Fetch All Docs
-The pseudo-code shown here shows the basic logic involved in fetching all
documents matching a query using a cursor:
+The pseudocode shown here shows the basic logic involved in fetching all
documents matching a query using a cursor:
[source,plain]
----
@@ -145,7 +145,7 @@ while (not $done) {
}
----
-Using SolrJ, this pseudo-code would be:
+Using SolrJ, this pseudocode would be:
[source,java]
----
@@ -254,9 +254,9 @@ In a nutshell: When fetching all results matching a query
using `cursorMark`, th
[TIP]
====
-One way to ensure that a document will never be returned more then once, is to
use the uniqueKey field as the primary (and therefore: only significant) sort
criterion.
+One way to ensure that a document will never be returned more than once is to
use the uniqueKey field as the primary (and therefore: only significant) sort
criterion.
-In this situation, you will be guaranteed that each document is only returned
once, no matter how it may be be modified during the use of the cursor.
+In this situation, you will be guaranteed that each document is only returned
once, no matter how it may be modified during the use of the cursor.
====
=== "Tailing" a Cursor
@@ -270,7 +270,7 @@ Client applications can continuously poll a cursor using a
`sort=timestamp asc,
Another common example is when you have uniqueKey values that always increase
as new documents are created, and you can continuously poll a cursor using
`sort=id asc` to be notified about new documents.
-The pseudo-code for tailing a cursor is only a slight modification from our
early example for processing all docs matching a query:
+The pseudocode for tailing a cursor is only a slight modification from our
early example for processing all docs matching a query:
[source,plain]
----
diff --git
a/solr/solr-ref-guide/modules/query-guide/pages/probability-distributions.adoc
b/solr/solr-ref-guide/modules/query-guide/pages/probability-distributions.adoc
index b47e37ead03..9ed35fc5329 100644
---
a/solr/solr-ref-guide/modules/query-guide/pages/probability-distributions.adoc
+++
b/solr/solr-ref-guide/modules/query-guide/pages/probability-distributions.adoc
@@ -114,7 +114,7 @@ image::math-expressions/poisson.png[]
==== binomialDistribution
-The visualization below shows a binomial distribution with a 100 trials and
.15 probability of success.
+The visualization below shows a binomial distribution with 100 trials and .15
probability of success.
image::math-expressions/binomial.png[]
@@ -326,7 +326,7 @@ The covariance matrix describes the covariance between
`filesize_d` and `respons
The `multivariateNormalDistribution` function is then called with the array of
means for the two fields and the covariance matrix.
The model for the multivariate normal distribution is assigned to variable `g`.
-Finally five samples are drawn from the multivariate normal distribution.
+Finally, five samples are drawn from the multivariate normal distribution.
[source,text]
----
diff --git
a/solr/solr-ref-guide/modules/query-guide/pages/query-re-ranking.adoc
b/solr/solr-ref-guide/modules/query-guide/pages/query-re-ranking.adoc
index d1eb28e302e..0cf15be3ff6 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/query-re-ranking.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/query-re-ranking.adoc
@@ -19,7 +19,7 @@
Query Re-Ranking allows you to run a simple query (A) for matching documents
and then re-rank the top N documents using the scores from a more complex query
(B).
Since the more costly ranking from query B is only applied to the top _N_
documents, it will have less impact on performance than just using the complex
query B by itself.
-The trade off is that documents which score very low using the simple query A
may not be considered during the re-ranking phase, even if they would score
very highly using query B.
+The trade-off is that documents which score very low using the simple query A
may not be considered during the re-ranking phase, even if they would score
very highly using query B.
== Specifying a Ranking Query
@@ -39,7 +39,7 @@ You can also configure a custom
{solr-javadocs}/core/org/apache/solr/search/QPar
=== ReRank Query Parser
-The `rerank` parser wraps a query specified by an local parameter, along with
additional parameters indicating how many documents should be re-ranked, and
how the final scores should be computed:
+The `rerank` parser wraps a query specified by a local parameter, along with
additional parameters indicating how many documents should be re-ranked, and
how the final scores should be computed:
`reRankQuery`::
+
@@ -96,7 +96,7 @@ min and max are positive integers. Example
`reRankMainScale=0-1` rescales the ma
|Optional |Default: `add`
|===
+
-By default the score from the reRankQuery multiplied by the `reRankWeight` is
added to the original score.
+By default, the score from the reRankQuery multiplied by the `reRankWeight` is
added to the original score.
In the example below using the default `add` behaviour, the top 1000 documents
matching the query "greetings" will be re-ranked using the query "(hi hello hey
hiya)".
The resulting scores for each of those 1000 documents will be 3 times their
score from the "(hi hello hey hiya)", plus the score from the original
"greetings" query:
diff --git a/solr/solr-ref-guide/modules/query-guide/pages/regression.adoc
b/solr/solr-ref-guide/modules/query-guide/pages/regression.adoc
index 36cd11f2658..ad98774bc7e 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/regression.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/regression.adoc
@@ -165,7 +165,7 @@ image::math-expressions/linear.png[]
=== Residuals
The difference between the observed value and the predicted value is known as
the residual.
-There isn't a specific function to calculate the residuals but vector math can
used to perform the calculation.
+There isn't a specific function to calculate the residuals but vector math can
be used to perform the calculation.
In the example below the predictions are stored in variable `p`.
The `ebeSubtract` function is then used to subtract the predictions from the
actual `response_d` values stored in variable `y`.
diff --git
a/solr/solr-ref-guide/modules/query-guide/pages/response-writers.adoc
b/solr/solr-ref-guide/modules/query-guide/pages/response-writers.adoc
index 6fc40c65ef7..229a3a85014 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/response-writers.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/response-writers.adoc
@@ -39,7 +39,7 @@ The list below describe shows the most common settings for
the `wt` parameter, w
== JSON Response Writer
-The default Solr Response Writer is the `JsonResponseWriter`, which formats
output in JavaScript Object Notation (JSON), a lightweight data interchange
format specified in specified in RFC 4627.
+The default Solr Response Writer is the `JsonResponseWriter`, which formats
output in JavaScript Object Notation (JSON), a lightweight data interchange
format specified in RFC 4627.
The default response writer is used when:
* the `wt` parameter is not specified in the request, or
@@ -88,7 +88,7 @@ The default mime type for the JSON writer is
`application/json`, however this ca
</queryResponseWriter>
----
-WARNING: If you using the JSON formatted response with JSONP to query across
boundaries, having Solr respond with `text/plain` mime type when the
+WARNING: If you are using the JSON formatted response with JSONP to query
across boundaries, having Solr respond with `text/plain` mime type when the
browser expects `application/json` will trigger the browser to block the
request.
=== JSON-Specific Parameters
@@ -245,7 +245,7 @@ For production you probably want a much higher value.
<language>en-us</language>
<docs>http://localhost:8983/solr</docs>
<item>
- <title>iPod & iPod Mini USB 2.0 Cable</title>
+ <title>iPod & iPod Mini USB 2.0 Cable</title>
<link>
http://localhost:8983/solr/select?q=id:IW-02
</link>
diff --git a/solr/solr-ref-guide/modules/query-guide/pages/search-sample.adoc
b/solr/solr-ref-guide/modules/query-guide/pages/search-sample.adoc
index 9dbdade8fda..dddf2e729c0 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/search-sample.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/search-sample.adoc
@@ -161,9 +161,9 @@ The `facet` function supports any combination of the
following aggregate functio
=== facet2D
-The `facet2D` function performs two dimensional aggregations that can be
visualized as heat maps or pivoted into matrices and operated on by machine
learning functions.
+The `facet2D` function performs two-dimensional aggregations that can be
visualized as heat maps or pivoted into matrices and operated on by machine
learning functions.
-`facet2D` has different syntax and behavior then a two dimensional `facet`
function which does not control the number of unique facets of each dimension.
+`facet2D` has different syntax and behavior than a two-dimensional `facet`
function which does not control the number of unique facets of each dimension.
The `facet2D` function has the `dimensions` parameter which controls the
number of unique facets for the *x* and *y* dimensions.
The example below visualizes the output of the `facet2D` function.
@@ -201,7 +201,7 @@ The `significantTerms` function can often provide insights
that cannot be gleane
The example below illustrates the difference between the `facet` function and
the `significantTerms` function.
In the first example the `facet` function aggregates the top 5 complaint types
in Brooklyn.
-This returns the five most common complaint types in Brooklyn, but it's not
clear that these terms appear more frequently in Brooklyn then then the other
boroughs.
+This returns the five most common complaint types in Brooklyn, but it's not
clear that these terms appear more frequently in Brooklyn than the other
boroughs.
image::math-expressions/significantTermsCompare.png[]
@@ -242,11 +242,11 @@ The `gather` parameter tells the nodes expression to
gather the `ticker_s` symbo
The `count(*)` parameter counts the occurrences of the tickers.
This will count the number of times each ticker appears in the breadth first
search.
-Finally the `top` function selects the top 5 tickers by count and returns them.
+Finally, the `top` function selects the top 5 tickers by count and returns
them.
The result below shows the ticker symbols in the `nodes` field and the counts
for each node.
Notice *jpm* is first, which shows how many days *jpm* had a change greater
than .25 in this time period.
-The next set of ticker symbols (*mtb*, *slvb*, *gs* and *pnc*) are the symbols
with highest number of days with a change greater then .25 on the same days
that *jpm* had a change greater then .25.
+The next set of ticker symbols (*mtb*, *slvb*, *gs* and *pnc*) are the symbols
with the highest number of days with a change greater than .25 on the same days
that *jpm* had a change greater than .25.
image::math-expressions/nodestab.png[]
diff --git
a/solr/solr-ref-guide/modules/query-guide/pages/searching-nested-documents.adoc
b/solr/solr-ref-guide/modules/query-guide/pages/searching-nested-documents.adoc
index 2a92df2d37e..4ccf81f9dd5 100644
---
a/solr/solr-ref-guide/modules/query-guide/pages/searching-nested-documents.adoc
+++
b/solr/solr-ref-guide/modules/query-guide/pages/searching-nested-documents.adoc
@@ -37,7 +37,7 @@
include::indexing-guide:page$indexing-nested-documents.adoc[tag=sample-indexing-
By default, documents that match a query do not include any of their nested
children in the response.
The `[child]` Doc Transformer can be used to enrich query results with the
documents' descendants.
-For a detailed explanation of this transformer, and specifics on it's syntax &
limitations, please refer to the section
xref:document-transformers.adoc#child-childdoctransformerfactory[[child] -
ChildDocTransformerFactory].
+For a detailed explanation of this transformer, and specifics on its syntax &
limitations, please refer to the section
xref:document-transformers.adoc#child-childdoctransformerfactory[[child] -
ChildDocTransformerFactory].
A simple query matching all documents with a description that includes
"staplers":
@@ -174,7 +174,7 @@ Note that in the above example, the `/` characters in the
`\_nest_path_` were "d
* One level of `\` escaping is necessary to prevent the `/` from being
interpreted as a
{lucene-javadocs}/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Regexp_Searches[Regex
Query]
* An additional level of "escaping the escape character" is necessary because
the `of` local parameter is a quoted string; so we need a second `\` to ensure
the first `\` is preserved and passed as is to the query parser.
-(You can see that only a single level of of `\` escaping is needed in the body
of the query string -- to prevent the Regex syntax -- because it's not a
quoted string local param).
+(You can see that only a single level of `\` escaping is needed in the body of
the query string -- to prevent the Regex syntax -- because it's not a quoted
string local param).
You may find it more convenient to use
xref:local-params.adoc#parameter-dereferencing[parameter references] in
conjunction with xref:other-parsers.adoc[other parsers] that do not treat `/`
as a special character to express the same query in a more verbose form:
diff --git a/solr/solr-ref-guide/modules/query-guide/pages/simulations.adoc
b/solr/solr-ref-guide/modules/query-guide/pages/simulations.adoc
index ca5915f0af7..297b70bb99a 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/simulations.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/simulations.adoc
@@ -118,7 +118,7 @@ The distribution visualizes the probability of the
different total returns from
image::math-expressions/randomwalk5.png[]
-The `probability` and `cumulativeProbability` functions can then used to learn
more about the `empiricalDistribution`.
+The `probability` and `cumulativeProbability` functions can then be used to
learn more about the `empiricalDistribution`.
For example, the `probability` function can be used to calculate the
probability of a non-negative return from 100 days of stock returns.
The example below uses the `probability` function to return the probability of
a return between the range of 0 and 40 from the `empiricalDistribution` of the
simulation.
@@ -166,7 +166,7 @@ image::math-expressions/corrsim2.png[]
=== Covariance Matrix
-A covariance matrix is actually whats needed by the
`multiVariateNormalDistribution` as it contains both the variance of the two
stock return vectors and the covariance between the two vectors.
+A covariance matrix is actually what's needed by the
`multiVariateNormalDistribution` as it contains both the variance of the two
stock return vectors and the covariance between the two vectors.
The `cov` function will compute the covariance matrix for the columns of a
matrix.
The example below demonstrates how to compute the covariance matrix by adding
the `all` and `cvx` vectors as rows to a matrix.
diff --git a/solr/solr-ref-guide/modules/query-guide/pages/spatial-search.adoc
b/solr/solr-ref-guide/modules/query-guide/pages/spatial-search.adoc
index 37aa5802bd2..dadaeef4cb1 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/spatial-search.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/spatial-search.adoc
@@ -63,7 +63,7 @@ Use `x y` (a space) if RPT.
For PointType however, use `x,y` (a comma).
If you'd rather use a standard industry format, Solr supports
https://en.wikipedia.org/wiki/Well-known_text[WKT] and
http://geojson.org/[GeoJSON].
-However it's much bulkier than the raw coordinates for such simple data.
+However, it's much bulkier than the raw coordinates for such simple data.
(Not supported by PointType)
=== Indexing GeoJSON and WKT
@@ -709,9 +709,9 @@ Some of the attributes are in common with the RPT field
like geo, units, worldBo
To index a box, add a field value to a bbox field that's a string in the
WKT/CQL ENVELOPE syntax.
Example: `ENVELOPE(-10, 20, 15, 10)` which is minX, maxX, maxY, minY order.
The parameter ordering is unintuitive but that's what the spec calls for.
-Alternatively, you could provide a rectangular polygon in WKT (or GeoJSON if
you set set `format="GeoJSON"`).
+Alternatively, you could provide a rectangular polygon in WKT (or GeoJSON if
you set `format="GeoJSON"`).
-To search, you can use the `{!bbox}` query parser, or the range syntax e.g.,
`[10,-10 TO 15,20]`, or the ENVELOPE syntax wrapped in parenthesis with a
leading search predicate.
+To search, you can use the `{!bbox}` query parser, or the range syntax e.g.,
`[10,-10 TO 15,20]`, or the ENVELOPE syntax wrapped in parentheses with a
leading search predicate.
The latter is the only way to choose a predicate other than Intersects.
For example:
diff --git a/solr/solr-ref-guide/modules/query-guide/pages/spell-checking.adoc
b/solr/solr-ref-guide/modules/query-guide/pages/spell-checking.adoc
index 0adfd4c10fc..2c30bd4fb6b 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/spell-checking.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/spell-checking.adoc
@@ -401,7 +401,7 @@ This is ignored if `spellcheck.collate` is false.
+
This parameter specifies the maximum number of documents that should be
collected when testing potential collations against the index.
A value of `0` indicates that all documents should be collected, resulting in
exact hit-counts.
-Otherwise an estimation is provided as a performance optimization in cases
where exact hit-counts are unnecessary – the higher the value specified, the
more precise the estimation.
+Otherwise, an estimation is provided as a performance optimization in cases
where exact hit-counts are unnecessary – the higher the value specified, the
more precise the estimation.
+
When `spellcheck.collateExtendedResults` is `false`, the optimization is
always used as if `1` had been specified.
diff --git a/solr/solr-ref-guide/modules/query-guide/pages/sql-query.adoc
b/solr/solr-ref-guide/modules/query-guide/pages/sql-query.adoc
index c956c589abb..f6f768fc465 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/sql-query.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/sql-query.adoc
@@ -39,10 +39,10 @@ More information about how to Solr supports SQL queries for
Solr is described in
=== Solr Collections and DB Tables
In a standard `SELECT` statement such as `SELECT <expressions> FROM <table>`,
the table names correspond to Solr collection names.
-Table names are case insensitive.
+Table names are case-insensitive.
Column names in the SQL query map directly to fields in the Solr index for the
collection being queried.
-These identifiers are case sensitive.
+These identifiers are case-sensitive.
Aliases are supported, and can be referenced in the `ORDER BY` clause.
The `SELECT *` syntax to indicate all fields is only supported for queries
with a `LIMIT` clause.
@@ -65,7 +65,7 @@ Solr supports a broad range of SQL syntax.
.SQL Parser is Case Insensitive
[IMPORTANT]
====
-The SQL parser being used by Solr to translate the SQL statements is case
insensitive.
+The SQL parser being used by Solr to translate the SQL statements is
case-insensitive.
However, for ease of reading, all examples on this page use capitalized
keywords.
====
@@ -199,7 +199,7 @@ If the `ORDER BY` clause contains the exact fields in the
`GROUP BY` clause, the
If the `ORDER BY` clause contains different fields than the `GROUP BY` clause,
a limit of 100 is automatically applied.
To increase this limit you must specify a value in the `LIMIT` clause.
-Order by fields are case sensitive.
+Order by fields are case-sensitive.
==== OFFSET with FETCH
@@ -295,7 +295,7 @@ The Column Identifiers can contain both fields in the Solr
index and aggregate f
The supported aggregate functions are:
* `COUNT(*)`: Counts the number of records over a set of buckets.
-* `SUM(field)`: Sums a numeric field over over a set of buckets.
+* `SUM(field)`: Sums a numeric field over a set of buckets.
* `AVG(field)`: Averages a numeric field over a set of buckets.
* `MIN(field)`: Returns the min value of a numeric field over a set of buckets.
* `MAX(field)`: Returns the max value of a numeric field over a set of buckets.
@@ -352,7 +352,7 @@ The request handlers used for the SQL interface are
configured to load implicitl
The `/sql` handler is the front end of the Parallel SQL interface.
All SQL queries are sent to the `/sql` handler to be processed.
The handler also coordinates the distributed MapReduce jobs when running
`GROUP BY` and `SELECT DISTINCT` queries in `map_reduce` mode.
-By default the `/sql` handler will choose worker nodes from its own collection
to handle the distributed operations.
+By default, the `/sql` handler will choose worker nodes from its own
collection to handle the distributed operations.
In this default scenario the collection where the `/sql` handler resides acts
as the default worker collection for MapReduce queries.
By default, the `/sql` request handler is configured as an implicit handler,
meaning that it is always enabled in every Solr installation and no further
configuration is required.
@@ -388,7 +388,7 @@ Like the `/sql` request handler, the `/stream` and
`/export` request handlers ar
In some cases, fields used in SQL queries must be configured as DocValue
fields.
If queries are unlimited, all fields must be DocValue fields.
-If queries are limited (with the `limit` clause) then fields do not have to be
have DocValues enabled.
+If queries are limited (with the `limit` clause) then fields do not have to
have DocValues enabled.
.Multi-valued Fields
[IMPORTANT]
@@ -483,7 +483,7 @@ One of the parameters of the request is the
`aggregationMode`, which defines if
=== Parallelized Queries
The Parallel SQL architecture consists of three logical tiers: a *SQL* tier, a
*Worker* tier, and a *Data Table* tier.
-By default the SQL and Worker tiers are collapsed into the same physical
SolrCloud collection.
+By default, the SQL and Worker tiers are collapsed into the same physical
SolrCloud collection.
==== SQL Tier
diff --git
a/solr/solr-ref-guide/modules/query-guide/pages/standard-query-parser.adoc
b/solr/solr-ref-guide/modules/query-guide/pages/standard-query-parser.adoc
index 06dd8112a11..46bb8a6c407 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/standard-query-parser.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/standard-query-parser.adoc
@@ -211,7 +211,7 @@ This is a
<<differences-between-lucenes-classic-query-parser-and-solrs-standard-
.Matching `NaN` values with wildcards
====
For most fields, unbounded range queries, `field:[* TO *]`, are equivalent to
existence queries, `field: *` .
-However for float/double types that support `NaN` values, these two queries
perform differently.
+However, for float/double types that support `NaN` values, these two queries
perform differently.
* `field:*` matches all existing values, including `NaN`
* `field:[* TO *]` matches all real values, excluding `NaN`
diff --git a/solr/solr-ref-guide/modules/query-guide/pages/statistics.adoc
b/solr/solr-ref-guide/modules/query-guide/pages/statistics.adoc
index 2427c90e292..c2cbb9f11b0 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/statistics.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/statistics.adoc
@@ -166,7 +166,7 @@ The list of tuples with the counts is then stored in
variable *a*.
Then an `array` of labels is created and set to variable *l*.
-Finally the `zplot` function is used to plot the labels vector and the
`count(*)` column.
+Finally, the `zplot` function is used to plot the labels vector and the
`count(*)` column.
Notice the `col` function is used inside of the `zplot` function to extract
the counts from the `stats` results.
image::math-expressions/custom-hist.png[]
@@ -176,7 +176,7 @@ image::math-expressions/custom-hist.png[]
The `freqTable` function returns a frequency distribution for a discrete data
set.
The `freqTable` function doesn't create bins like the histogram.
-Instead it counts the occurrence of each discrete data value and returns a
list of tuples with the frequency statistics for each value.
+Instead, it counts the occurrence of each discrete data value and returns a
list of tuples with the frequency statistics for each value.
Below is an example of a frequency table built from a result set of rounded
*differences* in daily opening stock prices for the stock ticker *amzn*.
@@ -193,7 +193,7 @@ This will provide an array of price differences for each
day which will show dai
Then the `round` function is used to round the price differences to the
nearest integer to create a vector of discrete values.
The `round` function in this example is effectively *binning* continuous data
at integer boundaries.
-Finally the `freqTable` function is run on the discrete values to calculate
the frequency table.
+Finally, the `freqTable` function is run on the discrete values to calculate
the frequency table.
[source,text]
----
@@ -388,7 +388,7 @@ The `corr` function is then used correlate the *columns* of
the matrix.
This produces a correlation matrix that shows how complaint types are
correlated based on the zip codes they appear in.
Another way to look at this is it shows how the different complaint types tend
to co-occur across zip codes.
-Finally the `zplot` function is used to plot the correlation matrix as a heat
map.
+Finally, the `zplot` function is used to plot the correlation matrix as a heat
map.
image::math-expressions/corrmatrix.png[]
@@ -524,7 +524,7 @@ When this expression is sent to the `/stream` handler it
responds with:
== Transformations
-In statistical analysis its often useful to transform data sets before
performing statistical calculations.
+In statistical analysis it's often useful to transform data sets before
performing statistical calculations.
The statistical function library includes the following commonly used
transformations:
* `rank`: Returns a numeric array with the rank-transformed value of each
element of the original array.
diff --git a/solr/solr-ref-guide/modules/query-guide/pages/stats-component.adoc
b/solr/solr-ref-guide/modules/query-guide/pages/stats-component.adoc
index 32614751ca0..d388cf5b834 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/stats-component.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/stats-component.adoc
@@ -156,7 +156,7 @@ This statistic is computed for all field types but is not
computed by default.
`cardinality`::
A statistical approximation (currently using the
https://en.wikipedia.org/wiki/HyperLogLog[HyperLogLog] algorithm) of the number
of distinct values in the field/function in all of the documents in the set.
-This calculation is much more efficient then using the `countDistinct` option,
but may not be 100% accurate.
+This calculation is much more efficient than using the `countDistinct` option,
but may not be 100% accurate.
+
Input for this option can be a floating point number between `0.0` and `1.0`
indicating how aggressively the algorithm should try to be accurate: `0.0`
means use as little memory as possible; `1.0` means use as much memory as
needed to be as accurate as possible.
`true` is supported as an alias for `0.3`.
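The memory/accuracy tradeoff behind the `cardinality` option can be illustrated with a toy HyperLogLog. This is a sketch of the algorithm's idea only, not Solr's implementation; more registers (a larger `p`) cost more memory but tighten the estimate, which is what the `0.0`-`1.0` knob controls:

```python
import hashlib
import math

def hll_estimate(values, p=10):
    """Toy HyperLogLog: estimate the distinct count with 2**p registers.

    Each value is hashed; the low p bits pick a register, and the
    register remembers the longest run of leading zero bits seen.
    """
    m = 1 << p
    registers = [0] * m
    for v in values:
        h = int.from_bytes(hashlib.sha1(str(v).encode()).digest()[:8], "big")
        idx = h & (m - 1)                      # low p bits pick a register
        w = h >> p                             # remaining 64 - p bits
        rank = (64 - p) - w.bit_length() + 1   # position of the first 1 bit
        registers[idx] = max(registers[idx], rank)
    alpha = 0.7213 / (1 + 1.079 / m)
    raw = alpha * m * m / sum(2.0 ** -r for r in registers)
    zeros = registers.count(0)
    if raw <= 2.5 * m and zeros:               # small-range (linear counting) correction
        return m * math.log(m / zeros)
    return raw

approx = hll_estimate(range(10000))
```

With `p=10` (1024 registers) the estimate typically lands within a few percent of the true 10,000 distinct values, while an exact `countDistinct` would have to hold every value.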
diff --git
a/solr/solr-ref-guide/modules/query-guide/pages/stream-decorator-reference.adoc
b/solr/solr-ref-guide/modules/query-guide/pages/stream-decorator-reference.adoc
index 447bb1b8d56..1c2d8afccdf 100644
---
a/solr/solr-ref-guide/modules/query-guide/pages/stream-decorator-reference.adoc
+++
b/solr/solr-ref-guide/modules/query-guide/pages/stream-decorator-reference.adoc
@@ -392,7 +392,7 @@ It was designed specifically to work with models trained
using the xref:stream-s
The `classify` function uses the
xref:stream-source-reference.adoc#model[`model` function] to retrieve a stored
model and then scores a stream of tuples using the model.
The tuples read by the classifier must contain a text field that can be used
for classification.
The classify function uses a Lucene analyzer to extract the features from the
text so the model can be applied.
-By default the `classify` function looks for the analyzer using the name of
text field in the tuple.
+By default, the `classify` function looks for the analyzer using the name of
the text field in the tuple.
If the Solr schema on the worker node does not contain this field, the
analyzer can be looked up in another field by specifying the `analyzerField`
parameter.
Each tuple that is classified is assigned two scores:
@@ -400,7 +400,7 @@ Each tuple that is classified is assigned two scores:
* probability_d*: A float between 0 and 1 which describes the probability that
the tuple belongs to the class.
This is useful in the classification use case.
-* score_d*: The score of the document that has not be squashed between 0 and 1.
+* score_d*: The score of the document that has not been squashed between 0 and
1.
The score may be positive or negative.
The higher the score the better the document fits the class.
This un-squashed score will be useful in query re-ranking and recommendation
use cases.
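The relationship between the two scores can be sketched as follows. This assumes a logistic squash, which is the usual choice for logistic regression models like those `classify` consumes; the exact function Solr applies is not shown in this document:

```python
import math

def squash(raw_score):
    """Map an unbounded raw score (like score_d, any sign, bigger =
    better fit) into (0, 1) (like probability_d). Assumes a logistic
    squash; illustrative only."""
    return 1.0 / (1.0 + math.exp(-raw_score))
```

The squashed value saturates near 0 and 1, which is why the un-squashed score is the better signal for re-ranking: two strongly matching documents keep distinguishable raw scores even when both probabilities round to ~1.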
@@ -410,7 +410,7 @@ This score is particularly useful when multiple high
ranking documents have a pr
* `model expression`: (Mandatory) Retrieves the stored logistic regression
model.
* `field`: (Mandatory) The field in the tuples to apply the classifier to.
-By default the analyzer for this field in the schema will be used extract the
features.
+By default, the analyzer for this field in the schema will be used to extract
the features.
* `analyzerField`: (Optional) Specifies a different field to find the analyzer
from in the schema.
=== classify Syntax
@@ -526,7 +526,7 @@ daemon(id="uniqueId",
----
The sample code above shows a `daemon` function wrapping an `update` function,
which is wrapping a `topic` function.
-When this expression is sent to the `/stream` handler, the `/stream` hander
sees the `daemon` function and keeps it in memory where it will run at
intervals.
+When this expression is sent to the `/stream` handler, the `/stream` handler
sees the `daemon` function and keeps it in memory where it will run at
intervals.
In this particular example, the `daemon` function will run the `update`
function every second.
The `update` function is wrapping a
xref:stream-source-reference.adoc#topic[`topic` function], which will stream
tuples that match the `topic` function query in batches.
Each subsequent call to the topic will return the next batch of tuples for the
topic.
@@ -596,7 +596,7 @@ DaemonStream daemonStream = new DaemonStream(topicStream, // The und
                                              "daemonId", // The id of the daemon
                                              1000,       // The interval at which to run the internal stream
                                              500);       // The internal queue size for the daemon stream. Tuples will be placed in the queue
-                                                         // as they are read by the internal internal thread.
+                                                         // as they are read by the internal thread.
// Calling read() on the daemon stream reads records from the internal queue.
daemonStream.setStreamContext(context);
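The daemon mechanics described above (an internal thread running the wrapped stream at an interval, feeding a bounded queue that `read()` drains) can be sketched in Python. This is a minimal analogue of the semantics, not Solr's DaemonStream code:

```python
import queue
import threading

class Daemon:
    """Sketch of DaemonStream semantics: an internal thread runs a
    stream function at a fixed interval and places each tuple on a
    bounded queue; read() pulls tuples off that queue."""

    def __init__(self, stream_fn, interval_s, queue_size):
        self.stream_fn = stream_fn
        self.interval_s = interval_s
        self.queue = queue.Queue(maxsize=queue_size)
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        while not self._stop.is_set():
            for tup in self.stream_fn():           # run the wrapped stream
                while not self._stop.is_set():
                    try:                           # bounded queue: block briefly,
                        self.queue.put(tup, timeout=0.1)
                        break                      # ...but stay responsive to close()
                    except queue.Full:
                        pass
            self._stop.wait(self.interval_s)       # sleep until the next run

    def open(self):
        self._thread.start()

    def read(self, timeout=5):
        return self.queue.get(timeout=timeout)

    def close(self):
        self._stop.set()
        self._thread.join()
```

Usage: `d = Daemon(lambda: [{"id": 1}], 1.0, 500); d.open(); tup = d.read(); d.close()` — the caller never runs the wrapped stream directly, only consumes the queue.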
@@ -636,7 +636,7 @@ This is similar to the `<<update,update()>>` function
described below.
=== delete Parameters
-* `destinationCollection`: (Mandatory) The collection where the tuples will
deleted.
+* `destinationCollection`: (Mandatory) The collection where the tuples will be
deleted.
* `batchSize`: (Optional, defaults to `250`) The delete batch size.
* `pruneVersionField`: (Optional, defaults to `false`) Whether to prune
`\_version_` values from tuples
* `StreamExpression`: (Mandatory)
@@ -727,7 +727,7 @@ The `executor` function has an internal thread pool that
runs tasks that compile
This function can also be parallelized across worker nodes by wrapping it in
the <<parallel,`parallel`>> function to provide parallel execution of
expressions across a cluster.
The `executor` function does not do anything specific with the output of the
expressions that it runs.
-Therefore the expressions that are executed must contain the logic for pushing
tuples to their destination.
+Therefore, the expressions that are executed must contain the logic for
pushing tuples to their destination.
The <<update,update function>> can be included in the expression being
executed to send the tuples to a SolrCloud collection for storage.
This model allows for asynchronous execution of jobs where the output is
stored in a SolrCloud collection where it can be accessed as the job progresses.
@@ -815,7 +815,7 @@ having(rollup(over=a_s,
----
-In this example, the `having` expression iterates the aggregated tuples from
the `rollup` expression and emits all tuples where the field `sum(a_i)` is
greater then 100 and less then 110.
+In this example, the `having` expression iterates the aggregated tuples from
the `rollup` expression and emits all tuples where the field `sum(a_i)` is
greater than 100 and less than 110.
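The `having` behavior in that example is a plain filter over the aggregated stream; a sketch in Python (hypothetical rollup output, not real data):

```python
def having(tuples, pred):
    """Stream-decorator style filter: emit only tuples matching pred."""
    for t in tuples:
        if pred(t):
            yield t

# Hypothetical output of a rollup over a_s with a sum(a_i) metric.
rolled = [{"a_s": "a", "sum(a_i)": 95},
          {"a_s": "b", "sum(a_i)": 104},
          {"a_s": "c", "sum(a_i)": 112}]
kept = list(having(rolled, lambda t: 100 < t["sum(a_i)"] < 110))
```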
== leftOuterJoin
@@ -1167,7 +1167,7 @@ The `parallel` function maintains the sort order of the
tuples returned by the w
For example if you sort on year, month and day you could partition on year
only as long as there are enough different years to spread the tuples around
the worker nodes.
Solr allows sorting on more than 4 fields, but you cannot specify more than 4
partitionKeys for speed considerations.
-Also it's overkill to specify many `partitionKeys` when we one or two keys
could be enough to spread the tuples.
+Also, it's overkill to specify many `partitionKeys` when one or two keys
could be enough to spread the tuples.
The parallel stream was designed for cases where the underlying search stream
emits a large number of tuples from the collection.
If the search stream emits only a small subset of the data from the
collection, using `parallel` could potentially be slower.
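Why a prefix of the sort fields (e.g. year only, for a year/month/day sort) is a sufficient partition key can be sketched as hash partitioning: tuples arrive sorted, each key routes consistently to one worker, so each worker's slice stays sorted. This is an illustration of the idea, not Solr's actual hash function:

```python
def partition(tuples, partition_keys, num_workers):
    """Route each tuple to a worker by hashing its partition-key
    values. A sorted input stream stays sorted within each shard,
    because every tuple sharing a key lands on the same worker."""
    shards = [[] for _ in range(num_workers)]
    for t in tuples:
        key = tuple(t[k] for k in partition_keys)
        shards[hash(key) % num_workers].append(t)
    return shards

# Tuples sorted by (year, month); partitioning on year alone is enough.
data = [{"year": y, "month": m} for y in (2020, 2021, 2022) for m in (1, 2)]
shards = partition(data, ["year"], 2)
```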
@@ -1299,7 +1299,7 @@ The group operation also serves as an example reduce
operation that can be refer
[IMPORTANT]
====
The reduce function relies on the sort order of the underlying stream.
-Accordingly the sort order of the underlying stream must be aligned with the
group by field.
+Accordingly, the sort order of the underlying stream must be aligned with the
group by field.
====
=== reduce Parameters
@@ -1490,7 +1490,7 @@ The `update` function wraps another functions and sends
the tuples to a SolrClou
=== update Parameters
-* `destinationCollection`: (Mandatory) The collection where the tuples will
indexed.
+* `destinationCollection`: (Mandatory) The collection where the tuples will be
indexed.
* `batchSize`: (Optional, defaults to `250`) The indexing batch size.
* `pruneVersionField`: (Optional, defaults to `true`) Whether to prune
`\_version_` values from tuples
* `StreamExpression`: (Mandatory)
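The `batchSize` behavior shared by `update` and `delete` — buffering the tuple stream into fixed-size groups before sending — can be sketched as:

```python
def batches(tuples, batch_size=250):
    """Group a tuple stream into lists of at most batch_size, the way
    update/delete send documents to the destination collection."""
    batch = []
    for t in tuples:
        batch.append(t)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:          # flush the final partial batch
        yield batch

grouped = list(batches(range(7), batch_size=3))
```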
diff --git
a/solr/solr-ref-guide/modules/query-guide/pages/stream-evaluator-reference.adoc
b/solr/solr-ref-guide/modules/query-guide/pages/stream-evaluator-reference.adoc
index ff4ef07db7b..61c17c96b31 100644
---
a/solr/solr-ref-guide/modules/query-guide/pages/stream-evaluator-reference.adoc
+++
b/solr/solr-ref-guide/modules/query-guide/pages/stream-evaluator-reference.adoc
@@ -18,9 +18,9 @@
// under the License.
-Stream evaluators are different then stream sources or stream decorators.
+Stream evaluators are different from stream sources or stream decorators.
Both stream sources and stream decorators return streams of tuples.
-Stream evaluators are more like a traditional function that evaluates its
parameters and returns an result.
+Stream evaluators are more like a traditional function that evaluates its
parameters and returns a result.
That result can be a single value, array, map or other structure.
Stream evaluators can be nested so that the output of an evaluator becomes the
input for another evaluator.
@@ -513,7 +513,7 @@ number | matrix: Either the covariance or covariance matrix.
== cumulativeProbability
The `cumulativeProbability` function returns the cumulative probability of a
random variable within a probability distribution.
-The cumulative probability is the total probability of all random variables
less then or equal to a random variable.
+The cumulative probability is the total probability of all random variables
less than or equal to a random variable.
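The definition can be checked with any standard distribution; a sketch using Python's `statistics.NormalDist` as a stand-in for a `normalDistribution` model:

```python
from statistics import NormalDist

# cumulativeProbability(d, x) = total probability of values <= x.
d = NormalDist(mu=0, sigma=1)
p_mean = d.cdf(0)      # half the mass lies at or below the mean
p_196 = d.cdf(1.96)    # the familiar ~97.5% point
```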
=== cumulativeProbability Parameters
@@ -814,7 +814,7 @@ eor(eq(fieldA,fieldB),eq(fieldC,fieldD)) // true iff either
fieldA == fieldB or
The `eq` function will return whether all the parameters are equal, as per
Java's standard `equals(...)` function.
The function accepts parameters of any type, but will fail to execute if all
the parameters are not of the same type.
That is, all are Boolean, all are String, or all are Numeric.
-If any any parameters are null and there is at least one parameter that is not
null then false will be returned.
+If any parameters are null and there is at least one parameter that is not
null then false will be returned.
Returns a boolean value.
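The null rule reads subtly; a sketch of the documented semantics (illustrative, not Solr's code):

```python
def eq(*params):
    """Sketch of eq: false if any parameter is null (None) while at
    least one is not; otherwise all parameters must share one type
    and compare equal."""
    non_null = [p for p in params if p is not None]
    if len(non_null) != len(params) and non_null:
        return False                       # mixed null / non-null -> false
    if not all(type(p) is type(params[0]) for p in params):
        raise TypeError("eq parameters must all share one type")
    return all(p == params[0] for p in params)
```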
=== eq Parameters
@@ -1085,7 +1085,7 @@ number: the sum of all the values in the matrix.
The `gt` function will return whether the first parameter is greater than the
second parameter.
The function accepts numeric or string parameters, but will fail to execute if
all the parameters are not of the same type.
That is, all are String or all are Numeric.
-If any any parameters are null then an error will be raised.
+If any parameters are null then an error will be raised.
Returns a boolean value.
=== gt Parameters
@@ -1110,7 +1110,7 @@ gt(add(fieldA,fieldB),6) // fieldA + fieldB > 6
The `gteq` function will return whether the first parameter is greater than or
equal to the second parameter.
The function accepts numeric and string parameters, but will fail to execute
if all the parameters are not of the same type.
That is, all are String or all are Numeric.
-If any any parameters are null then an error will be raised.
+If any parameters are null then an error will be raised.
Returns a boolean value.
=== gteq Parameters
@@ -1338,7 +1338,7 @@ kolmogorovSmirnov(normalDistribution(10, 2), sampleSet)
The `lt` function will return whether the first parameter is less than the
second parameter.
The function accepts numeric or string parameters, but will fail to execute if
all the parameters are not of the same type.
That is, all are String or all are Numeric.
-If any any parameters are null then an error will be raised.
+If any parameters are null then an error will be raised.
Returns a boolean value.
=== lt Parameters
@@ -1363,7 +1363,7 @@ lt(add(fieldA,fieldB),6) // fieldA + fieldB < 6
The `lteq` function will return whether the first parameter is less than or
equal to the second parameter.
The function accepts numeric and string parameters, but will fail to execute
if all the parameters are not of the same type.
That is, all are String or all are Numeric.
-If any any parameters are null then an error will be raised.
+If any parameters are null then an error will be raised.
Returns a boolean value.
=== lteq Parameters
diff --git
a/solr/solr-ref-guide/modules/query-guide/pages/stream-source-reference.adoc
b/solr/solr-ref-guide/modules/query-guide/pages/stream-source-reference.adoc
index 3808a812f4c..39352261a2f 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/stream-source-reference.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/stream-source-reference.adoc
@@ -22,7 +22,7 @@
The `search` function searches a SolrCloud collection and emits a stream of
tuples that match the query.
This is very similar to a standard Solr query, and uses many of the same
parameters.
-This expression allows you to specify a request hander using the `qt`
parameter.
+This expression allows you to specify a request handler using the `qt`
parameter.
By default, the `/select` handler is used.
The `/select` handler can be used for simple rapid prototyping of expressions.
For production, however, you will most likely want to use the `/export`
handler which is designed to `sort` and `export` entire result sets.
@@ -316,7 +316,7 @@ After the model is retrieved it can be used by the
xref:stream-decorator-referen
A single model tuple is fetched and returned based on the *id* parameter.
The model is retrieved by matching the *id* parameter with a model name in the
index.
-If more then one iteration of the named model is stored in the index, the
highest iteration is selected.
+If more than one iteration of the named model is stored in the index, the
highest iteration is selected.
=== Caching with model
@@ -399,12 +399,12 @@ The foreground and background counts are global for the
collection.
* `limit`: (Optional, Default 20) The max number of terms to return.
* `minDocFreq`: (Optional, Defaults to 5 documents) The minimum number of
documents the term must appear in on a shard.
This is a float value.
-If greater then 1.0 then it's considered the absolute number of documents.
-If less then 1.0 it's treated as a percentage of documents.
+If greater than 1.0 then it's considered the absolute number of documents.
+If less than 1.0 it's treated as a percentage of documents.
* `maxDocFreq`: (Optional, Defaults to 30% of documents) The maximum number of
documents the term can appear in on a shard.
This is a float value.
-If greater then 1.0 then it's considered the absolute number of documents.
-If less then 1.0 it's treated as a percentage of documents.
+If greater than 1.0 then it's considered the absolute number of documents.
+If less than 1.0 it's treated as a percentage of documents.
* `minTermLength`: (Optional, Default 4) The minimum length of the term to be
considered significant.
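The dual interpretation of the `minDocFreq`/`maxDocFreq` floats can be sketched as a small resolver (an illustration of the documented rule, not Solr's code; treating exactly `1.0` as a percentage is an assumption):

```python
def doc_freq_threshold(value, docs_on_shard):
    """Interpret a minDocFreq/maxDocFreq float: greater than 1.0 is an
    absolute document count; 1.0 or less is a fraction of the shard's
    documents."""
    if value > 1.0:
        return round(value)
    return round(value * docs_on_shard)
```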
=== significantTerms Syntax
@@ -420,7 +420,7 @@ significantTerms(collection1,
minTermLength="5")
----
-In the example above the `significantTerms` function is querying `collection1`
and returning at most 50 significant terms from the `authors` field that appear
in 10 or more documents but not more then 20% of the corpus.
+In the example above the `significantTerms` function is querying `collection1`
and returning at most 50 significant terms from the `authors` field that appear
in 10 or more documents but not more than 20% of the corpus.
== shortestPath
diff --git
a/solr/solr-ref-guide/modules/query-guide/pages/streaming-expressions.adoc
b/solr/solr-ref-guide/modules/query-guide/pages/streaming-expressions.adoc
index f61bca2477a..502e41fa7c5 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/streaming-expressions.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/streaming-expressions.adoc
@@ -26,7 +26,7 @@
Streaming expressions expose the capabilities of SolrCloud as composable
functions.
These functions provide a system for searching, transforming, analyzing, and
visualizing data stored in SolrCloud collections.
-At a high level there a four main capabilities that will be explored in the
documentation:
+At a high level there are four main capabilities that will be explored in the
documentation:
* *Searching*, sampling and aggregating results from Solr.
@@ -121,7 +121,7 @@ A full reference to all available decorator expressions is
available in xref:str
Math expressions are a vector and matrix math library that can be combined
with streaming expressions to perform analysis and build mathematical models
of the result sets.
From a language standpoint math expressions are a sub-language of streaming
expressions that don't return streams of tuples.
-Instead they operate on and return numbers, vectors, matrices and mathematical
models.
+Instead, they operate on and return numbers, vectors, matrices and
mathematical models.
The documentation will show how to combine streaming expressions and math
expressions.
diff --git a/solr/solr-ref-guide/modules/query-guide/pages/suggester.adoc
b/solr/solr-ref-guide/modules/query-guide/pages/suggester.adoc
index 3907ecfdabe..fa0e78d3859 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/suggester.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/suggester.adoc
@@ -181,7 +181,7 @@ Use `buildOnCommit` to rebuild the dictionary with every
soft commit, or `buildO
+
Some lookup implementations may take a long time to build, especially with
large indexes.
In such cases, using `buildOnCommit` or `buildOnOptimize`, particularly with a
high frequency of soft commits, is not recommended.
-Instead build the suggester at a lower frequency by manually issuing requests
with `suggest.build=true`.
+Instead, build the suggester at a lower frequency by manually issuing requests
with `suggest.build=true`.
`buildOnStartup`::
+
diff --git a/solr/solr-ref-guide/modules/query-guide/pages/tagger-handler.adoc
b/solr/solr-ref-guide/modules/query-guide/pages/tagger-handler.adoc
index 41ecaf0aae5..81b0f176779 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/tagger-handler.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/tagger-handler.adoc
@@ -161,7 +161,7 @@ Let this default to `false` unless you know that such
tokens can't be avoided.
+
A boolean flag that causes stopwords (or any condition causing positions to
skip like >255 char words) to be ignored as if they aren't there.
Otherwise, the behavior is to treat them as breaks in tagging on the
presumption your indexed text-analysis configuration doesn't have a
`StopWordFilter` defined.
-By default the indexed analysis chain is checked for the presence of a
`StopWordFilter` and if found then `ignoreStopWords` is true if unspecified.
+By default, the indexed analysis chain is checked for the presence of a
`StopWordFilter` and if found then `ignoreStopWords` is true if unspecified.
You probably shouldn't have a `StopWordFilter` configured and probably won't
need to set this parameter either.
`xmlOffsetAdjust`::
@@ -202,7 +202,7 @@ declare a field type, 2 fields, and a copy-field.
The critical part
up-front is to define the "tag" field type.
-There are many many ways to
+There are many ways to
configure text analysis; and we're not going to get into those choices
here.
But an important bit is the `ConcatenateGraphFilterFactory` at the
diff --git
a/solr/solr-ref-guide/modules/query-guide/pages/term-vector-component.adoc
b/solr/solr-ref-guide/modules/query-guide/pages/term-vector-component.adoc
index cd11b8617a0..ca92dcbb0d4 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/term-vector-component.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/term-vector-component.adoc
@@ -18,7 +18,7 @@
The TermVectorComponent is a search component designed to return additional
information about documents matching your search.
-For each document in the response, the TermVectorCcomponent can return the
term vector, the term frequency, inverse document frequency, position, and
offset information.
+For each document in the response, the TermVectorComponent can return the term
vector, the term frequency, inverse document frequency, position, and offset
information.
== Term Vector Component Configuration
diff --git a/solr/solr-ref-guide/modules/query-guide/pages/transform.adoc
b/solr/solr-ref-guide/modules/query-guide/pages/transform.adoc
index fa0dc933058..5cf5e5d58ed 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/transform.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/transform.adoc
@@ -65,7 +65,7 @@ The example below is using the `isNull` function inside of
`select` function to
The `if` function takes 3 parameters.
The first is a boolean expression, in this case `isNull`.
The `if` function returns the second parameter if the boolean function returns
true, and the third parameter if it returns false.
-In this case `isNull` is always true because its checking for a field in the
tuples that is not included in the result set.
+In this case `isNull` is always true because it's checking for a field in the
tuples that is not included in the result set.
image::math-expressions/select2.png[]
@@ -100,7 +100,7 @@ image::math-expressions/search-resort.png[]
== Rollups
The `rollup` and `hashRollup` functions can be used to perform aggregations
over result sets.
-This is different then the `facet`, `facet2D` and `timeseries` aggregation
functions which push the aggregations into the search engine using the JSON
facet API.
+This is different from the `facet`, `facet2D` and `timeseries` aggregation
functions which push the aggregations into the search engine using the JSON
facet API.
The `rollup` function performs map-reduce style rollups, which requires the
result stream be sorted by the grouping fields.
This allows for aggregations over very high cardinality fields.
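The sorted-input requirement is exactly the contract of a map-reduce style rollup: because each group arrives as one contiguous run, only one group's state is held at a time, which is what makes high-cardinality fields tractable. A sketch using `itertools.groupby`, which has the same requirement:

```python
from itertools import groupby

def rollup(sorted_tuples, over, metric):
    """Map-reduce style rollup over a stream already sorted by the
    grouping field. Like Solr's rollup, it only works because each
    group arrives as one contiguous run."""
    for key, group in groupby(sorted_tuples, key=lambda t: t[over]):
        yield {over: key, f"sum({metric})": sum(t[metric] for t in group)}

stream = [{"a_s": "a", "a_i": 1}, {"a_s": "a", "a_i": 2},
          {"a_s": "b", "a_i": 5}]          # sorted by a_s
result = list(rollup(stream, "a_s", "a_i"))
```

If the input were not sorted by `a_s`, the same key would produce multiple separate groups — the same failure mode the `rollup` decorator would exhibit over an unsorted stream.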
diff --git a/solr/solr-ref-guide/modules/query-guide/pages/variables.adoc
b/solr/solr-ref-guide/modules/query-guide/pages/variables.adoc
index 2df2df6acd4..c8b0c1d3ecb 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/variables.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/variables.adoc
@@ -185,7 +185,7 @@ When this expression is sent to the `/stream` handler it
responds with:
}
----
-Using this approach variables can by visualized using Zeppelin-Solr.
+Using this approach variables can be visualized using Zeppelin-Solr.
In the example below the arrays are shown in table format.
image::math-expressions/variables.png[]