janhoy commented on code in PR #851:
URL: https://github.com/apache/solr/pull/851#discussion_r870215743
##########
solr/solr-ref-guide/modules/upgrade-notes/pages/major-changes-in-solr-9.adoc:
##########
@@ -61,333 +61,126 @@ A rolling upgrade from Solr 8 to Solr 9 requires the following multiple restart
It is always strongly recommended that you fully reindex your documents after
a major version upgrade. For details, see the
xref:indexing-guide:reindexing.adoc[] section, which covers several strategies
for how to reindex.
-In Solr 8, it's possible to add docValues to a schema without re-indexing via
`UninvertDocValuesMergePolicy`, an advanced/expert utility.
-Due to changes in Lucene 9, that isn't possible anymore; the component was
removed.
-
-== Solr 9.0 Raw Notes (NOT YET EDITED)
-
-_(raw; not yet edited)_
-
-
-
-
-
-* SOLR-13854, SOLR-13858: SolrMetricProducer / SolrInfoBean APIs have changed
and third-party components that implement these APIs need to be updated.
-
-* SOLR-14344: Remove Deprecated HttpSolrClient.RemoteSolrException and
HttpSolrClient.RemoteExcecutionException.
-All the usages are replaced by BaseHttpSolrClient.RemoteSolrException and
BaseHttpSolrClient.RemoteExcecutionException.
-
-* SOLR-15409: Zookeeper client libraries upgraded to 3.7.0, which may not be
compatible with your existing server installations
-
-* SOLR-15809: Get rid of blacklist/whitelist terminology. JWTAuthPlugin
parameter `algWhitelist` is now `algAllowlist`. The old parameter will still
- work in 9.x. Environment variables `SOLR_IP_WHITELIST` and
`SOLR_IP_BLACKLIST` are no longer supported, but replaced with
`SOLR_IP_ALLOWLIST` and `SOLR_IP_DENYLIST`.
-
-* SOLR-11623: Every request handler in Solr now implements
PermissionNameProvider. Any custom or 3rd party request handler must also do
this
-
-* SOLR-14142: Jetty low level request-logging in NCSA format is now enabled by
default, with a retention of 3 days worth of logs.
- This may require some more disk space for logs than was the case in 8.x. See
Reference Guide chapter "Configuring Logging" for how to change this.
-
-* SOLR-15944: The Tagger's JSON response format now always uses an object/map
to represent each tag instead of an array.
-
-* SOLR-15842: Async responses for backups now correctly aggregate and return
information.
-In previous versions there was a field returned in async backup status
responses, `Response`. This has now been renamed to `msg`, to better fit other
collections API responses.
-The `response` field is now a map, containing information about the backup
(`startTime`, `indexSizeMB`, `indexFileCount`, etc.).
-
-* SOLR-15982: For collection's snapshot backup request responses additional
fields `indexVersion`, `indexFileCount`, etc. were added similar to incremental
backup request responses.
-Also, both snapshot and incremental backup request responses will now contain
`starTime` and `endTime`.
-Snapshot backup shard's response were updated to add fields `indexFileCount`
and `endTime`, snapshot delete shard's response were updated to add fields
`startTime` and `endTime`.
-Previous fields `fileCount`, `snapshotCompletedAt` and `snapshotDeletedAt` of
backup and delete shard's responses are now deprecated and will be removed in
future releases.
-All date/time fields of backup and delete related shard's responses have been
updated to use `Instance` instead of `Date`, meaning the output will be in the
standard ISO 8601 Format.
-
-* SOLR-15884: In Backup request responses, the `response` key now uses a map
to return information instead of a list.
-This is only applicable for users returning information in JSON format, which
is the default behavior.
-
-* SOLR-14660: HDFS storage support has been moved to a module. Existing Solr
configurations do not need any HDFS-related
-changes, however the module needs to be installed - see the section
xref:deployment-guide:solr-on-hdfs.adoc[].
-
-* SOLR-16040: If you are using the HDFS backup repository, you need to change
the repository class to
`org.apache.solr.hdfs.backup.repository.HdfsBackupRepository` - see the
xref:deployment-guide:backup-restore.adoc#hdfsbackuprepository[HDFS Backup
Repository] section.
-
-* SOLR-13989: Hadoop authentication support has been moved to the hadoop-auth
module. Existing Solr configurations do not need any Hadoop authentication
related
-changes, however the module needs to be installed - see the section
xref:deployment-guide:hadoop-authentication-plugin.adoc[].
-
-* SOLR-15904: SQL support has been moved to the sql module. Existing Solr
configurations do not need any SQL related
-changes, however the module needs to be installed - see the section
xref:query-guide:sql-query.adoc[].
-
-* SOLR-15950: The folder $SOLR_HOME/userfiles, used by the "cat" streaming
expression, is no longer created automatically on startup. The user must create
this folder.
-
-* SOLR-15097: JWTAuthPlugin has been moved to a module. Users need to add the
module to classpath. The plugin has also
- changed package name to `org.apache.solr.security.jwt`, but can still be
loaded as shortform `class="solr.JWTAuthPlugin"`.
-
-* SOLR-14401: Metrics: Only SearchHandler and subclasses have "local" metrics
now.
-It's now tracked as if it's another handler with a "[shard]" suffix, e.g.
"/select[shard]".
-There are no longer ".distrib." named metrics; all metrics are assumed to be
such except
-"[shard]". The default Prometheus exporter config splits that component to a
new label
-named "internal". The sample Grafana dashboard now filters to include or
exclude this.
-
-== New Features & Enhancements
-
-// Fill these sub headings
-
-=== Docker
-
-=== Security
-
-=== Scalability
-
-=== Modules
-
-* HDFS
-* Hadoop-auth
-* GCS-repository
-* JWT-auth
-* Scripting
-* SQL
-
-
-=== Gradle build
-
-=== Other
-
-* SOLR-13671: Allow 'var' keyword in Java sources
-
-// TBD
-
-* Replica placement plugins
-
-* Rate limiting and task management
-
-* Certificate Auth Plugin
-
-* SQL Query interface in UI
-
-== Configuration and Default Parameter Changes
-
-// TODO: Move into sub headings
-
-=== Schema Changes in 9.0
-
-=== Indexing Changes in 9.0
-
-=== Query Changes in 9.0
-
-=== Authentication & Security Changes in 9.0
-
-=== UI Changes in 9.0
-
-=== Dependency Updates in 9.0
-
-// RAW notes below
-
-* SOLR-7530: TermsComponent's JSON response format was changed so that "terms"
property carries per field arrays by default regardless of distrib, terms.list,
terms.ttf parameters.
-This affects JSON based response format but not others
-
-* SOLR-14036: Implicit /terms handler now returns terms across all shards in
SolrCloud instead of only the local core.
-Users/apps may be assuming the old behavior.
-A request can be modified via the standard distrib=false param to only use the
local core receiving the request.
-
-* SOLR-13783: In situations where a NamedList must be output as plain text,
commas between key-value pairs will now be followed by a space (e.g.,
{shape=square, color=yellow} rather than {shape=square,color=yellow}) for
consistency with other `java.util.Map` implementations based on `AbstractMap`.
-
-* SOLR-11725: JSON aggregations uses corrected sample formula to compute
standard deviation and variance.
-The computation of stdDev and variance in JSON aggregation is same as
StatsComponent.
-
-* SOLR-14012: unique and hll aggregations always returns long value
irrespective of standalone or solcloud
-
-* SOLR-11775: Return long value for facet count in Json Facet module
irrespective of number of shards
-
-* SOLR-15276: V2 API call to look up async request status restful style of
"/cluster/command-status/1000" instead of
"/cluster/command-status?requestid=1000".
-
-* SOLR-14972: The default port of prometheus exporter has changed from 9983 to
8989, so you may need to adjust your configuration after upgrade.
-
-* SOLR-15471: The language identification "whitelist" configuration is now an
"allowlist" to better convey the meaning of the property
-
+In Solr 8, it was possible to add docValues to a schema without re-indexing
via `UninvertDocValuesMergePolicy`, an advanced/expert utility.
+Due to changes in Lucene 9, that isn't possible any more.
+
+== Solr 9.0
+=== Querying and Indexing
+* Dense Vector "Neural" Search through DenseVectorField fieldType and
K-Nearest-Neighbor (KNN) Query Parser.
+* Admin UI support for SQL Querying.
+* New snowball stemmers: Hindi, Indonesian, Nepali, Serbian, Tamil, and
Yiddish.
+* New NorwegianNormalizationFilter
+* Implicit `/terms` handler now returns terms across all shards in SolrCloud
instead of only the local core.
+Users/apps may be assuming the old behavior. A request can be modified via the
standard `distrib=false` param to only use the local core receiving the request.
+* SQL support has been moved to the sql module. Existing Solr configurations
do not need any SQL related changes, however the module needs to be installed -
see the section xref:query-guide:sql-query.adoc[].
+* SOLR-11725: JSON aggregations uses corrected sample formula to compute
standard deviation and variance. The computation of stdDev and variance in JSON
aggregation is same as StatsComponent.
+* SOLR-11775: Facet count in Json Facet module always returns a long value for
irrespective of number of shards.
* SOLR-12891: MacroExpander will no longer will expand URL parameters inside
of the 'expr' parameter (used by streaming expressions).
-Additionally, users are advised to use the 'InjectionDefense' class when
constructing streaming expressions that include user supplied data to avoid
risks similar to SQL injection.
-The legacy behavior of expanding the 'expr' parameter can be reinstated with
-DStreamingExpressionMacros=true passed to the JVM at startup
-
-* SOLR-13324: URLClassifyProcessor#getCanonicalUrl now throws
MalformedURLException rather than hiding it.
-Although the present code is unlikely to produce such an exception it may be
possible in future changes or in subclasses.
-Currently this change should only effect compatibility of custom code
overriding this method.
-
-* SOLR-14510: The `writeStartDocumentList` in `TextResponseWriter` now
receives an extra boolean parameter representing the "exactness" of the
`numFound` value (exact vs approximation).
-Any custom response writer extending `TextResponseWriter` will need to
implement this abstract method now (instead previous with the same name but
without the new boolean parameter).
-
-* SOLR-15259: hl.fragAlignRatio now defaults to 0.33 to be faster and maybe
looks nicer.
-
+Additionally, users are advised to use the 'InjectionDefense' class when
constructing streaming expressions that include user supplied data to avoid
risks similar to SQL injection. The legacy behavior of expanding the 'expr'
parameter can be reinstated with -DStreamingExpressionMacros=true passed to the
JVM at startup
* SOLR-9376: The response format for field values serialized as raw XML (via
the `[xml]` raw value DocTransformer
and `wt=xml`) has changed. Previously, values were dropped in directly as
top-level child elements of each `<doc>`,
obscuring associated field names and yielding inconsistent `<doc>` structure.
As of version 9.0, raw values are
wrapped in a `<raw name="field_name">[...]</raw>` element at the top level of
each `<doc>` (or within an enclosing
`<arr name="field_name"><raw>[...]</raw></arr>` element for multi-valued
fields). Existing clients that parse field
values serialized in this way will need to be updated accordingly.
-
-* SOLR-9575: Solr no longer requires a `solr.xml` in `$SOLR_HOME`. If one is
not found, Solr will instead use the default one from
`$SOLR_TIP/server/solr/solr.xml`. You can revert to the pre-9.0 behaviour by
setting environment variable `SOLR_SOLRXML_REQUIRED=true` or system property
`-Dsolr.solrxml.required=true`. Solr also does not require a `zoo.cfg` in
`$SOLR_HOME` if started with embedded zookeeper.
-
* SOLR-12901: Highlighting: hl.method=unified is the new default. Use
hl.method=original
- to switch back if needed.
-
-* SOLR-12055 introduces async logging by default. There's a small window where
log messages may be lost in the event of some hard crash.
-Switch back to synchronous logging if this is unacceptable, see comments in
the log4j2 configuration files (log4j2.xml by default).
-
-=== solr.xml maxBooleanClauses now enforced recursively
-
-Lucene 9.0 has additional safety checks over previous versions that impact how
the `solr.xml` global
xref:configuration-guide:configuring-solr-xml#global-maxbooleanclauses[`maxBooleanClauses`]
option is enforced.
-
-In previous versions of Solr, this option was a hard limit on the number of
clauses in any `BooleanQuery` object - but it was only enforced for the
_direct_ clauses.
-Starting with Solr 9, this global limit is now also enforced against the total
number of clauses in a _nested_ query structure.
-
-Users who upgrade from prior versions of Solr may find that some requests
involving complex internal query structures (Example: long query strings using
`edismax` with many `qf` and `pf` fields that include query time synonym
expansion) which worked in the past now hit this limit and fail.
-
-User's in this situation are advised to consider the complexity f their
queries/configuration, and increase the value of
xref:configuration-guide:configuring-solr-xml#global-maxbooleanclauses[`maxBooleanClauses`]
if warranted.
-
-=== Log4J configuration & Solr MDC values
-
-link:http://www.slf4j.org/apidocs/org/slf4j/MDC.html[MDC] values that Solr
sets for use by Logging calls (such as the collection name, shard name, replica
name, etc...) have been modified to now be "bare" values, with out the special
single character prefixes that were included in past version.
-For example: In 8.x Log messages for a collection named "gettingstarted" would
have an MDC value with a key `collection` mapped to a value of
`c:gettingstarted`, in 9.x the value will simply be `gettingstarted`.
-
-Solr's default `log4j2.xml` configuration file has been modified to prepend
these same prefixes to MDC values when included in Log messages as part of the
`<PatternLayout/>`.
-Users who have custom logging configurations that wish to ensure Solr 9.x logs
are consistently formatted after upgrading will need to make similar changes to
their logging configuration files. See
link:https://issues.apache.org/jira/browse/SOLR-15630[SOLR-15630] for more
details.
-
-
-=== base_url removed from stored state
-
-If you're able to upgrade SolrJ to 8.8.x for all of your client applications,
then you can set `-Dsolr.storeBaseUrl=false` (introduced in Solr 8.8.1) to
better align the stored state in Zookeeper with future versions of Solr; as of
Solr 9.x, the `base_url` will no longer be persisted in stored state.
-However, if you are not able to upgrade SolrJ to 8.8.x for all client
applications, then you should set `-Dsolr.storeBaseUrl=true` so that Solr will
continue to store the `base_url` in Zookeeper.
-For background, see: SOLR-12182 and SOLR-15145.
-
-Support for the `solr.storeBaseUrl` system property will be removed in Solr
10.x and `base_url` will no longer be stored.
-
-* Solr's distributed tracing no longer incorporates a special
`samplePercentage` SolrCloud cluster property.
-Instead, consult the documentation for the tracing system you use on how to
sample the traces.
-Consequently, if you use a Tracer at all, you will always have traces and thus
trace IDs in logs.
-What percentage of them get reported to a tracing server is up to you.
-
-* JaegerTracerConfigurator no longer recognizes any configuration in solr.xml.
- It is now completely configured via System properties and/or Environment
variables as documented by Jaeger.
-
-=== Schema Changes
-
-* `LegacyBM25SimilarityFactory` has been removed.
-
-* SOLR-13593 SOLR-13690 SOLR-13691: Allow to look up analyzer components by
their SPI names in field type configuration.
-
-=== Authentication & Security Changes
-
-* The property `blockUnknown` in the BasicAuthPlugin and the JWTAuthPlugin now
defaults to `true`.
-This change is backward incompatible.
-If you need the pre-9.0 default behavior, you need to explicitly set
`blockUnknown:false` in `security.json`.
+to switch back if needed.
+* solr.xml `maxBooleanClauses` is now enforced recursively. Users who upgrade
from prior versions of Solr may find that some requests involving complex
internal query structures (Example: long query strings using `edismax` with
many `qf` and `pf` fields that include query time synonym expansion) which
worked in the past now hit this limit and fail. Users in this situation are
advised to consider the complexity of their queries/configuration, and increase
the value of
xref:configuration-guide:configuring-solr-xml#global-maxbooleanclauses[`maxBooleanClauses`]
if warranted.
+* Atomic/partial updates to nested documents now _require_ the `\_root_` field
to clearly show the document isn't a root document. Solr 8 would fallback on
the `\_route_` param but no longer.
+=== Security
+* Certificate Authentication Plugin, enabling end-to-end use of x509 client
certificates for Authentication and Authorization.
+* Improved security when using PKI Authentication plugin.
+* Upgrade to Zookeeper 3.7, allowing for TLS protected ZK communication.
+* All request handlers support security permissions for access.
+* Ability to disable admin UI through a system property.
+* The property blockUnknown in the BasicAuthPlugin and the JWTAuthPlugin now
defaults to true instead of false. This change is backward incompatible. If you
need the pre-9.0 default behavior, you need to explicitly set
`blockUnknown:false` in `security.json`.
+* Solr now runs with the Java security manager enabled by default. Hadoop
users may need to disable this.
+* Solr now binds to localhost network interface by default for better out of
the box security.
+Administrators that need Solr exposed more broadly can change the
`SOLR_JETTY_HOST` property in their Solr include (solr.in.sh/solr.in.cmd) file.
+* Solr embedded zookeeper only binds to localhost by default. This embedded
zookeeper should not be used in production.
+If you rely upon the previous behavior, then you can change the
`clientPortAddress` in `solr/server/solr/zoo.cfg`
+* Jetty low level request-logging in NCSA format is now enabled by default,
with a retention of 3 days worth of logs.
+This may require some more disk space for logs than was the case in 8.x. See
Reference Guide chapter "Configuring Logging" for how to change this.
+* SOLR-13989: Hadoop authentication support has been moved to the hadoop-auth
module. Existing Solr configurations do not need any Hadoop authentication
related changes, however the module needs to be installed - see the section
xref:deployment-guide:hadoop-authentication-plugin.adoc[].
Review Comment:
```suggestion
* Hadoop authentication support has been moved to the new `hadoop-auth` module. Existing Solr configurations do not need any Hadoop authentication related changes, however the module needs to be installed - see the section xref:deployment-guide:hadoop-authentication-plugin.adoc[].
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]