This is an automated email from the ASF dual-hosted git repository.
nightowl888 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/lucenenet.git
from 31ac9f6b7 Sonar changes required for #671 (#730)
new e8c34943e PERFORMANCE:
Lucene.Net.Document.CompressionTools::CompressString(): Eliminated unnecessary
ToCharArray() allocation
new eea43d74b PERFORMANCE:
Lucene.Net.Codecs.SimpleText.SimpleTextUtil::Write(): Removed unnecessary
ToCharArray() allocation
new e8d9d7b94 PERFORMANCE:
Lucene.Net.Analysis.CharFilters.HTMLStripCharFilter: Removed allocation during
parse of hexadecimal number by using J2N.Numerics.Int32 to specify index and
length. Also added a CharArrayFormatter struct to defer the allocation of
constructing a string until after an assertion failure.
new 98c52a864 PERFORMANCE: Lucene.Net.Analysis.Util.CharacterUtils: Use
spans and stackalloc to reduce heap allocations when lowercasing. Added system
property named "maxStackLimit" that defaults to 2048 bytes.
new 3abf2dbfe PERFORMANCE:
Lucene.Net.Analysis.Miscellaneous.StemmerOverrideFilter: Added overloads to Add
for ICharSequence and char[] to reduce allocations. Added guard clauses.
new 31bbe9f35 Lucene.Net.Util.TestUnicodeUtil::TestUTF8toUTF32(): Added
additional tests for ICharSequence and char[] overloads, changed the original
test to test string.
new e3606755c PERFORMANCE:
Lucene.Net.Analysis.Util.SegmentingTokenizerBase: Removed unnecessary string
allocations that were added during the port due to missing APIs.
new 3e63d1529 PERFORMANCE: Lucene.Net.Analysis.Ja.GraphvizFormatter:
Removed unnecessary surfaceForm string allocation.
new fca99681e PERFORMANCE: Lucene.Net.Analysis.In.IndicNormalizer:
Replaced static constructor with inline LoadScripts() method. Moved location of
scripts field to ensure decompositions is initialized first.
new d660b9d51 PERFORMANCE: Lucene.Net.Analysis.In.IndicNormalizer:
Refactored ScriptData to change Dictionary<Regex, ScriptData> to
List<ScriptData> and eliminated unnecessary hashtable lookup. Use static fields
for unknownScript and [ThreadStatic] previousScriptData to optimize character
script matching.
new e72315a75 PERFORMANCE: Lucene.Net.Analysis.Th.ThaiWordBreaker: Removed
unnecessary string allocations and concatenation. Use CharsRef to reuse the
same memory. Removed Regex and replaced with UnicodeSet to detect Thai code
points.
new 56c8e08e0 PERFORMANCE: Lucene.Net.Analysis.Ga.IrishLowerCaseFilter:
Use stack and spans to reduce allocations and improve throughput.
new 1d74f980a PERFORMANCE: Lucene.Net.Analysis.Util.OpenStringBuilder:
Added overloads of UnsafeWrite() for string and ICharSequence. Optimized
Append() methods to call UnsafeWrite with index and count to optimize the
operation depending on the type of object passed.
new 3fbee37ed PERFORMANCE: Lucene.Net.Analysis.CharFilters.HTMLStripCharFilter:
Refactored to remove YyText property (method) which allocates a string every
time it is called. Instead, we pass the underlying array to
J2N.Numerics.TryParse() and OpenStringBuilder.Append() with the calculated
startIndex and length to directly copy the characters without allocating
substrings.
The 14 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.
Summary of changes:
.build/dependencies.props | 1 +
.../Analysis/CharFilter/HTMLStripCharFilter.cs | 222 ++++++++++++---------
.../Analysis/Ga/IrishLowerCaseFilter.cs | 15 +-
.../Analysis/In/IndicNormalizer.cs | 121 +++++++----
.../Miscellaneous/StemmerOverrideFilter.cs | 116 ++++++++++-
.../Analysis/Nl/DutchAnalyzer.cs | 2 +-
.../Analysis/Th/ThaiTokenizer.cs | 24 ++-
.../Analysis/Th/ThaiWordFilter.cs | 4 +-
.../Analysis/Util/CharacterUtils.cs | 40 ++--
.../Analysis/Util/OpenStringBuilder.cs | 54 +++--
.../Analysis/Util/SegmentingTokenizerBase.cs | 6 +-
.../Lucene.Net.Analysis.Common.csproj | 8 +
.../GraphvizFormatter.cs | 7 +-
src/Lucene.Net.Codecs/SimpleText/SimpleTextUtil.cs | 2 +-
.../Analysis/In/TestIndicNormalizer.cs | 10 +-
.../Miscellaneous/TestStemmerOverrideFilter.cs | 27 +++
.../Configuration/TestConfigurationService.cs | 8 +
.../Startup.cs | 3 +-
src/Lucene.Net.Tests/Support/TestApiConsistency.cs | 2 +-
src/Lucene.Net.Tests/Util/TestUnicodeUtil.cs | 82 ++++++++
src/Lucene.Net/Document/CompressionTools.cs | 4 +-
src/Lucene.Net/Lucene.Net.csproj | 5 +-
.../Support/Text/CharArrayFormatter.cs} | 20 +-
src/Lucene.Net/Util/Constants.cs | 7 +-
src/Lucene.Net/Util/UnicodeUtil.cs | 149 +++++++++++++-
25 files changed, 721 insertions(+), 218 deletions(-)
copy src/{Lucene.Net.QueryParser/Flexible/Core/Nodes/MatchNoDocsQueryNode.cs
=> Lucene.Net/Support/Text/CharArrayFormatter.cs} (61%)