This is an automated email from the ASF dual-hosted git repository.
nightowl888 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/lucenenet.git.
from 5c43454 Ported Lucene.Net.Analysis.OpenNLP + tests
new ef98efb Lucene.Net.Support.TreeDictionary: Normalized whitespace
new 8286214 Lucene.Net.Support.TreeSet: Finished implementation of ISet<T> and added tests (LUCENENET-616)
new 88e7e4e Lucene.Net.Support.TreeSet: Fixed potential thread safety issue with IntersectWith() by utilizing existing RetainAll() method.
new 7635aa7 Lucene.Net.Support.TreeDictionary: Created facade TreeDictionary around C5's TreeDictionary with the same API as System.Collections.Generic.SortedDictionary including input/output types and exceptions. (LUCENENET-612, LUCENENET-616)
new aa630a0 Lucene.Net.Support.HashMap, Lucene.Net.Support.LinkedHashMap: Corrected documentation - TreeDictionary doesn't allow duplicate keys.
new 1e1931c Lucene.Net.Support.TreeSet: Normalized whitespace
new 70b00ee Lucene.Net.Tests.Grouping.GroupFacetCollectorTest: Fixed logging print issue
new 4c0084c Lucene.Net.Support.EnumerableExtensions: Added TakeAllButLast() extension methods to IEnumerable<T>
new 0ab4eb7 BUG: Lucene.Net.Highlighter.PostingsHighlight.PostingsHighlighter: SortedSet<T> has the wrong behavior for getting a range of values (second argument is supposed to be exclusive), so swapped in TreeSet<T> (LUCENENET-619)
new 7aae74a Lucene.Net.Tests.Highlighter.VectorHighlight.SimpleFragmentsBuilderTest::TestRandomDiscreteMultiValueHighlighting(): Utilize BCL .Append(), ToString(), Count to reduce chances of issues caused by bugs in extension methods
new b56aa78 Lucene.Net.Support.IdentityComparer: Added singleton pattern to prevent creating unnecessary instances of IdentityComparer at runtime.
new bce6015 Lucene.Net.Support.Collections: Factored out AddAll(ISet<T>, IEnumerable<T>) in favor of set.UnionWith()
new d33defb Corrected example in lucene-cli docs for analysis-kuromoji-build-dictionary
new 1cb33f6 Lucene.Net.Analysis.Kuromoji: Changed GetInstance() > Instance property for ConnectionCosts, TokenInfoDictionary, and UnknownDictionary
new a5b7496 Lucene.Net.Analysis.SmartCn, Lucene.Net.Benchmark: Removed out of date package release notes that only applied to icu-dotnet (we are now using ICU4N instead).
new 77d57ac Lucene.Net.Util.Int32sRef: Added TODO about making the main field into an indexer.
new 85b38e6 BUG: Updated platform detection on .NET Framework
new 71fcc19 Ported Lucene.Net.Analysis.Morfologik + tests
new 40dad16 Lucene.Net.Analysis.OpenNLP: Added Lucene compatibility version, made some minor code improvements
new aa6b856 Lucene.Net.Analysis.Common.Analysis.Miscellaneous.PerFieldAnalyzerWrapper: Added documentation to clarify that the IDictionary<TKey, TValue> type that is provided will determine how the component behaves. (LUCENENET-612, LUCENENET-616)
The 20 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.
Summary of changes:
Lucene.Net.sln | 16 +-
build/Dependencies.props | 3 +
.../publish-test-results-for-test-projects.yml | 20 +
build/build.ps1 | 12 +
global.json | 9 +-
.../Miscellaneous/PerFieldAnalyzerWrapper.cs | 32 +-
.../Dict/CharacterDefinition.cs | 5 +-
.../Dict/ConnectionCosts.cs | 5 +-
.../Dict/TokenInfoDictionary.cs | 10 +-
.../Dict/UnknownDictionary.cs | 7 +-
.../JapaneseTokenizer.cs | 6 +-
.../Lucene.Net.Analysis.Morfologik.csproj} | 43 +-
.../Morfologik/MorfologikAnalyzer.cs | 79 ++
.../Morfologik/MorfologikFilter.cs | 183 +++
.../Morfologik/MorfologikFilterFactory.cs | 105 ++
.../IMorphosyntacticTagsAttribute.cs} | 36 +-
.../MorphosyntacticTagsAttribute.cs | 105 ++
src/Lucene.Net.Analysis.Morfologik/Uk/README | 11 +
.../Uk/UkrainianMorfologikAnalyzer.cs | 177 +++
.../Uk/mapping_uk.txt} | 12 +-
.../Uk/stopwords.txt | 1269 ++++++++++++++++++++
src/Lucene.Net.Analysis.Morfologik/Uk/tagset.txt | 170 +++
.../Uk/ukrainian.dict | Bin 0 -> 6929502 bytes
.../Uk/ukrainian.info | 10 +
.../OpenNLPChunkerFilter.cs | 7 +-
.../OpenNLPChunkerFilterFactory.cs | 5 +-
.../OpenNLPLemmatizerFilter.cs | 4 +-
.../OpenNLPLemmatizerFilterFactory.cs | 7 +-
.../OpenNLPPOSFilter.cs | 7 +-
.../OpenNLPPOSFilterFactory.cs | 5 +-
.../OpenNLPSentenceBreakIterator.cs | 3 +-
.../OpenNLPTokenizer.cs | 7 +-
.../OpenNLPTokenizerFactory.cs | 3 +-
.../Tools/NLPChunkerOp.cs | 5 +-
.../Tools/NLPLemmatizerOp.cs | 3 +-
.../Tools/NLPNERTaggerOp.cs | 3 +-
.../Tools/NLPPOSTaggerOp.cs | 5 +-
.../Tools/NLPSentenceDetectorOp.cs | 3 +-
.../Tools/NLPTokenizerOp.cs | 3 +-
.../Tools/OpenNLPOpsFactory.cs | 25 +-
.../Lucene.Net.Analysis.SmartCn.csproj | 1 -
.../Lucene.Net.Benchmark.csproj | 1 -
.../PostingsHighlight/PostingsHighlighter.cs | 7 +-
src/Lucene.Net.Queries/Function/ValueSource.cs | 2 +-
.../Properties/AssemblyInfo.cs | 2 +
.../Analysis/MockCharFilter.cs | 5 +-
.../Store/MockDirectoryWrapper.cs | 2 +-
.../Dict/TestTokenInfoDictionary.cs | 4 +-
.../TestJapaneseTokenizer.cs | 2 +-
.../Lucene.Net.Tests.Analysis.Morfologik.csproj | 47 +
.../Morfologik/TestMorfologikAnalyzer.cs | 235 ++++
.../Morfologik/TestMorfologikFilterFactory.cs | 107 ++
.../Morfologik/custom-dictionary.dict | Bin 0 -> 90 bytes
.../Morfologik/custom-dictionary.info | 24 +
.../Morfologik/custom-dictionary.input | 2 +
.../Uk/TestUkrainianAnalyzer.cs | 92 ++
.../TestOpenNLPChunkerFilterFactory.cs | 2 +-
.../TestOpenNLPLemmatizerFilterFactory.cs | 6 +-
.../TestOpenNLPPOSFilterFactory.cs | 4 +-
.../TestOpenNLPSentenceBreakIterator.cs | 3 +-
.../TestOpenNLPTokenizerFactory.cs | 3 +-
.../GroupFacetCollectorTest.cs | 2 +-
.../VectorHighlight/SimpleFragmentsBuilderTest.cs | 24 +-
.../Support/{ => C5}/TestTreeDictionary.cs | 2 +-
.../Support/TestEnumerableExtensions.cs} | 36 +-
src/Lucene.Net.Tests/Support/TestTreeDictionary.cs | 699 +++++++----
src/Lucene.Net.Tests/Support/TestTreeSet.cs | 202 ++++
src/Lucene.Net/Index/DocumentsWriterPerThread.cs | 4 +-
src/Lucene.Net/Index/ParallelAtomicReader.cs | 4 +-
src/Lucene.Net/Index/ParallelCompositeReader.cs | 6 +-
src/Lucene.Net/Index/SegmentCommitInfo.cs | 2 +-
src/Lucene.Net/Index/SegmentCoreReaders.cs | 2 +-
src/Lucene.Net/Support/Collections.cs | 10 -
src/Lucene.Net/Support/EnumerableExtensions.cs | 67 +-
src/Lucene.Net/Support/HashMap.cs | 2 +-
src/Lucene.Net/Support/IdentityComparer.cs | 40 +-
src/Lucene.Net/Support/IdentityHashMap.cs | 8 +-
src/Lucene.Net/Support/IdentityHashSet.cs | 6 +-
src/Lucene.Net/Support/LinkedHashMap.cs | 2 +-
src/Lucene.Net/Support/TreeDictionary.cs | 1229 ++++++++++++++++++-
src/Lucene.Net/Support/TreeSet.cs | 385 ++++--
src/Lucene.Net/Util/Constants.cs | 8 +-
src/Lucene.Net/Util/IntsRef.cs | 2 +-
.../docs/analysis/kuromoji-build-dictionary.md | 2 +-
84 files changed, 5241 insertions(+), 474 deletions(-)
copy src/{Lucene.Net.Analysis.OpenNLP/Lucene.Net.Analysis.OpenNLP.csproj => Lucene.Net.Analysis.Morfologik/Lucene.Net.Analysis.Morfologik.csproj} (50%)
create mode 100644 src/Lucene.Net.Analysis.Morfologik/Morfologik/MorfologikAnalyzer.cs
create mode 100644 src/Lucene.Net.Analysis.Morfologik/Morfologik/MorfologikFilter.cs
create mode 100644 src/Lucene.Net.Analysis.Morfologik/Morfologik/MorfologikFilterFactory.cs
copy src/{Lucene.Net.Analysis.ICU/Analysis/Icu/TokenAttributes/ScriptAttribute.cs => Lucene.Net.Analysis.Morfologik/Morfologik/TokenAttributes/IMorphosyntacticTagsAttribute.cs} (53%)
create mode 100644 src/Lucene.Net.Analysis.Morfologik/Morfologik/TokenAttributes/MorphosyntacticTagsAttribute.cs
create mode 100644 src/Lucene.Net.Analysis.Morfologik/Uk/README
create mode 100644 src/Lucene.Net.Analysis.Morfologik/Uk/UkrainianMorfologikAnalyzer.cs
copy src/{Lucene.Net.Tests.Analysis.Common/Analysis/Miscellaneous/keep-1.txt => Lucene.Net.Analysis.Morfologik/Uk/mapping_uk.txt} (72%)
create mode 100644 src/Lucene.Net.Analysis.Morfologik/Uk/stopwords.txt
create mode 100644 src/Lucene.Net.Analysis.Morfologik/Uk/tagset.txt
create mode 100644 src/Lucene.Net.Analysis.Morfologik/Uk/ukrainian.dict
create mode 100644 src/Lucene.Net.Analysis.Morfologik/Uk/ukrainian.info
create mode 100644 src/Lucene.Net.Tests.Analysis.Morfologik/Lucene.Net.Tests.Analysis.Morfologik.csproj
create mode 100644 src/Lucene.Net.Tests.Analysis.Morfologik/Morfologik/TestMorfologikAnalyzer.cs
create mode 100644 src/Lucene.Net.Tests.Analysis.Morfologik/Morfologik/TestMorfologikFilterFactory.cs
create mode 100644 src/Lucene.Net.Tests.Analysis.Morfologik/Morfologik/custom-dictionary.dict
create mode 100644 src/Lucene.Net.Tests.Analysis.Morfologik/Morfologik/custom-dictionary.info
create mode 100644 src/Lucene.Net.Tests.Analysis.Morfologik/Morfologik/custom-dictionary.input
create mode 100644 src/Lucene.Net.Tests.Analysis.Morfologik/Uk/TestUkrainianAnalyzer.cs
copy src/Lucene.Net.Tests/Support/{ => C5}/TestTreeDictionary.cs (99%)
copy src/{Lucene.Net.Tests.Benchmark/Support/TestEnglishNumberFormatExtensions.cs => Lucene.Net.Tests/Support/TestEnumerableExtensions.cs} (52%)