[
https://issues.apache.org/jira/browse/LUCENE-6973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15098798#comment-15098798
]
Shai Erera commented on LUCENE-6973:
------------------------------------
I ran the tests and {{TestRandomChains}} fails with this:
{noformat}
[junit4] 2> NOTE: Windows 7 6.1 amd64/Oracle Corporation 1.8.0_40
(64-bit)/cpus=8,threads=1,free=393306544,total=510656512
[junit4] 2> NOTE: All tests run in this JVM: [TestRandomChains]
[junit4] 2> NOTE: reproduce with: ant test -Dtestcase=TestRandomChains
-Dtests.seed=5FF882C20C905C54 -Dtests.slow=true -Dtests.locale=cs
-Dtests.timezone=America/Buenos_Aires -Dtests.asserts=true
-Dtests.file.encoding=ISO-8859-1
[junit4] ERROR 0.00s | TestRandomChains (suite) <<<
[junit4] > Throwable #1: java.lang.AssertionError: public
org.apache.lucene.analysis.miscellaneous.DateRecognizerFilter(org.apache.lucene.analysis.TokenStream,java.text.DateFormat)
has unsupported parameter types
[junit4] > at
__randomizedtesting.SeedInfo.seed([5FF882C20C905C54]:0)
[junit4] > at
org.apache.lucene.analysis.core.TestRandomChains.beforeClass(TestRandomChains.java:233)
[junit4] > at java.lang.Thread.run(Thread.java:745)
{noformat}
I tracked it to {{argProducers}} not having a producer defined for
{{java.text.DateFormat}}. Is it OK to add one?
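For illustration only, here is a minimal standalone sketch of what such an entry might look like. It assumes the test keeps a table that maps a constructor parameter type to a factory for random values of that type. The class name {{ArgProducersSketch}}, the {{ARG_PRODUCERS}} map and the {{newRandomArg}} helper are hypothetical stand-ins, not the actual {{TestRandomChains}} code.

```java
import java.text.DateFormat;
import java.util.IdentityHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Random;
import java.util.function.Function;

public class ArgProducersSketch {
    // Hypothetical stand-in for the test's argProducers table:
    // parameter type -> factory that builds a random instance of that type.
    static final Map<Class<?>, Function<Random, Object>> ARG_PRODUCERS =
            new IdentityHashMap<>();

    static {
        // The entry in question: produce a DateFormat with a randomly chosen style.
        ARG_PRODUCERS.put(DateFormat.class, random -> {
            int[] styles = {
                DateFormat.SHORT, DateFormat.MEDIUM, DateFormat.LONG, DateFormat.FULL
            };
            int style = styles[random.nextInt(styles.length)];
            return DateFormat.getDateInstance(style, Locale.ROOT);
        });
    }

    // Looks up a producer for the given parameter type and fails in the same
    // spirit as the reported error when none is registered.
    static Object newRandomArg(Random random, Class<?> paramType) {
        Function<Random, Object> producer = ARG_PRODUCERS.get(paramType);
        if (producer == null) {
            throw new AssertionError(paramType.getName() + " has unsupported parameter type");
        }
        return producer.apply(random);
    }

    public static void main(String[] args) {
        Object arg = newRandomArg(new Random(42L), DateFormat.class);
        System.out.println(arg instanceof DateFormat); // prints "true"
    }
}
```

With an entry like this registered, a lookup for {{DateFormat}} returns a usable instance instead of tripping the "unsupported parameter types" assertion. The real test may need a different producer shape.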
> Improve TeeSinkTokenFilter
> --------------------------
>
> Key: LUCENE-6973
> URL: https://issues.apache.org/jira/browse/LUCENE-6973
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Shai Erera
> Assignee: Shai Erera
> Priority: Minor
> Fix For: 5.5, Trunk
>
> Attachments: LUCENE-6973.patch, LUCENE-6973.patch, LUCENE-6973.patch,
> LUCENE-6973.patch, LUCENE-6973.patch
>
>
> {{TeeSinkTokenFilter}} can be improved in several ways, as it's written today:
> The most significant one is removing {{SinkFilter}}, which just doesn't work and is
> confusing. E.g., if you set a {{SinkFilter}} which filters tokens, the
> attributes on the stream such as {{PositionIncrementAttribute}} are not
> updated. Also, if you update any attribute on the stream, you affect other
> {{SinkStreams}} ... It's best if we remove this confusing class, and let
> consumers reuse existing {{TokenFilters}} by chaining them to the sink stream.
> After we do that, we can make all the cached states a single (immutable)
> list, which is shared between all the sink streams, so we don't need to keep
> many references around or deal with {{WeakReference}}.
> Besides that there are some other minor improvements to the code that will
> come after we clean up this class.
> From a backwards-compatibility standpoint, I don't think that {{SinkFilter}}
> is actually used anywhere (since it's just ... confusing and doesn't work as
> expected), and therefore I believe it won't affect anyone. If however someone
> did implement a {{SinkFilter}}, it should be trivial to convert it to a
> {{TokenFilter}} and chain it to the {{SinkStream}}.