[ 
https://issues.apache.org/jira/browse/LUCENE-6973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15098798#comment-15098798
 ] 

Shai Erera commented on LUCENE-6973:
------------------------------------

I ran the tests and {{TestRandomChains}} fails with this:

{noformat}
   [junit4]   2> NOTE: Windows 7 6.1 amd64/Oracle Corporation 1.8.0_40 (64-bit)/cpus=8,threads=1,free=393306544,total=510656512
   [junit4]   2> NOTE: All tests run in this JVM: [TestRandomChains]
   [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestRandomChains -Dtests.seed=5FF882C20C905C54 -Dtests.slow=true -Dtests.locale=cs -Dtests.timezone=America/Buenos_Aires -Dtests.asserts=true -Dtests.file.encoding=ISO-8859-1
   [junit4] ERROR   0.00s | TestRandomChains (suite) <<<
   [junit4]    > Throwable #1: java.lang.AssertionError: public org.apache.lucene.analysis.miscellaneous.DateRecognizerFilter(org.apache.lucene.analysis.TokenStream,java.text.DateFormat) has unsupported parameter types
   [junit4]    >        at __randomizedtesting.SeedInfo.seed([5FF882C20C905C54]:0)
   [junit4]    >        at org.apache.lucene.analysis.core.TestRandomChains.beforeClass(TestRandomChains.java:233)
   [junit4]    >        at java.lang.Thread.run(Thread.java:745)
{noformat}

I tracked it down to {{argProducers}} having no producer defined for 
{{java.text.DateFormat}}. Is it OK to add one?
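
For illustration, here is a minimal sketch of what registering a {{DateFormat}} producer could look like. It is a self-contained stand-in that uses only the JDK, not the actual {{TestRandomChains}} code: the {{ArgProducerSketch}} class, the map's value type, the helper methods and the date patterns are all assumptions.

```java
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Random;
import java.util.function.Function;

public class ArgProducerSketch {
    // Hypothetical stand-in for the test's argProducers registry:
    // constructor parameter type -> factory that builds a random argument.
    static final Map<Class<?>, Function<Random, Object>> argProducers = new HashMap<>();

    static {
        // Assumed entry: choose among a few fixed patterns, so the result
        // depends only on the Random passed in (reproducible per seed).
        argProducers.put(DateFormat.class, random -> {
            String[] patterns = {"yyyy-MM-dd", "dd/MM/yyyy", "MM/dd/yyyy"};
            return new SimpleDateFormat(patterns[random.nextInt(patterns.length)], Locale.ROOT);
        });
    }

    // Loosely models the up-front check: a constructor is usable only if
    // every one of its parameter types has a registered producer.
    static boolean isSupported(Class<?>... parameterTypes) {
        for (Class<?> type : parameterTypes) {
            if (!argProducers.containsKey(type)) {
                return false;
            }
        }
        return true;
    }

    // Builds a random argument for the given parameter type.
    static Object newRandomArg(Random random, Class<?> type) {
        Function<Random, Object> producer = argProducers.get(type);
        if (producer == null) {
            throw new AssertionError(type + " has no producer");
        }
        return producer.apply(random);
    }
}
```

With an entry like this in place, a constructor taking a {{DateFormat}} would pass the parameter-type check in this sketch. Whether fixed patterns or fully random ones are preferable for the real test is a separate question.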

> Improve TeeSinkTokenFilter
> --------------------------
>
>                 Key: LUCENE-6973
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6973
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>            Priority: Minor
>             Fix For: 5.5, Trunk
>
>         Attachments: LUCENE-6973.patch, LUCENE-6973.patch, LUCENE-6973.patch, 
> LUCENE-6973.patch, LUCENE-6973.patch
>
>
> {{TeeSinkTokenFilter}} can be improved in several ways, as it's written today:
> The most significant one is removing {{SinkFilter}}, which just doesn't work and is 
> confusing. E.g., if you set a {{SinkFilter}} which filters tokens, the 
> attributes on the stream such as {{PositionIncrementAttribute}} are not 
> updated. Also, if you update any attribute on the stream, you affect other 
> {{SinkStreams}} ... It's best if we remove this confusing class, and let 
> consumers reuse existing {{TokenFilters}} by chaining them to the sink stream.
> After we do that, we can make all the cached states a single (immutable) 
> list, which is shared between all the sink streams, so we don't need to keep 
> many references around or deal with {{WeakReference}}.
> Besides that there are some other minor improvements to the code that will 
> come after we clean up this class.
> From a backwards-compatibility standpoint, I don't think that {{SinkFilter}} 
> is actually used anywhere (since it just ... confusing and doesn't work as 
> expected), and therefore I believe it won't affect anyone. If however someone 
> did implement a {{SinkFilter}}, it should be trivial to convert it to a 
> {{TokenFilter}} and chain it to the {{SinkStream}}.
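
The design in the description above can be sketched in plain Java. This is a toy model under stated assumptions, not the Lucene implementation: tokens are plain strings rather than attribute states, and {{TeeSketch}}, {{newSink}} and {{filter}} are hypothetical names. It shows only the two ideas from the description: every sink reads the same immutable list of cached states, and filtering is done by wrapping a sink rather than by a filter inside the tee.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

public class TeeSketch {
    private final Iterator<String> source;
    // One immutable list shared by all sinks; set once the source is exhausted.
    private List<String> cachedStates;

    public TeeSketch(Iterator<String> source) {
        this.source = source;
    }

    // Reads the source once and publishes the cached states as a read-only list.
    public void consumeAllTokens() {
        List<String> states = new ArrayList<>();
        while (source.hasNext()) {
            states.add(source.next());
        }
        cachedStates = Collections.unmodifiableList(states);
    }

    // Every sink iterates over the same shared list; the tee keeps no per-sink copies.
    public Iterable<String> newSink() {
        return () -> {
            if (cachedStates == null) {
                throw new IllegalStateException("source has not been consumed yet");
            }
            return cachedStates.iterator();
        };
    }

    // Stand-in for chaining a filter to a sink: the filter wraps the sink
    // and never touches the shared cache, so other sinks are unaffected.
    public static Iterable<String> filter(Iterable<String> in, Predicate<String> keep) {
        return () -> {
            List<String> out = new ArrayList<>();
            for (String token : in) {
                if (keep.test(token)) {
                    out.add(token);
                }
            }
            return out.iterator();
        };
    }
}
```

Because the filter sits downstream of the sink, one consumer dropping tokens cannot change what another consumer sees.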



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
