[ 
https://issues.apache.org/jira/browse/LUCENE-6973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15097291#comment-15097291
 ] 

Uwe Schindler commented on LUCENE-6973:
---------------------------------------

bq. Btw. in case there is preference for ArrayDeque over ArrayList, 
CachingTokenFilter could also be changed for that.

I don't think we need a Deque here or there. We only add elements at the end, 
so ArrayList is fine, also for the States here. ArrayDeque is only better as 
replacement for a LinkedList, where you want to remove or insert elements at 
start of list (where a LinkedList is fine, but ArrayList involves copying on 
each insert/delete). But LinkedList was also wrong for the use-case here. We 
just add elements at the end and it gets unmodifiable at end. The iterator is 
happy, too. ArrayList is also slightly faster because it does not need to check 
2 bounds.

Shai: The Javadocs of the SinkFilter replacements are partly misleading (they 
still talk about sinks, although it is no longer related to TeeSink).

I will further look into this patch tomorrow, this was just my first thoughts.

> Improve TeeSinkTokenFilter
> --------------------------
>
>                 Key: LUCENE-6973
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6973
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>            Priority: Minor
>             Fix For: 5.5, Trunk
>
>         Attachments: LUCENE-6973.patch, LUCENE-6973.patch
>
>
> {{TeeSinkTokenFilter}} can be improved in several ways, as it's written today:
> The most major one is removing {{SinkFilter}} which just doesn't work and is 
> confusing. E.g., if you set a {{SinkFilter}} which filters tokens, the 
> attributes on the stream such as {{PositionIncrementAttribute}} are not 
> updated. Also, if you update any attribute on the stream, you affect other 
> {{SinkStreams}} ... It's best if we remove this confusing class, and let 
> consumers reuse existing {{TokenFilters}} by chaining them to the sink stream.
> After we do that, we can make all the cached states a single (immutable) 
> list, which is shared between all the sink streams, so we don't need to keep 
> many references around, and also deal with {{WeakReference}}.
> Besides that there are some other minor improvements to the code that will 
> come after we clean up this class.
> From a backwards-compatibility standpoint, I don't think that {{SinkFilter}} 
> is actually used anywhere (since it just ... confusing and doesn't work as 
> expected), and therefore I believe it won't affect anyone. If however someone 
> did implement a {{SinkFilter}}, it should be trivial to convert it to a 
> {{TokenFilter}} and chain it to the {{SinkStream}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to