[ https://issues.apache.org/jira/browse/SOLR-10351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15952245#comment-15952245 ]
David Smiley commented on SOLR-10351:
-------------------------------------

Wouldn't the NLP processing advertised in the title of this issue most likely put its output into analysis _attributes_? This stream evaluator only emits the character data attribute.

BTW, please use try-finally (or try-with-resources) to close token streams wherever possible. An Analyzer's internal parts are shared in thread-locals, and the ramifications can be nasty for the entire Solr node if at any time one filter has a bug or fails on a particular value. Your Solr node then becomes poisoned, in a sense, and only a restart will fix the ailment.

> Add analyze Stream Evaluator to support streaming NLP
> -----------------------------------------------------
>
>                 Key: SOLR-10351
>                 URL: https://issues.apache.org/jira/browse/SOLR-10351
>             Project: Solr
>          Issue Type: New Feature
>      Security Level: Public(Default Security Level. Issues are Public)
>            Reporter: Joel Bernstein
>            Assignee: Joel Bernstein
>              Labels: NLP, Streaming
>             Fix For: 6.6
>
>         Attachments: SOLR-10351.patch, SOLR-10351.patch, SOLR-10351.patch,
> SOLR-10351.patch
>
>
> The *analyze* Stream Evaluator uses a Solr analyzer to return a collection of
> tokens from a *text field*. The collection of tokens can then be streamed out
> by the *cartesianProduct* Streaming Expression or attached to documents as
> multi-valued fields by the *select* Streaming Expression.
> This allows Streaming Expressions to leverage all the existing tokenizers and
> filters and provides a place for future NLP analyzers to be added to
> Streaming Expressions.
> Sample syntax:
> {code}
> cartesianProduct(expr, analyze(analyzerField, textField) as outfield )
> {code}
> {code}
> select(expr, analyze(analyzerField, textField) as outfield )
> {code}
> Combined with Solr's batch text processing capabilities this provides an
> entire parallel NLP framework.
> Solr's batch processing capabilities are described here:
> *Batch jobs, Parallel ETL and Streaming Text Transformation*
> http://joelsolr.blogspot.com/2016/10/solr-63-batch-jobs-parallel-etl-and.html
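
The advice in the comment about closing token streams can be illustrated with a small sketch. It is not taken from the SOLR-10351 patch. To stay runnable without Lucene on the classpath it uses a hypothetical stand-in, `StubTokenStream`, that only mimics the reset / incrementToken / end / close sequence of a token stream; the `failAt` parameter, the `term()` helper and the `collectTerms` method are all assumptions made for illustration. The point it tries to show is that try-with-resources closes the stream even when a filter throws partway through.

```java
import java.util.ArrayList;
import java.util.List;

public class TokenStreamCloseSketch {

    /** Hypothetical stand-in for a token stream; NOT Lucene's TokenStream class. */
    static final class StubTokenStream implements AutoCloseable {
        private final String[] tokens;
        private final int failAt;   // index of the token that triggers a simulated filter bug; -1 = never
        private int pos = -1;
        boolean closed = false;

        StubTokenStream(String text, int failAt) {
            this.tokens = text.split(" ");
            this.failAt = failAt;
        }

        void reset() { pos = -1; }

        /** Advances to the next token; throws when the simulated buggy filter is hit. */
        boolean incrementToken() {
            if (pos + 1 >= tokens.length) {
                return false;
            }
            pos++;
            if (pos == failAt) {
                throw new IllegalStateException("simulated filter bug on token: " + tokens[pos]);
            }
            return true;
        }

        String term() { return tokens[pos]; }

        void end() { }

        @Override
        public void close() { closed = true; }
    }

    /** Consumes the stream; try-with-resources guarantees close() on success and on failure. */
    static List<String> collectTerms(StubTokenStream source) {
        List<String> terms = new ArrayList<>();
        try (StubTokenStream ts = source) {
            ts.reset();
            while (ts.incrementToken()) {
                terms.add(ts.term());
            }
            ts.end();
        }
        return terms;
    }

    public static void main(String[] args) {
        StubTokenStream healthy = new StubTokenStream("streaming nlp with solr", -1);
        System.out.println(collectTerms(healthy) + " closed=" + healthy.closed);

        StubTokenStream buggy = new StubTokenStream("streaming nlp with solr", 2);
        try {
            collectTerms(buggy);
        } catch (IllegalStateException e) {
            // The exception still propagates, but the stream was closed first.
            System.out.println("filter failed, closed=" + buggy.closed);
        }
    }
}
```

In real Lucene code the resource in the try header would be the stream returned by `Analyzer.tokenStream(fieldName, text)`, consumed with the same reset, incrementToken, end sequence. Without the close, a failure mid-stream can leave the thread's shared analyzer components in the state the comment warns about.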