David, I know I’d love to see the code! I’ve been working on a streaming expression called “bump” that triggers an atomic update reindex process on a document. I’m using it as part of a relevancy experimentation workflow where I add a new copyField or change my schema analyzers, and then I “bump” each document to cause it to reindex. This way I don’t need to reindex from source.
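For readers unfamiliar with the mechanism: a Solr atomic update sends only the document id plus a modifier such as "set", and Solr rebuilds the rest of the document from its stored/docValues fields and reindexes it, so new copyFields and analyzers get applied. The sketch below is not the "bump" expression itself (that code is not shown in this thread); it only builds the kind of JSON body one could POST to a collection's /update handler. The marker field name bump_s and the class name are assumptions made up for illustration, and fields that are neither stored nor docValues would not survive such an update.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Builds a Solr atomic-update JSON body that sets a marker field on each document. */
final class BumpPayload {

    // Hypothetical marker field; it must exist in (or match a dynamic field of) the schema.
    static final String MARKER_FIELD = "bump_s";

    /** Minimal JSON string quoting: escapes backslashes and double quotes only. */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** One atomic-update entry of the form {"id":"...","bump_s":{"set":"..."}}. */
    static String entry(String id, String marker) {
        return "{\"id\":" + quote(id) + "," + quote(MARKER_FIELD)
                + ":{\"set\":" + quote(marker) + "}}";
    }

    /** A JSON array with one atomic-update entry per document id. */
    static String body(List<String> ids, String marker) {
        return ids.stream()
                .map(id -> entry(id, marker))
                .collect(Collectors.joining(",", "[", "]"));
    }

    public static void main(String[] args) {
        // Prints the request body for two documents.
        System.out.println(body(List.of("doc1", "doc2"), "2020-02-27"));
    }
}
```

Using a "set" on a throwaway field keeps the rest of the document unchanged while still forcing Solr to re-run the indexing chain for it.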
I’m planning on pushing up a GitHub repo with this function.

Eric

> On Feb 27, 2020, at 8:01 AM, David '-1' Schmid <david.sch...@vis.uni-stuttgart.de> wrote:
>
> Hello again!
>
> On 25.02.20 22:39, David Smiley wrote:
>> I haven't worked on streaming expressions yet but I did a little bit of
>> digging around. I think the ClassifyStream might be somewhat similar to
>> learn from. It takes a stream of docs, not unlike what you want. And
>> crucially it implements setStreamContext with an implementation which
>> demonstrates how to get access to a SolrCore. From a core, you can get a
>> SolrIndexSearcher. [...]
>
> That worked beautifully! Or let's say: I got it working, the code is not
> beautiful, as is.
> Would this be interesting/relevant enough to be adopted upstream?
>
> If so, should I open up a JIRA ticket?
>
> best regards,
> David
>
>> On Fri, Feb 21, 2020 at 8:05 AM David '-1' Schmid
>> <david.sch...@vis.uni-stuttgart.de> wrote:
>>
>> Hello dear developers!
>>
>> I've been wondering if I'd be able to adapt the current
>> TaggerRequestHandler for using it within the /stream request handler.
>> Starting out is a tad confusing, which I expected since I have almost no
>> experience with the solr/lucene codebase.
>>
>> My goal is as follows: I want to use the result of a previous
>> select(coll1, ...) as input for adding tags to the result document.
>> Possibly:
>>
>> tag(
>>   select(...), field_to_analyze_for_tags,
>>   collection_with_tag_dict, tag_dict_field,
>>   ...
>>   // remaining tagger configuration options
>> )
>>
>> I'm currently stuck at some steps in writing a
>> 'public class TaggerStream extends TupleStream implements Expressible'
>> at two points:
>>
>> == Problem 1: Getting 'terms' ==
>> The TaggerRequestHandler gets a SolrIndexSearcher via the request
>> > final SolrIndexSearcher searcher = req.getSearcher();
>> Which in turn is used to acquire the terms
>> > Terms terms = searcher.getSlowAtomicReader().terms(indexedField);
>> which are used for tagging.
>> I've tried finding something that will yield the equivalent, but as you
>> might have guessed: I didn't find anything so far.
>>
>> == Problem 2: Multiple Shards ==
>> I guess, this might come up sooner or later, hence this is related to
>> SOLR-14190 (requesting the tagger to work across multiple shards).
>> I suspect (mind: I really don't know) that acquiring the terms will have
>> to do something with that, at least when we need to merge the results
>> from multiple shards, but I have not yet found any code that does that.
>> Might have been blinded by my confusion, tho.
>>
>> I'd be thankful if someone can help with any pointers regarding this.
>>
>> best regards,
>> David
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org

_______________________
Eric Pugh | Founder & CEO | OpenSource Connections, LLC | 434.466.1467 | http://www.opensourceconnections.com | My Free/Busy <http://tinyurl.com/eric-cal>
Co-Author: Apache Solr Enterprise Search Server, 3rd Ed <https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>
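
The tag(select(...), field_to_analyze_for_tags, ...) idea from the original question is a decorator: it wraps an inner stream of tuples and adds tag information to each one. The following is a rough, Solr-free sketch of that shape only. It does not extend TupleStream or use the Lucene-based tagger; tuples are plain maps, the tag dictionary is an in-memory set of lowercase terms standing in for the terms of the tag collection, and matching is naive single-token lookup. All class, field, and parameter names here are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Wraps an inner stream of tuples and adds a "tags" field to each tuple. */
final class TagDecorator implements Iterator<Map<String, Object>> {
    private final Iterator<Map<String, Object>> inner; // stands in for select(...)
    private final String fieldToAnalyze;               // field whose text is scanned
    private final Set<String> tagDictionary;           // lowercase terms to look for

    TagDecorator(Iterator<Map<String, Object>> inner,
                 String fieldToAnalyze,
                 Set<String> tagDictionary) {
        this.inner = inner;
        this.fieldToAnalyze = fieldToAnalyze;
        this.tagDictionary = tagDictionary;
    }

    @Override
    public boolean hasNext() {
        return inner.hasNext();
    }

    @Override
    public Map<String, Object> next() {
        // Copy the incoming tuple so the inner stream's data is not mutated.
        Map<String, Object> tuple = new LinkedHashMap<>(inner.next());
        String text = String.valueOf(tuple.getOrDefault(fieldToAnalyze, ""));
        List<String> tags = new ArrayList<>();
        // Naive tokenization: lowercase, split on non-word characters.
        for (String token : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (tagDictionary.contains(token) && !tags.contains(token)) {
                tags.add(token);
            }
        }
        tuple.put("tags", tags);
        return tuple;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = Map.of("id", "1", "text", "Visit Stuttgart, then Berlin.");
        TagDecorator tagged = new TagDecorator(
                List.of(doc).iterator(), "text", Set.of("stuttgart", "berlin"));
        while (tagged.hasNext()) {
            System.out.println(tagged.next());
        }
    }
}
```

In a real streaming expression the dictionary lookup would be replaced by the tagger working against the Terms obtained from a SolrIndexSearcher, which is the part the thread discusses reaching through the stream context.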