Thanks Erick, at first glance I didn't understand your suggestion. But with the terms sorted in index order it makes sense, it absolutely makes sense :))) Thanks for the suggestion, adding the prefix is very easy to implement.
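
For the archives, a minimal sketch of the approach Erick describes in the quoted message: page through both fields in index order with terms.lower and advance the two sorted lists in step, keeping the terms that appear only in long_title. This is an illustration under assumptions, not a tested client: the base URL, the /terms handler path, and all function names are hypothetical, and the response is assumed to use Solr's flat JSON layout ([term, count, term, count, ...]). It also assumes Solr's index order matches Python's string comparison (code point order).

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Iterator, List

# A page fetcher takes (field, lower, limit) and returns up to `limit`
# terms strictly greater than `lower`, in index order.
FetchPage = Callable[[str, str, int], List[str]]


def solr_fetch_page(base_url: str) -> FetchPage:
    """Build a fetcher for a Solr /terms handler.

    `base_url` (e.g. "http://localhost:8983/solr/mycore") and the
    "/terms" path are assumptions about the local setup.
    """
    def fetch(field: str, lower: str, limit: int) -> List[str]:
        params = {
            "terms": "true",
            "terms.fl": field,
            "terms.sort": "index",       # index order, not count order
            "terms.limit": str(limit),
            "wt": "json",
        }
        if lower:
            params["terms.lower"] = lower
            params["terms.lower.incl"] = "false"  # skip the last term seen
        url = base_url + "/terms?" + urllib.parse.urlencode(params)
        with urllib.request.urlopen(url) as resp:
            data = json.load(resp)
        terms = data["terms"][field]
        # Assumed flat layout: [term, count, term, count, ...].
        # A map layout (json.nl=map) is handled as a fallback.
        return list(terms) if isinstance(terms, dict) else terms[0::2]
    return fetch


def iter_terms(fetch: FetchPage, field: str, page_size: int = 1000,
               start: str = "") -> Iterator[str]:
    """Yield the terms of `field` greater than `start`, one page at a time."""
    lower = start
    while True:
        page = fetch(field, lower, page_size)
        if not page:
            return
        yield from page
        if len(page) < page_size:
            return
        lower = page[-1]


def terms_only_in(fetch: FetchPage, only_field: str, other_field: str,
                  wanted: int, page_size: int = 1000,
                  start: str = "") -> List[str]:
    """Return up to `wanted` terms present in `only_field` but not in
    `other_field`, by advancing the two sorted term streams together."""
    other = iter_terms(fetch, other_field, page_size, start)
    current_other = next(other, None)
    found: List[str] = []
    for term in iter_terms(fetch, only_field, page_size, start):
        # Advance the other stream until it catches up with `term`.
        while current_other is not None and current_other < term:
            current_other = next(other, None)
        if current_other != term:
            found.append(term)
            if len(found) >= wanted:
                break
    return found
```

Something like terms_only_in(solr_fetch_page(base_url), "long_title", "short_title", wanted=20, start="m") would then sample terms after "m"; a term exactly equal to `start` is skipped in both fields. The results are indexed terms, so they reflect lower casing, stemming and so on.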
On Mon, Jul 16, 2018 at 4:34 PM Erick Erickson <erickerick...@gmail.com> wrote:

> There's no real way I know of to do what you want except to use
> TermsComponent.
>
> Note that you don't have to extract all of them, just advance the two
> lists until you find enough terms in long_title that aren't in
> short_title, extract, say, 1,000 terms at a time.
>
> You can also start with various prefixes (even individual letters) to
> get some from different places. Basically you're paginating through
> terms by using terms.lower.
>
> Do note, though, that you get the _indexed_ term after all
> transformations, say lower casing, WordDelimiter(Graph)FilterFactory,
> stemming etc.
>
> Best,
> Erick
>
> On Mon, Jul 16, 2018 at 7:22 AM, Vincenzo D'Amore <v.dam...@gmail.com>
> wrote:
> > Hi all,
> >
> > I have a question for you, Solr Gurus :)
> >
> > There is an index with two fields: short_title and long_title.
> > As the field names suggest, these two fields are very similar; the
> > long title just has more terms in it.
> >
> > So, looking at all the documents I have in the index, I would like to
> > extract all the terms that are present only in long_title.
> >
> > Could you suggest, if it is possible, how to solve this problem?
> >
> > I've tried the terms component, and it should return all the terms
> > present in a field, but what happens when I have millions of terms?
> >
> > I thought of using the terms component or Luke, but the only doable
> > way I've found is to download the entire list of terms present in
> > both fields and remove every term that is present in both lists.
> >
> > I need this because I would like to write a test that tries a few
> > terms present only in long_title.
> >
> > Thanks for your time,
> > Vincenzo
> >
> > --
> > Vincenzo D'Amore
>

--
Vincenzo D'Amore