Thanks Erick, at first glance I didn't understand your suggestion.
But after trying to sort the terms per index, it makes sense, it absolutely
makes sense :)))
Thanks for the suggestion; adding the prefix is very easy to implement.
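
For the record, here is a minimal Python sketch of the approach quoted below:
page through each field's terms in index order and advance the two sorted
lists together. It is only an illustration under stated assumptions, not a
tested client. The names fetch_page, iter_terms, terms_only_in_first and
make_fake_fetch are hypothetical. Against a real Solr, fetch_page would
presumably wrap a TermsComponent request using terms.fl, terms.lower,
terms.lower.incl=false, terms.limit and terms.sort=index; here it is replaced
by an in-memory stand-in so the sketch runs on its own. It also assumes plain
string terms, so that Python string comparison matches the index order.

```python
from typing import Callable, Dict, Iterator, List, Optional

# Hypothetical callable: (field, lower, limit) -> up to `limit` terms of `field`
# in index order, all strictly greater than `lower` (None means no lower bound).
FetchPage = Callable[[str, Optional[str], int], List[str]]


def iter_terms(fetch_page: FetchPage, field: str, page_size: int = 1000) -> Iterator[str]:
    """Yield every term of `field` in index order, one page at a time."""
    lower: Optional[str] = None
    while True:
        page = fetch_page(field, lower, page_size)
        if not page:
            return
        yield from page
        lower = page[-1]  # the next page starts strictly after the last term seen


def terms_only_in_first(first: Iterator[str], second: Iterator[str], wanted: int) -> List[str]:
    """Advance two sorted term streams together; collect terms missing from `second`."""
    found: List[str] = []
    other = next(second, None)
    for term in first:
        # Skip terms of `second` that sort before the current term.
        while other is not None and other < term:
            other = next(second, None)
        if other != term:
            found.append(term)
            if len(found) >= wanted:
                break  # stop early: no need to read the remaining terms
    return found


def make_fake_fetch(index: Dict[str, List[str]]) -> FetchPage:
    """In-memory stand-in for the real request (assumption, for illustration only)."""
    def fetch(field: str, lower: Optional[str], limit: int) -> List[str]:
        terms = sorted(index[field])
        if lower is not None:
            terms = [t for t in terms if t > lower]
        return terms[:limit]
    return fetch


index = {
    "short_title": ["red", "shoes"],
    "long_title": ["leather", "red", "running", "shoes"],
}
fetch = make_fake_fetch(index)
result = terms_only_in_first(
    iter_terms(fetch, "long_title", page_size=2),
    iter_terms(fetch, "short_title", page_size=2),
    wanted=10,
)
print(result)
```

Because both streams are lazy generators, only as many pages are requested as
are needed to collect `wanted` terms. Note that the terms returned are the
indexed ones, after analysis.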



On Mon, Jul 16, 2018 at 4:34 PM Erick Erickson <erickerick...@gmail.com>
wrote:

> There's no real way I know of to do what you want except to use
> TermsComponent.
>
> Note that you don't have to extract all of them: just advance the two
> lists until you find
> enough terms in long_title that aren't in short_title, extracting, say,
> 1,000 terms at a time.
>
> You can also start with various prefixes (even individual letters) to
> get some from
> different places. Basically you're paginating through terms by using
> terms.lower.
>
> Do note, though, that you get the _indexed_ term after all
> transformations, say lower
> casing, WordDelimiter(Graph)FilterFactory, stemming etc.
>
> Best,
> Erick
>
> On Mon, Jul 16, 2018 at 7:22 AM, Vincenzo D'Amore <v.dam...@gmail.com>
> wrote:
> > Hi all,
> >
> > I have a question for you, Solr Gurus :)
> >
> > There is an index with two fields: short_title and long_title.
> > As the field names suggest, these two fields are very similar; the long
> > title just has more terms in it.
> >
> > So, looking at all the documents I have in the index, I would like to
> > extract all the terms that are present in long_title only.
> >
> > Could you suggest, if it is possible, how to solve this
> > problem?
> >
> > I've tried the terms component, and it should return all the terms
> > present in a field, but what happens when I have millions of terms?
> >
> > I thought of using the terms component or Luke, but the only workable way I've
> > found is to download the entire list of terms present in both fields and
> > remove every term that is present in both lists.
> >
> > I need this because I would like to write a test that tries a few terms
> > present only in long_title.
> >
> > Thanks for your time,
> > Vincenzo
> >
> > --
> > Vincenzo D'Amore
>


-- 
Vincenzo D'Amore
