For the test, can't you just use synthetic data where you know the terms from the start?
Otherwise, maybe something from streaming expressions will help, but that needs SolrCloud.

Regards,
   Alex

On Mon, Jul 16, 2018, 10:22 AM Vincenzo D'Amore, <v.dam...@gmail.com> wrote:

> Hi all,
>
> I have a question for you, Solr Gurus :)
>
> There is an index with two fields: short_title and long_title.
> As the field names suggest, these two fields are very similar; the long
> title just has more terms in it.
>
> So, looking at all the documents I have in the index, I would like to
> extract all the terms that are present in long_title only.
>
> Could you suggest, if it is possible, how to solve this problem?
>
> I've tried the terms component, and it should return all the terms
> present in a field, but what happens when I have millions of terms?
>
> I thought of using the terms component or Luke, but the only doable way
> I've found is to download the entire list of terms present in each field
> and remove every term that is present in both lists.
>
> I need this because I would like to write a test that tries a few terms
> present only in long_title.
>
> Thanks for your time,
> Vincenzo
>
> --
> Vincenzo D'Amore
>
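
To make the synthetic-data suggestion concrete, here is a minimal sketch in Python. It does not talk to Solr at all: it only builds documents whose long_title is the short_title plus one known extra term, then takes the set difference of the two term lists, which is the same "remove what is in both lists" step described in the quoted message. The vocabularies, the helper names (tokenize, field_terms, make_synthetic_docs) and the whitespace tokenizer are all assumptions for illustration; a real Solr analyzer chain may split and normalize terms differently.

```python
import random

# Hypothetical vocabularies: BASE_VOCAB feeds both fields,
# EXTRA_VOCAB is only ever appended to long_title.
BASE_VOCAB = ["solr", "lucene", "search", "index", "query"]
EXTRA_VOCAB = ["extended", "edition", "unabridged"]


def tokenize(text):
    # Assumption: lowercase + whitespace split stands in for the real analyzer.
    return text.lower().split()


def field_terms(docs, field):
    # Collect the distinct terms seen in one field across all documents.
    terms = set()
    for doc in docs:
        terms.update(tokenize(doc[field]))
    return terms


def make_synthetic_docs(n, seed=0):
    # Build n documents where long_title = short_title + one extra term.
    rng = random.Random(seed)
    docs = []
    for i in range(n):
        short = " ".join(rng.sample(BASE_VOCAB, 2))
        long_title = short + " " + rng.choice(EXTRA_VOCAB)
        docs.append({"id": str(i), "short_title": short, "long_title": long_title})
    return docs


docs = make_synthetic_docs(50)
# Terms that occur in long_title but never in short_title.
long_only = field_terms(docs, "long_title") - field_terms(docs, "short_title")
print(sorted(long_only))
```

Since the extra terms are known up front, the test can index these documents and query long_title for any term in long_only without first pulling the full term lists back out of the index.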