If you want to ignore a field being sent to Solr, you can set indexed=false and stored=false for that field in schema.xml. It will take up room in schema.xml but zero room on disk.
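
A minimal sketch of such a field definition, assuming a hypothetical field named "legacy_notes" (not taken from the thread). The default Solr configsets ship an "ignored" field type along these lines that also sets docValues="false"; if the field type you pick enables docValues by default, that probably needs switching off as well for the field to take no room on disk.

```xml
<!-- Hypothetical field name: accepted on input, but neither indexed nor stored -->
<field name="legacy_notes" type="string" indexed="false" stored="false"
       docValues="false" multiValued="true"/>
```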

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Sep 17, 2020, at 10:23 AM, Alexandre Rafalovitch <arafa...@gmail.com> wrote:
>
> Solr has a whole pipeline that you can run during document ingestion,
> before the actual indexing happens. It is called Update Request
> Processor (URP) and is defined in solrconfig.xml or in an override
> file. Obviously, since you are indexing from a SolrJ client, you have
> even more flexibility, but it is good to know about anyway.
>
> You can read all about it at:
> https://lucene.apache.org/solr/guide/8_6/update-request-processors.html
> and see the extensive list of processors you can leverage. The specific
> one mentioned is this:
> https://lucene.apache.org/solr/8_6_0//solr-core/org/apache/solr/update/processor/StatelessScriptUpdateProcessorFactory.html
>
> Just a word of warning: the stateless script URP uses JavaScript, which
> is becoming a bit of a complicated story as the underlying JVM is
> upgraded (Oracle dropped their JavaScript engine in JDK 15). So if one
> of the simpler URPs, or a chain of them, will do the job, that may be a
> better path to take.
>
> Regards,
>    Alex.
>
> On Thu, 17 Sep 2020 at 13:13, Steven White <swhite4...@gmail.com> wrote:
>
>> Thanks Erick. Where can I learn more about "stateless script update
>> processor factory"? I don't know what you mean by this.
>>
>> Steven
>>
>> On Thu, Sep 17, 2020 at 1:08 PM Erick Erickson <erickerick...@gmail.com> wrote:
>>
>>> 1000 fields is fine, you'll waste some cycles on bookkeeping, but I
>>> really doubt you'll notice. That said, are these fields used for
>>> searching? Because you do have control over what goes into the index
>>> if you can put a "stateless script update processor factory" in your
>>> update chain. There you can do whatever you want, including combine
>>> all the fields into one and delete the original fields. There's no
>>> point in having your index cluttered with unused fields. OTOH, it may
>>> not be worth the effort just to satisfy my sense of aesthetics 😉
>>>
>>> On Thu, Sep 17, 2020, 12:59 Steven White <swhite4...@gmail.com> wrote:
>>>
>>>> Hi Eric,
>>>>
>>>> Yes, this is coming from a DB. Unfortunately I have no control over
>>>> the list of fields. Out of the 1000 fields that there may be, no
>>>> document that gets indexed into Solr will use more than about 50,
>>>> and since I'm copying the values of those fields to the catch-all
>>>> field and the catch-all field is my default search field, I don't
>>>> expect any problem from having 1000 fields in Solr's schema, or
>>>> should I?
>>>>
>>>> Thanks
>>>>
>>>> Steven
>>>>
>>>> On Thu, Sep 17, 2020 at 8:23 AM Erick Erickson <erickerick...@gmail.com> wrote:
>>>>
>>>>> “there over 1000 of them[fields]”
>>>>>
>>>>> This is often a red flag in my experience. Solr will handle that
>>>>> many fields, I’ve seen many more. But this is often a result of
>>>>> “database thinking”, i.e. your mental model of how all this data
>>>>> is from a DB perspective rather than a search perspective.
>>>>>
>>>>> It’s unwieldy to have that many fields. Obviously I don’t know the
>>>>> particulars of your app, and maybe that’s the best design.
>>>>> Particularly if many of the fields are sparsely populated, i.e.
>>>>> only a small percentage of the documents in your corpus have any
>>>>> value for that field, then taking a step back and looking at the
>>>>> design might save you some grief down the line.
>>>>>
>>>>> For instance, I’ve seen designs where instead of
>>>>>   field1:some_value
>>>>>   field2:other_value….
>>>>>
>>>>> you use a single field with _tokens_ like:
>>>>>   field:field1_some_value
>>>>>   field:field2_other_value
>>>>>
>>>>> that drops the complexity and increases performance.
>>>>>
>>>>> Anyway, just a thought you might want to consider.
>>>>>
>>>>> Best,
>>>>> Erick
>>>>>
>>>>>> On Sep 16, 2020, at 9:31 PM, Steven White <swhite4...@gmail.com> wrote:
>>>>>>
>>>>>> Hi everyone,
>>>>>>
>>>>>> I figured it out. It is as simple as creating a List<String> and
>>>>>> using that as the value part for the SolrInputDocument.addField()
>>>>>> API.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Steven
>>>>>>
>>>>>> On Wed, Sep 16, 2020 at 9:13 PM Steven White <swhite4...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi everyone,
>>>>>>>
>>>>>>> I want to avoid creating a <copyField dest="CatchAll"
>>>>>>> source="OneFieldOfMany"/> in my schema (there will be over 1000
>>>>>>> of them and maybe more, so managing it will be a pain). Instead,
>>>>>>> I want to use the SolrJ API to do what <copyField/> does. Any
>>>>>>> example of how I can do this? If there is an example online,
>>>>>>> that would be great.
>>>>>>>
>>>>>>> Thanks in advance.
>>>>>>>
>>>>>>> Steven
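
For readers landing on this thread, here is a rough sketch of the two client-side ideas discussed above: collecting a row's values into a List<String> to use as the catch-all value, and building single-field tokens of the form field1_some_value. The class name, method names and sample column names are hypothetical, and the code uses only the JDK so it runs on its own; the SolrJ calls are shown as comments and assume SolrJ is on the classpath and that the catch-all field is declared multiValued in the schema.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class CatchAllExample {

    // Field name taken from the thread's <copyField dest="CatchAll" .../> example.
    static final String CATCH_ALL_FIELD = "CatchAll";

    /** Collects the non-null values of one row as strings, in column order. */
    static List<String> collectCatchAllValues(Map<String, Object> row) {
        List<String> values = new ArrayList<>();
        for (Object value : row.values()) {
            if (value != null) {
                values.add(String.valueOf(value));
            }
        }
        return values;
    }

    /** Builds "fieldName_value" tokens for the single-field design, skipping nulls. */
    static List<String> toFieldTokens(Map<String, Object> row) {
        List<String> tokens = new ArrayList<>();
        for (Map.Entry<String, Object> column : row.entrySet()) {
            if (column.getValue() != null) {
                tokens.add(column.getKey() + "_" + column.getValue());
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Hypothetical database row; LinkedHashMap keeps the column order.
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("title", "Solr in practice");
        row.put("pages", 320);
        row.put("subtitle", null); // sparsely populated column, skipped below

        List<String> catchAll = collectCatchAllValues(row);
        System.out.println(CATCH_ALL_FIELD + " = " + catchAll);
        System.out.println("tokens = " + toFieldTokens(row));

        // With SolrJ available, the list would be passed as the field value,
        // which yields a multi-valued field:
        //   SolrInputDocument doc = new SolrInputDocument();
        //   doc.addField(CATCH_ALL_FIELD, catchAll);
    }
}
```

Whether the original columns are also added to the document as separate fields, or only the catch-all or token field is sent, depends on whether anything needs to search or display them individually.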