Guess it depends on how many scripts you want to maintain and how much you want to do by hand, but in any case it is the best route: multiple indexing services/processes, and skip the DIH altogether. It wasn't that great an idea in the first place. It was super clever, and I appreciate the work that went into it, but even something as basic as a SQL query with the right joins -> CSV -> Solr file input could replace it.
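A minimal sketch of that joins -> CSV -> Solr pipeline, assuming placeholder rows standing in for the joined SQL result and a hypothetical `books` core (all table, field, and core names here are illustrative, not from the thread):

```python
import csv
import io
# import urllib.request  # only needed for the actual POST below

# Rows as they might come back from a "SELECT ... JOIN ..." (placeholder data).
rows = [
    {"id": "1", "title": "First", "author": "Smith"},
    {"id": "2", "title": "Second", "author": "Jones"},
]

def rows_to_csv(rows):
    """Serialize dict rows to a CSV string with a header row,
    the shape Solr's CSV update handler expects."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

payload = rows_to_csv(rows)

# POST the CSV to Solr's update handler. Commented out because it needs a
# live server; the host and core name are assumptions.
# req = urllib.request.Request(
#     "http://localhost:8983/solr/books/update?commit=true",
#     data=payload.encode("utf-8"),
#     headers={"Content-Type": "application/csv"},
# )
# urllib.request.urlopen(req)
```

The same payload could just as easily come from a one-liner that pipes the SQL client's CSV output into curl; the point is that the database does the joins and Solr only ever sees flat rows.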
On Fri, Jul 22, 2022 at 2:50 PM Andy Lester <a...@petdance.com> wrote:
>
> > On Jul 22, 2022, at 1:39 PM, Dave <hastings.recurs...@gmail.com> wrote:
> >
> > Oooooh look into Perl's fork manager module,
> >
> > https://metacpan.org/pod/Parallel::ForkManager
>
> I'm aware of the numerous tools like that (I've been doing Perl since the
> 90s https://metacpan.org/author/PETDANCE), but for as often as we have to
> do the full import (maybe every couple of months on a schema change) it was
> easier to just assign 1/10th of the records to each of ten updaters that
> run concurrently. For the normal day-to-day incremental updates, our
> updater runs every five or ten minutes and sends them to Solr.
>
> The other huge win was getting core swapping working. Build the new core
> with the new schema, index it for an hour, and swap old with new. So
> nice. No downtime for schema changes.
>
> Andy
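The "assign 1/10th of the records to each of ten updaters" scheme Andy describes can be sketched as a simple modulo partition (the record IDs and the helper name `partition` are illustrative, not from his actual scripts):

```python
NUM_UPDATERS = 10

def partition(record_ids, n=NUM_UPDATERS):
    """Split numeric record IDs into n buckets by ID modulo n;
    each concurrent updater process then indexes one bucket."""
    buckets = [[] for _ in range(n)]
    for rid in record_ids:
        buckets[rid % n].append(rid)
    return buckets

# Ten roughly equal buckets; updater k handles buckets[k].
buckets = partition(range(100))
```

The no-downtime schema change he mentions maps to Solr's CoreAdmin SWAP action (`/solr/admin/cores?action=SWAP&core=<new>&other=<live>`): build and index the new core in the background, then swap the names so queries hit the new core atomically.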