Awesome, thanks for the suggestions, Pat. I'm using real-time indices for some models... I did find that initial indexing with real-time for 10m records takes a very long time ;-)
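For anyone who finds this thread later, our real-time setup looks roughly like the sketch below (model and attribute names here are hypothetical, not our actual schema). The key caveat Pat mentioned applies: updates only reach the index via the ActiveRecord callback, so bulk inserts that bypass callbacks won't be indexed.

```ruby
# app/indices/product_index.rb -- a real-time index definition
ThinkingSphinx::Index.define :product, :with => :real_time do
  indexes name, description
  has category_id, :type => :integer
  has updated_at,  :type => :timestamp
end

# app/models/product.rb
class Product < ApplicationRecord
  # Real-time index updates happen through this callback, so any
  # writes that skip ActiveRecord callbacks will miss the index.
  after_save ThinkingSphinx::RealTime.callback_for(:product)
end
```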
Sharding may be our best approach. Thanks again!

Jeremy

On Monday, February 8, 2021 at 9:31:16 PM UTC-8 Pat Allan wrote:
> Hi Jeremy,
>
> It’s great to hear that TS and Sphinx are still working well for you :)
>
> In terms of expanding its scale in a reliable sense, I’ve a couple of thoughts:
>
> - You may want to consider real-time indices instead of SQL-backed indices. This removes the need for deltas and merging, and thus for full reindex calls. That said, this only works if all updates/inserts are done in ways that invoke ActiveRecord callbacks, as that’s how the real-time updates happen.
> - Sharding your larger models may also be an option? Especially if there are clear boundaries between what’s being updated - the more static records can be in certain shards that don’t require the full reindex, whereas the more frequent changes are kept to other indices that get reprocessed daily. This would keep the reprocessing time down.
>
> These two approaches could be used together, too - sharded real-time indices.
>
> As to whether there’s an upper limit of records - not that I’m aware of, but I’m not the best person to ask. It may be worth asking on Sphinx’s own forum, and/or through the Manticore team’s channels as well (given they’re a fork of Sphinx that seems to be getting far more frequent updates - and it works as a drop-in replacement for Sphinx, so Thinking Sphinx doesn’t complain at all).
>
> Hope this helps!
>
> —
> Pat
>
> On 6 Feb 2021, at 7:28 am, '[email protected]' via Thinking Sphinx <[email protected]> wrote:
>
> We've been using Sphinx and Thinking Sphinx since 2008 and it's always been amazing.
>
> We generally have ~1m records indexed across a variety of fields and attributes, using it to filter users' selections. It's super fast and works great.
>
> We're also using it at a larger scale, where the records are now around 10m and growing. Indexing itself is time-consuming on this one.
> We're also using a delta index and ts:merge, but we do need to re-index once each day for updated caches.
>
> Do we see an upper end to the number of records we can index?
>
> Thanks!
> Jeremy

--
You received this message because you are subscribed to the Google Groups "Thinking Sphinx" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/thinking-sphinx/e97a5d3d-39fb-42f2-ae15-4192907f3640n%40googlegroups.com.
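One more note on the sharding idea, in case it helps others weighing it: the routing side can be as simple as hashing each record into a fixed set of per-shard index names, so only the "hot" shards need daily reprocessing. This is just a plain-Ruby sketch with made-up names, not anything Thinking Sphinx provides itself:

```ruby
# Sketch only: route records across a fixed number of shards so that
# frequently-changing records can be isolated into shards that get
# reprocessed daily, while static shards are rarely rebuilt.
# SHARD_COUNT and the index-name pattern are hypothetical.
SHARD_COUNT = 4

def shard_index_for(record_id)
  "product_shard_#{record_id % SHARD_COUNT}"
end
```

Static records could instead be routed by an explicit attribute (e.g. an "archived" flag) into a dedicated shard, rather than by id.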
