Awesome, thanks for the suggestions, Pat. I'm using real-time indices for some models... I did find that initial indexing with real-time for 10m records takes a very long time ;-)
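For anyone who finds this thread later, our real-time setup looks roughly like the sketch below (model and attribute names here are hypothetical, not our actual schema). The key caveat Pat mentioned applies: updates only reach the index via the ActiveRecord callback, so bulk inserts that bypass callbacks won't be indexed.

```ruby
# app/indices/product_index.rb -- a real-time index definition
ThinkingSphinx::Index.define :product, :with => :real_time do
  indexes name, description
  has category_id, :type => :integer
  has updated_at,  :type => :timestamp
end

# app/models/product.rb
class Product < ApplicationRecord
  # Real-time index updates happen through this callback, so any
  # writes that skip ActiveRecord callbacks will miss the index.
  after_save ThinkingSphinx::RealTime.callback_for(:product)
end
```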
Sharding may be our best approach. Thanks again!

Jeremy

On Monday, February 8, 2021 at 9:31:16 PM UTC-8 Pat Allan wrote:
> Hi Jeremy,
>
> It’s great to hear that TS and Sphinx are still working well for you :)
>
> In terms of expanding its scale in a reliable sense, I’ve a couple of thoughts:
>
> - You may want to consider real-time indices instead of SQL-backed indices. This removes the need for deltas and merging, and thus for full reindex calls. That said, this only works if all updates/inserts are done in ways that invoke ActiveRecord callbacks, as that’s how the real-time updates happen.
> - Sharding your larger models may also be an option? Especially if there are clear boundaries between what’s being updated - the more static records can be in certain shards that don’t require the full reindex, whereas the more frequent changes are kept to other indices that get reprocessed daily. This would keep the reprocessing time down.
>
> These two approaches could be used together, too - sharded real-time indices.
>
> As to whether there’s an upper limit of records - not that I’m aware of, but I’m not the best person to ask. It may be worth asking on Sphinx’s own forum, and/or through the Manticore team’s channels as well (given they’re a fork of Sphinx that seems to be getting far more frequent updates - and it works as a drop-in replacement for Sphinx, so Thinking Sphinx doesn’t complain at all).
>
> Hope this helps!
>
> —
> Pat
>
> On 6 Feb 2021, at 7:28 am, '[email protected]' via Thinking Sphinx <[email protected]> wrote:
>
> We've been using Sphinx and Thinking Sphinx since 2008 and it's always been amazing.
>
> We generally have ~1m records indexed across a variety of fields and attributes, using it to filter users' selections. It's super fast and works great.
>
> We're also using it at a larger scale, where the records are now around 10m and growing. Indexing itself is time-consuming on this one.
> We're also using a delta index and ts:merge, but we do need to re-index once each day for updated caches.
>
> Do we see an upper end to the number of records we can index?
>
> Thanks!
> Jeremy

--
You received this message because you are subscribed to the Google Groups "Thinking Sphinx" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/thinking-sphinx/e97a5d3d-39fb-42f2-ae15-4192907f3640n%40googlegroups.com.
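One more note on the sharding idea, in case it helps others weighing it: the routing side can be as simple as hashing each record into a fixed set of per-shard index names, so only the "hot" shards need daily reprocessing. This is just a plain-Ruby sketch with made-up names, not anything Thinking Sphinx provides itself:

```ruby
# Sketch only: route records across a fixed number of shards so that
# frequently-changing records can be isolated into shards that get
# reprocessed daily, while static shards are rarely rebuilt.
# SHARD_COUNT and the index-name pattern are hypothetical.
SHARD_COUNT = 4

def shard_index_for(record_id)
  "product_shard_#{record_id % SHARD_COUNT}"
end
```

Static records could instead be routed by an explicit attribute (e.g. an "archived" flag) into a dedicated shard, rather than by id.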
