Great to hear you’ve made some progress on this issue - hopefully the database challenges are easy enough to solve!
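One way to narrow down where the slave falls short of the master is to compare counts per id range rather than for the whole table. The following is only a sketch under stated assumptions: `mismatched_ranges`, `master_count` and `slave_count` are hypothetical names, not part of Thinking Sphinx or Rails, and the in-memory arrays stand in for running a COUNT(*) with the same status_id condition against each server.

```ruby
# Hypothetical helper: walk the id space in fixed-size ranges and report the
# ranges where the two sources disagree. `master_count` and `slave_count` are
# callables taking (first_id, last_id) and returning a row count; in practice
# each would run a COUNT(*) query limited to that id range on one server.
def mismatched_ranges(max_id, step, master_count, slave_count)
  mismatches = []
  first_id = 1
  while first_id <= max_id
    last_id = [first_id + step - 1, max_id].min
    on_master = master_count.call(first_id, last_id)
    on_slave = slave_count.call(first_id, last_id)
    mismatches << [first_id, last_id, on_master - on_slave] if on_master != on_slave
    first_id += step
  end
  mismatches
end

# Stand-in data so the sketch runs without a database (assumption: ids 42, 43
# and 97 never reached the slave).
master_ids = (1..100).to_a
slave_ids = master_ids - [42, 43, 97]

# Builds a counting callable over an in-memory list of ids.
count_in = lambda do |ids|
  lambda { |first_id, last_id| ids.count { |id| id >= first_id && id <= last_id } }
end

result = mismatched_ranges(100, 25, count_in.call(master_ids), count_in.call(slave_ids))
result.each do |first_id, last_id, missing|
  puts "ids #{first_id}..#{last_id}: slave is missing #{missing} rows"
end
```

Halving the step inside each reported range would narrow the gap down to individual ids, which may hint at when replication started dropping rows.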
— Pat

> On 2 Dec. 2016, at 12:00 am, Simon <[email protected]> wrote:
>
> The value for Status.active.id is the same as it has been. I also double
> checked this and older entries included in the index have the same status_id
> as the newer entries not being included.
>
> The configuration file is being updated when I run a full ts:index or
> ts:rebuild.
>
> The sql_query value is currently this:
> sql_query = SELECT SQL_NO_CACHE `entries`.`id` * 1 + 0 AS `id` ,
> `entries`.`title` AS `title`, `entries`.`body` AS `body`, `entries`.`id` AS
> `sphinx_internal_id`, 3940594292 AS `class_crc`, 0 AS `sphinx_deleted`,
> `entries`.`journal_id` AS `journal_id`,
> UNIX_TIMESTAMP(`entries`.`created_at`) AS `created_at`,
> UNIX_TIMESTAMP(`entries`.`opened_at`) AS `opened_at`, `entries`.`status_id`
> AS `status_id` FROM `entries` WHERE `entries`.`id` >= $start AND
> `entries`.`id` <= $end AND status_id = 1 GROUP BY `entries`.`id` ORDER BY
> NULL
>
> If I change that to simply do a 'SELECT COUNT(*) FROM `entries` WHERE
> status_id = 1 GROUP BY `entries`.`id` ORDER BY NULL' the count I get back is
> 19,940,635. Which actually matches the number of records that I'm seeing on
> the ts:reindex call.
>
> The DB I run the indexing and searching on is actually a slave DB, and when I
> run the same query on the master I get back a much larger number
> (20,527,906). So it seems the issue may not be with Sphinx at all, but with
> my master-slave setup (which is even more confusing because the slave status
> shows as 0 seconds behind the master). Anyways, looks like I need to turn my
> attention to non-sphinx issues.
>
> Thanks again Pat for taking the time to respond, and for getting me thinking
> about some different possibilities. It's greatly appreciated!
>
> Cheers,
> Simon
>
>
> On Wednesday, 30 November 2016 07:47:18 UTC-5, Pat Allan wrote:
> Hmm, this is certainly an odd one!
> Is it possible the Status.active.id value has changed?
>
> The ts:reindex task *only* reindexes the data. ts:index, however, will both
> regenerate the configuration file and reindex the data. Given you’ve been
> running the former, that would explain why the delta indices are still
> present in the generated configuration file. That said, running ts:rebuild
> should regenerate the configuration file as well, so I’m wondering if that
> file isn’t being updated for some reason? So, that’s where my next focus for
> debugging on the server would be…
>
> … and if it *is* regenerating correctly, and the deltas are now removed, the
> next question is: is the generated sql_query value for the source correct?
> Can you take that query and modify it to use COUNT(*) and confirm how many
> records it matches against?
>
> Also, just for reference: which version of the Thinking Sphinx gem are you
> using?
>
>> On 30 Nov. 2016, at 11:41 pm, Simon <[email protected]> wrote:
>>
>> Hi Pat,
>>
>> Thanks so much for the response! The number of records actually does not
>> appear to fully match, as a count on active entries returns 20,515,798.
>> Also, after having my normal cron job that re-indexes run this morning, I
>> noticed that the number of records collected is the exact same as before
>> (19,940,635). Is there a limit to the number of records Sphinx can handle,
>> or any other common scenarios that could be preventing new entries from
>> getting included in the indexing?
>>
>> As for the delta indexes, we actually removed these a while ago from the
>> index definition as they were causing some headaches. Our configuration
>> file still includes the delta block, but this has never seemed to be an
>> issue in indexing.
>> I could remove the delta info from the config file
>> (something I've actually been meaning to do), but I didn't want to introduce
>> more variables into what might have changed while trying to troubleshoot
>> this issue.
>>
>> Here is the search call, even though the record counts don't match, just in
>> case it is helpful at all in continuing to try and figure this out:
>>
>> filters = {
>>   :journal_id => journal_ids,
>>   :status_id => Status.active.id
>> }
>> # Check to see if we are ordering in a specific way
>> params[:order] ||= '@relevance DESC'
>> case params[:order]
>> when 'cad'
>>   order = 'created_at DESC'
>> when 'ca'
>>   order = 'created_at ASC'
>> else
>>   order = '@relevance DESC'
>> end
>> entries = Entry.search params[:criteria], :with => filters,
>>   :sort_mode => :extended, :order => order
>>
>> Thanks again,
>> Simon
>>
>> On Wednesday, 30 November 2016 07:29:36 UTC-5, Pat Allan wrote:
>> Hi Simon,
>>
>> I guess the first place I’d start is by verifying the number of records
>> you’re expecting Sphinx to index. The log you shared says 19,940,635 - does
>> that match Entry.count(:conditions => {:status_id => Status.active.id})?
>>
>> Also: the indexing output suggests there’s a delta index, but that’s not in
>> the index definition - removed for brevity?
>>
>> And if the counts match, then can you share the search call you’re running
>> to confirm newer records are not appearing?
>>
>> Cheers,
>>
>> —
>> Pat
>>
>>> On 30 Nov. 2016, at 12:40 am, Simon <[email protected]> wrote:
>>>
>>> Hi,
>>>
>>> I'm having an issue that just started recently. Indexing appears to
>>> complete successfully, but new entries are not appearing in search results
>>> (older entries appear).
>>>
>>> This seems to have started after I tried a rake ts:rebuild instead of what
>>> I normally used (rake ts:reindex). I have since switched back to a reindex,
>>> but still nothing new seems to be getting picked up.
>>> Unfortunately I am
>>> running an older version of Ruby (1.8.7), Rails (2.3.18) and Sphinx (Sphinx
>>> 1.10-beta (r2420)).
>>>
>>> My model definition is as follows:
>>>
>>> define_index do
>>>   indexes title
>>>   indexes body
>>>   has journal_id
>>>   has created_at, opened_at, status_id
>>>
>>>   where "status_id = #{Status.active.id}"
>>> end
>>>
>>> The output of calling rake ts:reindex is:
>>>
>>> Sphinx 1.10-beta (r2420)
>>> Copyright (c) 2001-2010, Andrew Aksyonoff
>>> Copyright (c) 2008-2010, Sphinx Technologies Inc (http://sphinxsearch.com)
>>>
>>> using config file
>>> '/home/ubuntu/rails/penzu/config/pandora_readonly.sphinx.conf'...
>>> indexing index 'entry_core'...
>>> collected 19940635 docs, 34535.3 MB
>>> WARNING: sort_hits: merge_block_size=224 kb too low, increasing mem_limit
>>> may improve performance
>>> sorted 6182.9 Mhits, 100.0% done
>>> total 19940635 docs, 34535317325 bytes
>>> total 16953.201 sec, 2037097 bytes/sec, 1176.21 docs/sec
>>> indexing index 'entry_delta'...
>>> collected 0 docs, 0.0 MB
>>> total 0 docs, 0 bytes
>>> total 0.155 sec, 0 bytes/sec, 0.00 docs/sec
>>> skipping non-plain index 'entry'...
>>> total 92992 reads, 1534.381 sec, 228.9 kb/call avg, 16.5 msec/call avg
>>> total 40583 writes, 192.954 sec, 1042.6 kb/call avg, 4.7 msec/call avg
>>> rotating indices: succesfully sent SIGHUP to searchd (pid=3802).
>>>
>>> So it all appears successful, but no new results appear. So for instance,
>>> if I add a new entry, and then call reindex, that entry is not found in
>>> search results. But an entry with the same search term from a month ago
>>> does appear in the results.
>>>
>>> I have tried a complete rake ts:index, I have tried deleting all of the
>>> generated index files (entry_core.spp, entry_core.spi, etc.) but nothing
>>> seems to make a difference.
>>> Does anybody have any ideas what might be
>>> happening here, or any other suggestions for what I can try?
>>>
>>> Thanks,
>>> Simon
>>>
>>> --
>>> You received this message because you are subscribed to the Google Groups
>>> "Thinking Sphinx" group.
>>> To unsubscribe from this group and stop receiving emails from it, send an
>>> email to [email protected].
>>> To post to this group, send email to [email protected].
>>> Visit this group at https://groups.google.com/group/thinking-sphinx.
>>> For more options, visit https://groups.google.com/d/optout.
