Hi Steve Thinking Sphinx uses left outer joins (because that's what ActiveRecord uses, and it's the best fit in this situation), and yes, one doc is one row.
There's nothing odd in your index definition - so let's have a look at the sql_query value of the source business_core_0, in config/development.sphinx.conf (or production - they should be the same). You should be able to run it manually, see if it returns the right results (perhaps over a particular step, to keep things reasonably fast and manageable). -- Pat On 15/12/2009, at 3:26 AM, Steve H wrote: > Thanks for the lightning-fast reply Pat. :) > > define_index do > indexes :name, :sortable => true > indexes categories_and_tags(:name), :as > => :category_names, :sortable => true > indexes [city.city, city.state_code], :as => :city, :sortable => > true > set_property :group_concat_max_len => 8192 > # attributes > has lat, lon > end > > Every business has at least one category or tag, but I'm kind of > assuming here that the indexer is doing a full left join (ie, not an > inner join). > > Could something in this block be causing the problem? And is 1 doc = > 1 row in the db? > > -Steve > > On Dec 13, 7:03 pm, Pat Allan <[email protected]> wrote: >> Hi Steve >> >> What does your define_index block for Business look like? >> >> -- >> Pat >> >> On 14/12/2009, at 1:44 PM, Steve H wrote: >> >> >> >>> OK first off -- I'm not 100% certain that 1 doc = 1 record. But if it >>> DOES then, sphinx definitely didn't collect enough. We have over 13 >>> million rows in our database, but this was the output after indexing: >> >>> ----------------------------------------------- >>> indexing index 'business_core'... >>> collected 1199 docs, 0.7 MB >>> collected 0 attr values >>> sorted 0.0 Mvalues, 100.0% done >>> sorted 0.1 Mhits, 100.0% done >>> total 1199 docs, 685923 bytes >>> total 1381.345 sec, 496.56 bytes/sec, 0.87 docs/sec >>> distributed index 'business' can not be directly indexed; skipping. >>> Started successfully (pid 32011). >>> ----------------------------------------------- >> >>> After reading some of the docs, I thought... maybe I have to increase >>> the sql_range_step in the sphinx.yml or increase the >>> group_concat_max_len property. So I did those things, but the output >>> after indexing is the same (albeit much much faster :) >> >>> ----------------------------------------------- >>> Stopped search daemon (pid 2810). >>> Generating Configuration to /home/steve/insiderpages/config/ >>> sphinx_dev.sphinx.conf >>> Sphinx 0.9.8.1-release (r1533) >>> Copyright (c) 2001-2008, Andrew Aksyonoff >> >>> using config file '/home/steve/insiderpages/config/ >>> sphinx_dev.sphinx.conf'... >>> indexing index 'business_core'... >>> collected 1199 docs, 0.1 MB >>> collected 0 attr values >>> sorted 0.0 Mvalues, 100.0% done >>> sorted 0.0 Mhits, 100.0% done >>> total 1199 docs, 79605 bytes >>> total 0.447 sec, 177901.38 bytes/sec, 2679.53 docs/sec >>> distributed index 'business' can not be directly indexed; skipping. >>> Started successfully (pid 6365). >>> ----------------------------------------------- >> >>> What's the problem? >> >>> -Steve >> >>> PS: >>> mysql> select count(id) from businesses; >>> +-----------+ >>> | count(id) | >>> +-----------+ >>> | 13395683 | >>> +-----------+ >>> 1 row in set (0.00 sec) >> >>> mysql> select max(id) from businesses; >>> +-------------+ >>> | max(id) | >>> +-------------+ >>> | 15250646126 | >>> +-------------+ >>> 1 row in set (0.00 sec) >> >>> -- >> >>> You received this message because you are subscribed to the Google Groups >>> "Thinking Sphinx" group. >>> To post to this group, send email to [email protected]. >>> To unsubscribe from this group, send email to >>> [email protected]. >>> For more options, visit this group >>> athttp://groups.google.com/group/thinking-sphinx?hl=en. > > -- > > You received this message because you are subscribed to the Google Groups > "Thinking Sphinx" group. > To post to this group, send email to [email protected]. > To unsubscribe from this group, send email to > [email protected]. > For more options, visit this group at > http://groups.google.com/group/thinking-sphinx?hl=en. > > -- You received this message because you are subscribed to the Google Groups "Thinking Sphinx" group. To post to this group, send email to [email protected]. To unsubscribe from this group, send email to [email protected]. For more options, visit this group at http://groups.google.com/group/thinking-sphinx?hl=en.
