Hi

Just to simplify everything, I stripped the model back so that we only
have one index:

define_index do
    indexes tl
end

I have redeployed, reconfigured, rebuilt and restarted with no luck.

Does anyone have any clues as to where I should even start looking to
find the problem?

Just to clarify - I have 24 records in the database containing the
string "kill a mocking" in the field named 'tl'. A SQL query returns
all 24 when I use select with 'tl LIKE %kill a mocking%'.

Doing a search through Thinking Sphinx and Sphinx returns just 3
records. Upon indexing I am told that all of the records in the table
have been collected, i.e. the same number of records are collected as
exist in the table.
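
One possible explanation, offered as an assumption rather than a diagnosis: a SQL LIKE with leading and trailing % matches substrings, so it also finds "mocking" inside "mockingbird", whereas Sphinx by default matches whole indexed words. The plain-Ruby sketch below imitates both behaviours without touching Sphinx or the database. The sample titles and both helper methods are made up for illustration and are not taken from the real data or from the Thinking Sphinx API.

```ruby
# Hypothetical sample titles, invented for this sketch.
TITLES = [
  "To Kill a Mockingbird",
  "How to Kill a Mocking Bird",
  "Kill a Mocking Critic"
].freeze

# Rough stand-in for SQL `tl LIKE '%kill a mocking%'`:
# a case-insensitive substring test.
def like_match?(title, phrase)
  title.downcase.include?(phrase.downcase)
end

# Rough stand-in for whole-word matching: every query word must
# appear as a complete token in the title.
def word_match?(title, query)
  tokens = title.downcase.scan(/[a-z0-9]+/)
  query.downcase.scan(/[a-z0-9]+/).all? { |word| tokens.include?(word) }
end

like_hits = TITLES.select { |t| like_match?(t, "kill a mocking") }
word_hits = TITLES.select { |t| word_match?(t, "kill a mocking") }

puts "LIKE-style hits: #{like_hits.length}"
puts "Whole-word hits: #{word_hits.length}"
```

In this sketch the substring test accepts "To Kill a Mockingbird" while the whole-word test rejects it, because "mocking" is not a token of that title. If that is what is happening here, searching for "mockingbird" should find the missing records, and Sphinx's min_prefix_len / min_infix_len settings would be the place to look for partial-word matching.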

Where am I going wrong?!

Thanks again.

Shaun


On Jan 21, 10:37 am, Shaun <[email protected]> wrote:
> Hello again!
>
> I am using TS and Sphinx to index over 4 million records in a table. I
> have just been playing with search and realised that I cannot retrieve
> records that should be being retrieved:
>
> My model:
>
> class Product < ActiveRecord::Base
>
> define_index do
>     indexes tl
> end
>
> define_index 'product_genre' do
>     indexes g1, g2, g3, g4, g5
>
>     indexes has_image, :sortable => true
>
>     set_property :field_weights => {
>       :g1 => 20,
>       :g2 => 16,
>       :g3 => 12,
>       :g4 => 8,
>       :g5 => 4
>     }
>
>     has "CRC32(tl)", :as => :num_tl, :type => :integer
>     has "CRC32(isbn13)", :as => :num_isbn13, :type => :integer
>
>     has pdate
>
>   end
>
> end
>
> The output from rake ts:rebuild
>
> indexing index 'product_core'...
> collected 4226866 docs, 116.9 MB
> collected 4226863 attr values
> sorted 8.5 Mvalues, 100.0% done
> sorted 18.3 Mhits, 100.0% done
> total 4226866 docs, 116938081 bytes
> total 89.182 sec, 1311229 bytes/sec, 47395.93 docs/sec
> distributed index 'product' can not be directly indexed; skipping.
> indexing index 'product_genre_core'...
> collected 4226866 docs, 21.6 MB
> collected 4226073 attr values
> sorted 8.5 Mvalues, 100.0% done
> sorted 8.3 Mhits, 100.0% done
> total 4226866 docs, 21638653 bytes
> total 105.600 sec, 204909 bytes/sec, 40026.82 docs/sec
> distributed index 'product_genre' can not be directly indexed;
> skipping.
> total 160 reads, 0.508 sec, 3499.3 kb/call avg, 3.1 msec/call avg
> total 1157 writes, 0.862 sec, 948.5 kb/call avg, 0.7 msec/call avg
> Started successfully (pid 19233).
>
> It reports that it has collected all of the records in the table -
> 4226866 (not sure why attr values is 3 fewer?!)
>
> I want to find a product whose tl field contains "To kill a mocking
> bird"
>
> If I go into PHPMyAdmin and do:
>
> SELECT *
> FROM  `products`
> WHERE  `tl` LIKE  '%kill a mocking%'
>
> I get 24 records back.
>
> If I go into script/console and do:
>
> Product.search("kill a mocking", :index => "product_core").length
>
> I get 3 documents, none of which is the actual document I'm
> looking for!
>
> I have tried other search queries and get similar results. Why would
> this be happening? I have deleted the searchd folder, reconfigured and
> rebuilt numerous times but to no avail. Same results every time.
>
> Any ideas? This is obviously a pretty major problem. Could it be
> because the database is too big, even though Sphinx claims to
> handle much larger databases?
>
> I have already increased mem_limit to 256 as I was running out of
> memory during indexing. I get no errors anywhere and everything seems
> to be working fine. It's just not 'working'!
>
> Thanks in advance.
>
> Shaun