Hello again!
I am using Thinking Sphinx (TS) and Sphinx to index over 4 million
records in a table. While playing with search, I realised that I cannot
retrieve records that should be retrieved.
My model:
class Product < ActiveRecord::Base
  # Default index: full-text field on tl
  define_index do
    indexes tl
  end

  # Second index, named 'product_genre'
  define_index 'product_genre' do
    indexes g1, g2, g3, g4, g5
    indexes has_image, :sortable => true
    set_property :field_weights => {
      :g1 => 20,
      :g2 => 16,
      :g3 => 12,
      :g4 => 8,
      :g5 => 4
    }
    # CRC32 hashes of string columns, stored as integer attributes
    has "CRC32(tl)", :as => :num_tl, :type => :integer
    has "CRC32(isbn13)", :as => :num_isbn13, :type => :integer
    has pdate
  end
end
The output from rake ts:rebuild:
indexing index 'product_core'...
collected 4226866 docs, 116.9 MB
collected 4226863 attr values
sorted 8.5 Mvalues, 100.0% done
sorted 18.3 Mhits, 100.0% done
total 4226866 docs, 116938081 bytes
total 89.182 sec, 1311229 bytes/sec, 47395.93 docs/sec
distributed index 'product' can not be directly indexed; skipping.
indexing index 'product_genre_core'...
collected 4226866 docs, 21.6 MB
collected 4226073 attr values
sorted 8.5 Mvalues, 100.0% done
sorted 8.3 Mhits, 100.0% done
total 4226866 docs, 21638653 bytes
total 105.600 sec, 204909 bytes/sec, 40026.82 docs/sec
distributed index 'product_genre' can not be directly indexed; skipping.
total 160 reads, 0.508 sec, 3499.3 kb/call avg, 3.1 msec/call avg
total 1157 writes, 0.862 sec, 948.5 kb/call avg, 0.7 msec/call avg
Started successfully (pid 19233).
It reports that it has collected all of the records in the table,
4226866 (though I am not sure why the attr values count for
product_core is 3 lower).
I want to find a product whose tl field contains "To kill a mocking
bird".
If I go into PHPMyAdmin and do:
SELECT *
FROM `products`
WHERE `tl` LIKE '%kill a mocking%'
I get 24 records back.
If I go into script/console and do:
Product.search("kill a mocking", :index => "product_core").length
I get 3 documents, none of which is the document I'm actually looking
for!
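One thing that may be worth ruling out is the difference between substring matching and whole-word matching. The sketch below is plain Ruby and does not call Sphinx or MySQL. The sample titles and both helper names (like_match, whole_word_match) are hypothetical, and it assumes Sphinx matches whole words only, with no prefix or infix indexing enabled, which has not been confirmed for this setup.

```ruby
# Hypothetical sample titles; NOT taken from the real products table.
TITLES = [
  "To Kill a Mockingbird",
  "To Kill a Mocking Bird",
  "How to Kill a Mocking Rumour"
].freeze

# Roughly mimics: WHERE tl LIKE '%kill a mocking%'
# (substring match, case-insensitive as with a typical MySQL collation).
def like_match(titles, fragment)
  titles.select { |t| t.downcase.include?(fragment.downcase) }
end

# Roughly mimics whole-word matching (an assumption about Sphinx defaults):
# every query word must appear in the title as a complete word.
def whole_word_match(titles, query)
  words = query.downcase.split
  titles.select do |t|
    title_words = t.downcase.scan(/\w+/)
    words.all? { |w| title_words.include?(w) }
  end
end

puts like_match(TITLES, "kill a mocking").length
puts whole_word_match(TITLES, "kill a mocking").length
```

Under that assumption, a title spelled "Mockingbird" as one word is found by the LIKE query but not by a whole-word search for "mocking", so the two counts would not be expected to agree.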
I have tried other search queries and get similar results. Why would
this be happening? I have deleted the searchd folder, reconfigured and
rebuilt numerous times, but to no avail: same results every time.
Any ideas? This is obviously a pretty major problem. Could it be
because the database is too big, even though Sphinx claims to handle
much larger databases?
I have already increased mem_limit to 256 as I was running out of
memory during indexing. I get no errors anywhere and everything seems
to be working fine. It's just not 'working'!
Thanks in advance.
Shaun