Hi Simon

Beyond getting hold of the data and app and running the indexing myself locally, I don't have any more suggestions on this, I'm afraid. Sorry. It could be that the indexing time is reasonable; I wish I had something on hand to compare it with.
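For anyone wanting to line their own numbers up against this run, here is a rough Ruby sketch that recomputes the rates from the `rake ts:index` output quoted in this thread. The figures are copied from that output; the `index_rates` helper is a made-up name for illustration and is not part of Thinking Sphinx or Sphinx.

```ruby
# Figures copied from the `rake ts:index` output quoted in this thread.
CORE_BYTES    = 1_783_446_653
CORE_DOCS     = 841_492
CORE_SECONDS  = 1343.154
DELTA_SECONDS = 250.385 # the entry_delta pass, which collected 0 docs

# Hypothetical helper (not a Thinking Sphinx API): derive simple
# throughput figures from one indexer run.
def index_rates(bytes, docs, seconds)
  {
    bytes_per_sec: bytes / seconds,
    docs_per_sec:  docs / seconds,
    avg_doc_bytes: bytes.to_f / docs
  }
end

rates = index_rates(CORE_BYTES, CORE_DOCS, CORE_SECONDS)
total_minutes = (CORE_SECONDS + DELTA_SECONDS) / 60.0

puts format("%.2f docs/sec, %.2f MB/sec",
            rates[:docs_per_sec], rates[:bytes_per_sec] / 1_000_000)
puts format("average document size: %.0f bytes", rates[:avg_doc_bytes])
puts format("core + delta wall time: %.1f minutes", total_minutes)
```

The combined wall time lands inside the 25-30 minutes Simon reported, and a noticeable slice of it is the delta pass, even though that pass collected no documents.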
--
Pat

On 27/12/2010, at 10:01 AM, Simon wrote:

> Yeah, I think so. It has over 1.5 GBs and I have the mem_limit set to
> 256M.
>
> Simon
>
> On Dec 23, 3:36 am, Pat Allan <[email protected]> wrote:
>> Does the VM have enough RAM? I'm running out of suggestions for the cause to
>> be honest. And I don't have any similar sized datasets on hand to compare
>> against.
>>
>> --
>> Pat
>>
>> On 21/12/2010, at 3:56 AM, Simon wrote:
>>
>>> I'm running it on a Ubuntu VM. The speed does improve a bit when I
>>> remove body, but the number of Mhits that get sorted when I do drops
>>> from just over 200 to 1.7.
>>
>>> On Dec 20, 1:37 am, Pat Allan <[email protected]> wrote:
>>>> Ah, I just wanted to confirm whether there were joins or not. If not, then
>>>> this definitely feels too slow. What machine are you running this on? And
>>>> does the speed improve if you remove body from the index definition?
>>
>>>> --
>>>> Pat
>>
>>>> On 20/12/2010, at 12:38 AM, Simon wrote:
>>
>>>>> Hi,
>>
>>>>> Thanks for the reply. Not sure what you mean about the columns.
>>>>> They are columns containing ids for other tables, that I am using to
>>>>> limit my actual search queries.
>>
>>>>> Changing the sql_range_step has not seemed to make any noticeable
>>>>> difference in the amount of time it takes. I have tried at the
>>>>> default value, in 100,000 blocks, and with the huge value to try and
>>>>> get all the values at once. I thought it seemed pretty slow too,
>>>>> considering there are no joins or anything like that happening.
>>
>>>>> Thanks again,
>>
>>>>> Simon
>>
>>>>> On Dec 19, 2:33 am, Pat Allan <[email protected]> wrote:
>>>>>> Hi Simon
>>
>>>>>> Is x_id and y_id the actual columns you're referencing? If not, can you
>>>>>> provide exactly what your define_index block looks like? It will give me
>>>>>> a better picture of whether your indexing is slow or not.
>>
>>>>>> Has changing the sql_range_step value made any difference? What happens
>>>>>> if you put it back to the default of 1000? 30 minutes for 800,000 values
>>>>>> does sound slow, for what appears to be quite a simple index definition.
>>
>>>>>> Cheers
>>
>>>>>> --
>>>>>> Pat
>>
>>>>>> On 18/12/2010, at 6:50 AM, Simon wrote:
>>
>>>>>>> Hi there,
>>
>>>>>>> I have a table I'm indexing that has roughly 800,000 rows. From
>>>>>>> reading around online and in this group I feel like it's taking a long
>>>>>>> time for my index to get generated.
>>
>>>>>>> I have the following in my model:
>>>>>>> define_index do
>>>>>>>   indexes title, body
>>>>>>>   has x_id
>>>>>>>   has locked
>>>>>>>   has created_at, y_id
>>
>>>>>>>   set_property :delta => true
>>>>>>> end
>>
>>>>>>> In my sphinx.yml file, I have the following:
>>>>>>>   max_matches: 1000
>>>>>>>   html_strip: 1
>>>>>>>   sql_range_step: 10000000
>>>>>>>   min_word_len: 3
>>>>>>>   mem_limit: 256M
>>
>>>>>>> And here is sample output from running rake ts:index:
>>>>>>>   indexing index 'entry_core'...
>>>>>>>   collected 841492 docs, 1783.4 MB
>>>>>>>   collected 0 attr values
>>>>>>>   sorted 0.8 Mvalues, 100.0% done
>>>>>>>   sorted 205.5 Mhits, 100.0% done
>>>>>>>   total 841492 docs, 1783446653 bytes
>>>>>>>   total 1343.154 sec, 1327804.99 bytes/sec, 626.50 docs/sec
>>>>>>>   indexing index 'entry_delta'...
>>>>>>>   collected 0 docs, 0.0 MB
>>>>>>>   collected 0 attr values
>>>>>>>   sorted 0.0 Mvalues, nan% done
>>>>>>>   total 0 docs, 0 bytes
>>>>>>>   total 250.385 sec, 0.00 bytes/sec, 0.00 docs/sec
>>>>>>>   distributed index 'entry' can not be directly indexed; skipping.
>>
>>>>>>> It's taking around 25-30 minutes to run (without having any delta
>>>>>>> indexes). That seems like quite a while compared to what I've seen as
>>>>>>> sample times from other people. Does anybody have any suggestions for
>>>>>>> what I could do to improve the performance, or any comments on the
>>>>>>> speed of the indexing compared to what they have seen?
>>
>>>>>>> Thanks,
>>
>>>>>>> Simon
>>
>>>>>>> --
>>>>>>> You received this message because you are subscribed to the Google
>>>>>>> Groups "Thinking Sphinx" group.
>>>>>>> To post to this group, send email to [email protected].
>>>>>>> To unsubscribe from this group, send email to
>>>>>>> [email protected].
>>>>>>> For more options, visit this group
>>>>>>> at http://groups.google.com/group/thinking-sphinx?hl=en.
