Hi Simon

Beyond getting hold of the data and app and running the indexing myself 
locally, I don't have any more suggestions on this I'm afraid - sorry. It could 
be that the indexing time is reasonable - I wish I had something on hand to 
compare it with.
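
As a rough sanity check in the meantime, the small Ruby sketch below recomputes the throughput from the indexer output quoted further down. The figures are copied from that output; the variable names are made up for illustration, and nothing here is part of Sphinx or Thinking Sphinx.

```ruby
# Figures copied from the quoted `rake ts:index` output for 'entry_core'.
core_docs    = 841_492
core_bytes   = 1_783_446_653
core_seconds = 1343.154

# The 'entry_delta' pass collected 0 docs but still took this long.
delta_seconds = 250.385

# Throughput of the core index, as the indexer reports it.
bytes_per_sec = core_bytes / core_seconds
docs_per_sec  = core_docs / core_seconds

# Wall time across both passes, and the fraction spent on the empty delta.
total_seconds = core_seconds + delta_seconds
total_minutes = total_seconds / 60.0
delta_share   = delta_seconds / total_seconds

puts format("core throughput: %.2f MB/s, %.1f docs/s",
            bytes_per_sec / 1_000_000, docs_per_sec)
puts format("total wall time: %.1f minutes", total_minutes)
puts format("time spent on the empty delta index: %.1f%%", delta_share * 100)
```

On those numbers the two passes add up to roughly 26.6 minutes, which matches the 25-30 minutes reported, and about a sixth of that goes to the delta index even though it collected no documents.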

-- 
Pat

On 27/12/2010, at 10:01 AM, Simon wrote:

> Yeah, I think so.  It has over 1.5 GB of RAM, and I have mem_limit set to
> 256M.
> 
> Simon
> 
> On Dec 23, 3:36 am, Pat Allan <[email protected]> wrote:
>> Does the VM have enough RAM? I'm running out of suggestions for the cause to 
>> be honest. And I don't have any similar sized datasets on hand to compare 
>> against.
>> 
>> --
>> Pat
>> 
>> On 21/12/2010, at 3:56 AM, Simon wrote:
>> 
>>> I'm running it on an Ubuntu VM.  The speed does improve a bit when I
>>> remove body, but the number of Mhits that get sorted when I do drops
>>> from just over 200 to 1.7.
>> 
>>> On Dec 20, 1:37 am, Pat Allan <[email protected]> wrote:
>>>> Ah, I just wanted to confirm whether there were joins or not. If not, then 
>>>> this definitely feels too slow. What machine are you running this on? And 
>>>> does the speed improve if you remove body from the index definition?
>> 
>>>> --
>>>> Pat
>> 
>>>> On 20/12/2010, at 12:38 AM, Simon wrote:
>> 
>>>>> Hi,
>> 
>>>>> Thanks for the reply.  Not sure what you mean about the columns.
>>>>> They are columns containing ids for other tables, that I am using to
>>>>> limit my actual search queries.
>> 
>>>>> Changing the sql_range_step has not seemed to make any noticeable
>>>>> difference in the amount of time it takes.  I have tried at the
>>>>> default value, in 100,000 blocks, and with the huge value to try and
>>>>> get all the values at once.  I thought it seemed pretty slow too,
>>>>> considering there are no joins or anything like that happening.
>> 
>>>>> Thanks again,
>> 
>>>>> Simon
>> 
>>>>> On Dec 19, 2:33 am, Pat Allan <[email protected]> wrote:
>>>>>> Hi Simon
>> 
>>>>>> Are x_id and y_id the actual columns you're referencing? If not, can you 
>>>>>> provide exactly what your define_index block looks like? It will give me 
>>>>>> a better picture of whether your indexing is slow or not.
>> 
>>>>>> Has changing the sql_range_step value made any difference? What happens 
>>>>>> if you put it back to the default of 1000? 30 minutes for 800,000 values 
>>>>>> does sound slow, for what appears to be quite a simple index definition.
>> 
>>>>>> Cheers
>> 
>>>>>> --
>>>>>> Pat
>> 
>>>>>> On 18/12/2010, at 6:50 AM, Simon wrote:
>> 
>>>>>>> Hi there,
>> 
>>>>>>> I have a table I'm indexing that has roughly 800,000 rows.  From
>>>>>>> reading around online and in this group I feel like it's taking a long
>>>>>>> time for my index to get generated.
>> 
>>>>>>> I have the following in my model:
>>>>>>> define_index do
>>>>>>>   indexes title, body
>>>>>>>   has x_id
>>>>>>>   has locked
>>>>>>>   has created_at, y_id
>> 
>>>>>>>   set_property :delta => true
>>>>>>> end
>> 
>>>>>>> In my sphinx.yml file, I have the following:
>>>>>>>  max_matches: 1000
>>>>>>>  html_strip: 1
>>>>>>>  sql_range_step: 10000000
>>>>>>>  min_word_len: 3
>>>>>>>  mem_limit: 256M
>> 
>>>>>>> And here is sample output from running rake ts:index:
>>>>>>> indexing index 'entry_core'...
>>>>>>> collected 841492 docs, 1783.4 MB
>>>>>>> collected 0 attr values
>>>>>>> sorted 0.8 Mvalues, 100.0% done
>>>>>>> sorted 205.5 Mhits, 100.0% done
>>>>>>> total 841492 docs, 1783446653 bytes
>>>>>>> total 1343.154 sec, 1327804.99 bytes/sec, 626.50 docs/sec
>>>>>>> indexing index 'entry_delta'...
>>>>>>> collected 0 docs, 0.0 MB
>>>>>>> collected 0 attr values
>>>>>>> sorted 0.0 Mvalues, nan% done
>>>>>>> total 0 docs, 0 bytes
>>>>>>> total 250.385 sec, 0.00 bytes/sec, 0.00 docs/sec
>>>>>>> distributed index 'entry' can not be directly indexed; skipping.
>> 
>>>>>>> It's taking around 25-30 minutes to run (without having any delta
>>>>>>> indexes).  That seems like quite a while compared to what I've seen as
>>>>>>> sample times from other people.  Does anybody have any suggestions for
>>>>>>> what I could do to improve the performance, or any comments on the
>>>>>>> speed of the indexing compared to what they have seen?
>> 
>>>>>>> Thanks,
>> 
>>>>>>> Simon
>> 
>>>>>>> --
>>>>>>> You received this message because you are subscribed to the Google 
>>>>>>> Groups "Thinking Sphinx" group.
>>>>>>> To post to this group, send email to [email protected].
>>>>>>> To unsubscribe from this group, send email to 
>>>>>>> [email protected].
>>>>>>> For more options, visit this group at 
>>>>>>> http://groups.google.com/group/thinking-sphinx?hl=en.

