Sorry, it's the same index; I was just simplifying the names for the purpose of this post and missed one. Sorry for the confusion :)
If the change was made prior to 2.0.11, wouldn't that mean that the indexes previously would have been huge too? I'm not sure I understand what you mean about sql_field_string - do sql_field_strings take up significantly more space than sql_attr_strings do?

On Saturday, July 20, 2013 11:18:40 PM UTC-4, Pat Allan wrote:
>
> Further to this: I guess I was wrong about 2.0.11 using ordinal attribute types instead of string attribute types - that change must have come in earlier.
>
> sql_attr_string is a standard string attribute (not ordinal), and sql_field_string stores the field value as a string attribute of the same name *as well as* the field. The latter removes the need for the _sort suffix you'll spot in sortable attributes in 2.x releases.
>
> I wouldn't expect there to be any difference between these two in terms of file size though. But just to compare apples with apples - you had user_core file sizes previously, but now it's candidate_user_core. Are there other large and unnecessary string attributes in the CandidateUser index?
>
> --
> Pat
>
> On 19/07/2013, at 4:25 AM, Daniel Vandersluis wrote:
>
> > Yeah, I was using 2.0.11 previously. There does not seem to be any difference with removing sortable: true from the index definition (for resumes.document), except that this line disappears from the generated configuration file: sql_field_string = document. This seems to at least let indexer complete properly, but the index size is still huge:
> >
> > indexing index 'candidate_user_core'...
> > collected 199704 docs, 8478.8 MB
> >
> > It also takes a long time to go through the sorting "Mhits" step now. I see how TS2 added sql_attr_string for the sort columns whereas TS3 adds sql_field_string - that's what you're talking about right? Is there any way to either a) get around this issue, or b) force TS to use the ordinal type? (everything should still work that way, correct?)
> >
> > Here's the options I set in thinking_sphinx.yml:
> >
> > development:
> >   address: localhost
> >   version: 2.0.8-release
> >   mem_limit: 256M
> >
> >   enable_star: true
> >   min_prefix_len: 2
> >   blend_chars: "@, -, &"
> >   html_strip: true
> >   max_matches: 25000
> >
> > Is there any way I can speed this up / reduce the size?
> >
> > On Thursday, July 18, 2013 1:59:11 PM UTC-4, Pat Allan wrote:
> > I think with 2.0.11 (what you were using previously, right?) TS uses the ordinal attribute type, which stores an integer for each string (calculated by grabbing all known values, putting them in order, returning the index of each value).
> >
> > With TS v3 (and later 2.x releases if I remember correctly) it'll use the native string attribute type (a relatively recent addition to Sphinx), which means Sphinx is storing the real string value - which is much better if you're sorting across more than one index (say, if you're using deltas, or searching across multiple models). In this case, it would mean Sphinx is now storing potentially a ton of data, instead of a 32-bit integer per record.
> >
> > --
> > Pat
> >
> > On 19/07/2013, at 3:53 AM, Daniel Vandersluis wrote:
> >
> > > Thanks for the response, Pat - yes, it's the same index as the other thread. Good point about sorting resumes, that shouldn't be there. However, why would that make such a difference between TS2 and TS3 (see my other post which I added at the same time as your response)?
> > >
> > > I will try removing the sortable on resumes and see what difference it makes!
> > >
> > > On Thursday, July 18, 2013 1:49:13 PM UTC-4, Pat Allan wrote:
> > > Hi Daniel
> > >
> > > If this is the same index as in the other thread, I'm guessing it's the fact that you've got resumes.document sortable. A record with many resumes and/or large document values could end up with massive values for the underlying string attribute (that you'd sort by) - are you actually sorting by this? Generally I'd be surprised if there's much point sorting by large amounts of text.
> > >
> > > --
> > > Pat
> > >
> > > On 19/07/2013, at 3:09 AM, Daniel Vandersluis wrote:
> > >
> > > > Is there any reason that an index would grow in size when upgrading from thinkingsphinx 2 to 3? The only differences in the configuration file is changing port to mysql41, and changing version to 2.0.8-release, but an index that used to be around 500MB is now resulting in this error:
> > > >
> > > > ERROR: index 'user_core': too many string attributes (current index format allows up to 4 GB).
> > > >
> > > > Anyone have any idea why this would be?
> > > >
> > > > --
> > > > You received this message because you are subscribed to the Google Groups "Thinking Sphinx" group.
> > > > To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
> > > > To post to this group, send email to [email protected].
> > > > Visit this group at http://groups.google.com/group/thinking-sphinx.
> > > > For more options, visit https://groups.google.com/groups/opt_out.
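
For anyone following along, here is a rough Python sketch of the difference Pat describes in the thread between the ordinal attribute type and the native string attribute type. It is only an illustration, not Sphinx's or Thinking Sphinx's actual code: the function names `ordinals` and `storage_estimate` are made up for this post, the 4 bytes per record comes from Pat's "32-bit integer per record" remark, and the string estimate ignores whatever overhead Sphinx adds on top of the raw bytes.

```python
def ordinals(values):
    """Ordinal scheme as described in the thread: collect all known values,
    put them in order, and store each value's position in that ordering."""
    rank = {value: position for position, value in enumerate(sorted(set(values)))}
    return [rank[value] for value in values]


def storage_estimate(values):
    """Return (ordinal_bytes, string_bytes) for a list of attribute values.

    Assumption: an ordinal costs 4 bytes per record, while a string
    attribute costs at least the UTF-8 length of the stored text.
    """
    ordinal_bytes = 4 * len(values)
    string_bytes = sum(len(value.encode("utf-8")) for value in values)
    return ordinal_bytes, string_bytes


# Ordinals are computed per index, so they are not comparable across a
# core index and a delta index: "bravo" and "alpha" both get ordinal 0.
core_ordinals = ordinals(["delta", "alpha", "charlie"])
delta_ordinals = ordinals(["bravo"])

# Two hypothetical resume-sized documents: a few bytes as ordinals,
# the full text size as string attributes.
ordinal_bytes, string_bytes = storage_estimate(["a" * 40_000, "b" * 45_000])
```

This is why a sortable resumes.document can be cheap with ordinals yet very large with string attributes, and also why ordinals are unreliable when sorting across more than one index.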
