I would expect it to work with both has_one and has_many associations. To be 
honest, I expected it to work back at the start of this thread, so I'm not 
really sure what's changed to get things in the right state :)

On 2 Apr 2014, at 12:00 pm, Walter Lee Davis <[email protected]> wrote:

> Okay, well I'm taking it out, then, and let's see how the indexing goes this 
> time.
> 
> ... and, it's working fine. I wonder what I did to make it work again? The 
> only other significant change was to specify has_one instead of has_many on 
> the content association.
> 
> Walter
> 
> On Apr 1, 2014, at 8:54 PM, Pat Allan wrote:
> 
>> Yeah, I really don't think that line is doing anything at all. Also: if the 
>> model where the index is defined is Title, then what you've specified with 
>> that line is "Please include the association 'content' from the model 
>> returned by the association 'title', from the model where this index is 
>> defined. Thus, it's actually looking for an association 'title' in the Title 
>> model.
>> 
>> On 2 Apr 2014, at 11:49 am, Walter Lee Davis <[email protected]> wrote:
>> 
>>> It's a little confusing, I know. The titles table also has a field named 
>>> title in it. Here's a bit of the schema:
>>> 
>>> create_table "contents", :force => true do |t|
>>>  t.integer  "title_id"
>>>  t.text     "plain",       :limit => 2147483647
>>>  t.datetime "created_at"
>>>  t.datetime "updated_at"
>>> end
>>> 
>>> create_table "titles", :force => true do |t|
>>>  t.string   "title"
>>>  t.string   "list_title"
>>>  t.string   "sort_title"
>>>  t.boolean  "publish"
>>>  t.integer  "file_id"
>>>  t.integer  "copyright_id"
>>>  t.text     "teaser"
>>>  t.text     "bibliography"
>>>  t.integer  "publication_date"
>>>  t.integer  "edition_date"
>>>  t.string   "publication_place"
>>>  t.string   "publisher"
>>>  t.integer  "lf_store_id"
>>>  t.string   "title_type"
>>>  t.integer  "set",               :default => 0
>>>  t.datetime "created_at"
>>>  t.datetime "updated_at"
>>>  t.string   "slug"
>>>  t.integer  "role_id"
>>>  t.string   "xml_filename"
>>>  t.string   "author_name"
>>>  t.text     "toc"
>>> end
>>> 
>>> I didn't see any improvement in the indexing behavior until I added that 
>>> join statement, but as I said, I did several things at once, so I may just 
>>> be cargo-culting a little here.
>>> 
>>> Walter
>>> 
>>> On Apr 1, 2014, at 7:31 PM, Pat Allan wrote:
>>> 
>>>> Hi Walter
>>>> 
>>>> Great to know things are working now.
>>>> 
>>>> Just to clarify one thing: `join title.content` is not required - the join 
>>>> method is for specifying associations that are only used via custom SQL 
>>>> snippets (instead of the usual association/column chained methods).
>>>> 
>>>> But given you've specified that - is title an association, or a column? 
>>>> Because your first field indicates it's a column, but then you've used it 
>>>> with the join method as an association.
>>>> 
>>>> -- 
>>>> Pat
>>>> 
>>>> On 2 Apr 2014, at 12:54 am, Walter Lee Davis <[email protected]> wrote:
>>>> 
>>>>> Thanks, you've been great rubber ducks!
>>>>> 
>>>>> Just after I posted this, I tried "one more thing". (Actually, it was two 
>>>>> or more things at once -- bad practice, I know.) As I hinted, I changed 
>>>>> the relationship from has_many to has_one :content, and left the join in 
>>>>> place, and changed the name of the content.plain from :content to 
>>>>> :plain_text.
>>>>> 
>>>>> define_index do
>>>>> set_property :group_concat_max_len => 10.megabytes
>>>>> 
>>>>> indexes :title, :sortable => true
>>>>> indexes teaser
>>>>> join title.content
>>>>> indexes content.plain, :as => :plain_text
>>>>> indexes author_name, :sortable => true
>>>>> has created_at, updated_at
>>>>> where sanitize_sql(["publish", true])
>>>>> end
>>>>> 
>>>>> 
>>>>> Then in my search method, I removed a dangling :include => :contents, 
>>>>> parameter. Somewhere between these changes, the indexing process started 
>>>>> making sense (about 1GB of index, as predicted) and the search started 
>>>>> working -- and it started working faster, too. 
>>>>> 
>>>>> As my old creative director used to say of me, "Even a blind squirrel 
>>>>> finds a nut or two."
>>>>> 
>>>>> Thanks for all you do,
>>>>> 
>>>>> Walter
>>>>> 
>>>>> On Apr 1, 2014, at 9:23 AM, Walter Lee Davis wrote:
>>>>> 
>>>>>> Here's the info from the production.sphinx.conf, including the SQL:
>>>>>> 
>>>>>> source title_core_0
>>>>>> {
>>>>>> type = mysql
>>>>>> sql_host = 10.12.29.11
>>>>>> sql_user = user
>>>>>> sql_pass = pass
>>>>>> sql_db = oll2
>>>>>> sql_query_pre = SET SESSION group_concat_max_len = 10485760
>>>>>> sql_query_pre = SET NAMES utf8
>>>>>> sql_query_pre = SET TIME_ZONE = '+0:00'
>>>>>> sql_query = SELECT SQL_NO_CACHE `titles`.`id` * CAST(4 AS SIGNED) + 3 AS 
>>>>>> `id` , `titles`.`title` AS `title`, `titles`.`teaser` AS `teaser`, 
>>>>>> GROUP_CONCAT(DISTINCT IFNULL(`contents`.`plain`, '0') SEPARATOR ' ') AS 
>>>>>> `content`, `titles`.`author_name` AS `author_name`, `titles`.`id` AS 
>>>>>> `sphinx_internal_id`, 0 AS `sphinx_deleted`, 3942078319 AS `class_crc`, 
>>>>>> IFNULL('Title', '') AS `sphinx_internal_class`, IFNULL(`titles`.`title`, 
>>>>>> '') AS `title_sort`, IFNULL(`titles`.`author_name`, '') AS 
>>>>>> `author_name_sort`, UNIX_TIMESTAMP(`titles`.`created_at`) AS 
>>>>>> `created_at`, UNIX_TIMESTAMP(`titles`.`updated_at`) AS `updated_at` FROM 
>>>>>> `titles` LEFT OUTER JOIN `contents` ON `contents`.`title_id` = 
>>>>>> `titles`.`id` WHERE (`titles`.`id` >= $start AND `titles`.`id` <= $end 
>>>>>> AND publish) GROUP BY `titles`.`id` ORDER BY NULL
>>>>>> sql_query_range = SELECT IFNULL(MIN(`id`), 1), IFNULL(MAX(`id`), 1) FROM 
>>>>>> `titles` 
>>>>>> sql_attr_uint = sphinx_internal_id
>>>>>> sql_attr_uint = sphinx_deleted
>>>>>> sql_attr_uint = class_crc
>>>>>> sql_attr_timestamp = created_at
>>>>>> sql_attr_timestamp = updated_at
>>>>>> sql_attr_string = sphinx_internal_class
>>>>>> sql_attr_string = title_sort
>>>>>> sql_attr_string = author_name_sort
>>>>>> sql_query_info = SELECT * FROM `titles` WHERE `id` = (($id - 3) / 4)
>>>>>> }
>>>>>> 
>>>>>> index title_core
>>>>>> {
>>>>>> source = title_core_0
>>>>>> path = /data/www/db/sphinx/production/title_core
>>>>>> charset_type = utf-8
>>>>>> }
>>>>>> 
>>>>>> index title
>>>>>> {
>>>>>> type = distributed
>>>>>> local = title_core
>>>>>> }
>>>>>> 
>>>>>> This is after I added an explicit line to the define_index statement:
>>>>>> 
>>>>>> join title.contents
>>>>>> 
>>>>>> There was no difference in terms of the index size that I noted earlier, 
>>>>>> and I'm not astute enough to tell you if the join is correct in the 
>>>>>> generated SQL. Here's the entirety of the define_index method:
>>>>>> 
>>>>>> define_index do
>>>>>> set_property :group_concat_max_len => 10.megabytes
>>>>>> 
>>>>>> indexes :title, :sortable => true
>>>>>> indexes teaser
>>>>>> join title.contents
>>>>>> indexes contents.plain, :as => :content
>>>>>> indexes author_name, :sortable => true
>>>>>> has created_at, updated_at
>>>>>> where sanitize_sql(["publish", true])
>>>>>> end
>>>>>> 
>>>>>> Do you see anything odd about this? Should I change the relationship to 
>>>>>> contents from has_many to has_one? (The has_many was a relic from an 
>>>>>> earlier structure where I was splitting the plain text on an arbitrary 
>>>>>> length. This was before I found a way to increase the column width in my 
>>>>>> database to a size guaranteed to hold my largest title.)
>>>>>> 
>>>>>> Thanks again for your tireless support!
>>>>>> 
>>>>>> Walter
>>>>>> 
>>>>>> On Mar 22, 2014, at 7:49 PM, Pat Allan wrote:
>>>>>> 
>>>>>>> Your points on the expected size are spot on... the syntax you're using 
>>>>>>> (association_name.column_name) should ensure the SQL join is made 
>>>>>>> automatically, but you can confirm that by looking at the sql_query 
>>>>>>> value for the title_core index and see whether it's joining on 
>>>>>>> contents, and how the plain column is being used.
>>>>>>> 
>>>>>>> Is there anything odd/wrong in that SQL statement?
>>>>>>> 
>>>>>>> -- 
>>>>>>> Pat
>>>>>>> 
>>>>>>> On 23 Mar 2014, at 8:25 am, Walter Lee Davis <[email protected]> wrote:
>>>>>>> 
>>>>>>>> I've tried both, and when I don't separately index the contents table, 
>>>>>>>> I don't get any hits on a fairly unique string that I know is in the 
>>>>>>>> contents.plain field when I search on titles. The indexing process 
>>>>>>>> makes it clear that this relation is not being touched -- when I index 
>>>>>>>> the contents table, the result is this:
>>>>>>>> 
>>>>>>>> collected 1099 docs, 913.5 MB
>>>>>>>> 
>>>>>>>> When I index the titles (including the related field in contents) I 
>>>>>>>> only get this:
>>>>>>>> 
>>>>>>>> collected 1603 docs, 24.6 MB
>>>>>>>> 
>>>>>>>> contents.plain is the only large part of contents, it's just title_id 
>>>>>>>> and timestamps besides the plain column. If contents.plain was being 
>>>>>>>> accessed as a related column (and thus added to the metadata from the 
>>>>>>>> titles table), I would expect the result from the title indexing 
>>>>>>>> process to be in the 1GB neighborhood. That would make sense to me, 
>>>>>>>> given the amount of data being indexed.
>>>>>>>> 
>>>>>>>> What I have done as a fallback for now is added an index to the 
>>>>>>>> contents table, and rigged up my search results page to display those 
>>>>>>>> results as if they were to the title. But I'd still like to be 
>>>>>>>> searching once across all titles, and finding hits whether they were 
>>>>>>>> against the metadata (in the titles table) or the content (in the 
>>>>>>>> contents table). Is there some join syntax I need to use here? 
>>>>>>>> Remember -- this is Sphinx 2, not 3.
>>>>>>>> 
>>>>>>>> Thanks so much for your excellent support!
>>>>>>>> 
>>>>>>>> Walter
>>>>>>>> 
>>>>>>>> On Mar 21, 2014, at 11:04 PM, Pat Allan wrote:
>>>>>>>> 
>>>>>>>>> * The indexes line you're using will work fine - also, it'll work 
>>>>>>>>> with the arguments being shifted to method calls:
>>>>>>>>> 
>>>>>>>>> indexes contents.plain, :as => :content
>>>>>>>>> 
>>>>>>>>> If there's still issues, do get in touch.
>>>>>>>> 
>>>>>>>> -- 
>>>>>>>> You received this message because you are subscribed to the Google 
>>>>>>>> Groups "Thinking Sphinx" group.
>>>>>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>>>>>> an email to [email protected].
>>>>>>>> To post to this group, send email to [email protected].
>>>>>>>> Visit this group at http://groups.google.com/group/thinking-sphinx.
>>>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>> 
>>>>>>> -- 
>>>>>>> You received this message because you are subscribed to the Google 
>>>>>>> Groups "Thinking Sphinx" group.
>>>>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>>>>> an email to [email protected].
>>>>>>> To post to this group, send email to [email protected].
>>>>>>> Visit this group at http://groups.google.com/group/thinking-sphinx.
>>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>> 
>>>>>> -- 
>>>>>> You received this message because you are subscribed to the Google 
>>>>>> Groups "Thinking Sphinx" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>>>> an email to [email protected].
>>>>>> To post to this group, send email to [email protected].
>>>>>> Visit this group at http://groups.google.com/group/thinking-sphinx.
>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>> 
>>>>> -- 
>>>>> You received this message because you are subscribed to the Google Groups 
>>>>> "Thinking Sphinx" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send an 
>>>>> email to [email protected].
>>>>> To post to this group, send email to [email protected].
>>>>> Visit this group at http://groups.google.com/group/thinking-sphinx.
>>>>> For more options, visit https://groups.google.com/d/optout.
>>>> 
>>>> -- 
>>>> You received this message because you are subscribed to the Google Groups 
>>>> "Thinking Sphinx" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send an 
>>>> email to [email protected].
>>>> To post to this group, send email to [email protected].
>>>> Visit this group at http://groups.google.com/group/thinking-sphinx.
>>>> For more options, visit https://groups.google.com/d/optout.
>>> 
>>> -- 
>>> You received this message because you are subscribed to the Google Groups 
>>> "Thinking Sphinx" group.
>>> To unsubscribe from this group and stop receiving emails from it, send an 
>>> email to [email protected].
>>> To post to this group, send email to [email protected].
>>> Visit this group at http://groups.google.com/group/thinking-sphinx.
>>> For more options, visit https://groups.google.com/d/optout.
>> 
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "Thinking Sphinx" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected].
>> To post to this group, send email to [email protected].
>> Visit this group at http://groups.google.com/group/thinking-sphinx.
>> For more options, visit https://groups.google.com/d/optout.
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "Thinking Sphinx" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/thinking-sphinx.
> For more options, visit https://groups.google.com/d/optout.

-- 
You received this message because you are subscribed to the Google Groups 
"Thinking Sphinx" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/thinking-sphinx.
For more options, visit https://groups.google.com/d/optout.

Reply via email to