I was using the wrong form...I changed it to html_strip and rebuilt ts...I
now see my production.sphinx.conf file has:
index topic_core
{
source = topic_core_0
path = /home/bsturim/openmind/db/sphinx/production/topic_core
morphology = stem_en
charset_type = utf-8
min_infix_len = 1
enable_star = 1
html_strip = 1
}
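Since the html_strip setting is meant to come from sphinx.yml, one way to rule out the strip_html/html_strip mix-up is to check the YAML directly. This is only a sketch: the YAML text below is a hypothetical copy of the production settings from the original post with html_strip added, and sphinx_settings is a helper name made up for this example.

```ruby
require 'yaml'

# Hypothetical sphinx.yml content: the production settings quoted later in
# this thread, plus html_strip. Substitute the contents of the real file.
SPHINX_YML = <<~YAML
  production:
    enable_star: 1
    min_infix_len: 1
    max_matches: 5000
    morphology: stem_en
    html_strip: 1
YAML

# Returns the settings hash for one environment (empty hash if it is missing).
def sphinx_settings(yaml_text, env)
  YAML.safe_load(yaml_text).fetch(env, {})
end

settings = sphinx_settings(SPHINX_YML, 'production')
puts "html_strip: #{settings['html_strip'].inspect}"
puts "strip_html present? #{settings.key?('strip_html')}"
```

If strip_html shows up as a key, the wrong spelling is still in the file.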
That said, it's still not finding the row I would expect it to find. Here's
the body of one of the topic_comment records I would expect it to return a
hit on but does not:
<p>Please verify the following:</p>
<ol>
<li>All Codeunit edits are done</li>
<li>The Fin.flf license file is copied into the folder for the
appropriate *version*.<br />
For example for NAV 5 SP 1 copy into <span style="font-family: Courier
New">C:\Program Files\Scribe\Nav\CFrontHost501</span></li>
<li>Check the Windows Application Event log to verify that your NAS is
starting with the Scribe Startup Parameter</li>
<li><span>Using Windows Explorer, c</span>ompare the Cfront file
versions in your NAV system with the correct <span style="font-family:
Courier New">C:\Program Files\Scribe\Nav\CFrontHost </span>folder
<ol type="a">
<li>In the folder, change the view to the Details view
then select View – Choose Details and select the File
Version check box. This will show the file versions for all files
that have versions. For Example my <span style="font-family: Courier
New">C:\Program Files\Scribe\Nav\CFrontHost501\CFRONT.DLL </span>has a file
*version *of <span style="font-family: Courier New">5.0.26084.0</span></li>
<li>If the complete build version does not match copy the following
files off your NAV installation CD to Scribe NAV CFrontHost folder<br />
<br />
<span style="font-family: Courier New">CFRONT.DLL</span><br />
<span style="font-family: Courier New">cfrontsql.dll</span><br />
<span style="font-family: Courier New">Dbm.dll</span><br />
<span style="font-family: Courier New">Nc_netb.dll</span><br />
<span style="font-family: Courier New">Nc_tcp.dll</span><br />
<span style="font-family: Courier New">nc_tcps.dll</span><br />
<span style="font-family: Courier New">ndbcs.dll</span><br />
<span style="font-family: Courier New">SLAVE.exe<br />
</span></li>
<li>Put the <span style="font-family: Courier
New">Microsoft.Navision.CFront.CFrontDotNet.dll </span>assembly into Global
Assembly Cache<br />
</li>
</ol>
</li>
<li>Stop all the Scibe Services</li>
<li>Stop the NAS</li>
<li>Verify the NAS started using Scribe startup paremeter by looking in
Windows Application Event Log.</li>
<li>Restart the Scribe Services</li>
</ol>
I've bolded a couple of the instances of the word "version".
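To show roughly what html_strip should leave behind for the indexer, here is a plain-Ruby approximation. It is a sketch under stated assumptions, not Sphinx's actual stripper or tokenizer: strip_tags and tokens are made-up helpers, and BODY is a shortened fragment of the record above with the bold markers removed.

```ruby
# A shortened fragment of the comment body quoted above (bold markers removed).
BODY = '<li>The Fin.flf license file is copied into the folder for the ' \
       'appropriate version.<br />For example for NAV 5 SP 1 copy into ' \
       '<span style="font-family: Courier New">C:\Program Files\Scribe\Nav\CFrontHost501</span></li>'

# Crude stand-in for HTML stripping: replace every tag with a space.
# This is NOT how Sphinx implements html_strip, only an approximation.
def strip_tags(html)
  html.gsub(/<[^>]*>/, ' ')
end

# Lowercase the text and split it on anything that is not a letter or digit.
def tokens(text)
  text.downcase.scan(/[a-z0-9]+/)
end

raw      = tokens(BODY)
stripped = tokens(strip_tags(BODY))

puts "markup words without stripping: #{(raw - stripped).uniq.join(', ')}"
puts "'version' is a standalone word after stripping: #{stripped.include?('version')}"
```

In this approximation "version" survives as a whole word either way, so a plain search for it should match this record once the index has been rebuilt.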
In the console, if I type:
TopicComment.search('*version*')
or
TopicComment.search('version')
I don't get that hit. And yes, I do know that both forms of the invocation
should be identical since I have set enable_star.
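For reference, the :star => true option Pat mentioned earlier in the thread is described as treating 'value' as '*value*'. A simplified sketch of that rewrite follows. The add_stars helper is hypothetical and is not Thinking Sphinx's own code, which would also need to cope with operators and other query syntax.

```ruby
# Simplified auto-star: wrap every word in asterisks, leaving words that are
# already starred unchanged. Illustration only, not Thinking Sphinx's code.
def add_stars(query)
  query.gsub(/\*?(\w+)\*?/) { "*#{$1}*" }
end

puts add_stars('version')
puts add_stars('*version*')
puts add_stars('file version')
```

Under this sketch the starred and unstarred forms of a single word end up as the same query string.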
Thanks again.
Bob
On Wed, Oct 6, 2010 at 11:02 PM, Pat Allan <[email protected]> wrote:
> Did you use strip_html or html_strip? It needs to be the latter - can't
> spot it in the conf file.
>
> --
> Pat
>
> On 07/10/2010, at 12:25 PM, Robert Sturim wrote:
>
> > I added the strip_html but so far it doesn't look like it's helped.
> >
> > My production.sphinx.conf looks as follows:
> >
> > source topic_comment_core_0
> > {
> > type = mysql
> > sql_host = localhost
> > sql_user = openmind
> > sql_pass = xxx
> > sql_db = openmind
> > sql_query_pre = UPDATE `comments` SET `delta` = 0 WHERE `delta` = 1
> > sql_query_pre = SET NAMES utf8
> > sql_query_pre = SET TIME_ZONE = '+0:00'
> > sql_query = SELECT SQL_NO_CACHE `comments`.`id` * 6 + 4 AS `id` ,
> `comments`.`body` AS `body`, `comments`.`id` AS `sphinx_internal_id`,
> CAST(IFNULL(CRC32(NULLIF(`comments`.`type`,'')), 432825427) AS UNSIGNED) AS
> `class_crc`, 0 AS `sphinx_deleted`, UNIX_TIMESTAMP(`comments`.`created_at`)
> AS `created_at`, UNIX_TIMESTAMP(`comments`.`updated_at`) AS `updated_at`
> FROM `comments` WHERE `comments`.`id` >= $start AND `comments`.`id` <=
> $end AND `comments`.`delta` = 0 AND `comments`.`type` = 'TopicComment' GROUP
> BY `comments`.`id`, `comments`.`type` ORDER BY NULL
> > sql_query_range = SELECT IFNULL(MIN(`id`), 1), IFNULL(MAX(`id`), 1)
> FROM `comments` WHERE `comments`.`delta` = 0
> > sql_attr_uint = sphinx_internal_id
> > sql_attr_uint = class_crc
> > sql_attr_uint = sphinx_deleted
> > sql_attr_timestamp = created_at
> > sql_attr_timestamp = updated_at
> > sql_query_info = SELECT * FROM `comments` WHERE `id` = (($id - 4) / 6)
> > }
> >
> > index topic_comment_core
> > {
> > source = topic_comment_core_0
> > path = /home/bsturim/openmind/db/sphinx/production/topic_comment_core
> > morphology = stem_en
> > charset_type = utf-8
> > min_infix_len = 1
> > enable_star = 1
> > }
> >
> > source topic_comment_delta_0 : topic_comment_core_0
> > {
> > type = mysql
> > sql_host = localhost
> > sql_user = openmind
> > sql_pass = xxx
> > sql_db = openmind
> > sql_query_pre =
> > sql_query_pre = SET NAMES utf8
> > sql_query_pre = SET TIME_ZONE = '+0:00'
> > sql_query = SELECT SQL_NO_CACHE `comments`.`id` * 6 + 4 AS `id` ,
> `comments`.`body` AS `body`, `comments`.`id` AS `sphinx_internal_id`,
> CAST(IFNULL(CRC32(NULLIF(`comments`.`type`,'')), 432825427) AS UNSIGNED) AS
> `class_crc`, 0 AS `sphinx_deleted`, UNIX_TIMESTAMP(`comments`.`created_at`)
> AS `created_at`, UNIX_TIMESTAMP(`comments`.`updated_at`) AS `updated_at`
> FROM `comments` WHERE `comments`.`id` >= $start AND `comments`.`id` <=
> $end AND `comments`.`delta` = 1 AND `comments`.`type` = 'TopicComment' GROUP
> BY `comments`.`id`, `comments`.`type` ORDER BY NULL
> > sql_query_range = SELECT IFNULL(MIN(`id`), 1), IFNULL(MAX(`id`), 1)
> FROM `comments` WHERE `comments`.`delta` = 1
> > sql_attr_uint = sphinx_internal_id
> > sql_attr_uint = class_crc
> > sql_attr_uint = sphinx_deleted
> > sql_attr_timestamp = created_at
> > sql_attr_timestamp = updated_at
> > sql_query_info = SELECT * FROM `comments` WHERE `id` = (($id - 4) / 6)
> > }
> > index topic_comment_delta : topic_comment_core
> > {
> > source = topic_comment_delta_0
> > path = /home/bsturim/openmind/db/sphinx/production/topic_comment_delta
> > }
> >
> > index topic_comment
> > {
> > type = distributed
> > local = topic_comment_delta
> > local = topic_comment_core
> > }
> >
> > The size of the body of a comment can vary...it's not bounded. I would
> say average is 700 characters...
> >
> > Thanks.
> >
> > Bob
> >
> >
> > On Wed, Oct 6, 2010 at 8:27 PM, Pat Allan <[email protected]>
> wrote:
> > Hi Bob
> >
> > The HTML text may make a difference... you'll probably want to set the
> html_strip setting to true in your sphinx.yml file.
> > http://www.sphinxsearch.com/docs/manual-0.9.9.html#conf-html-strip
> >
> > Also: how large are the values in the body column? And what does the
> source look like in development.sphinx.conf? (Make sure you remove the
> database password!)
> >
> > Cheers
> >
> > --
> > Pat
> >
> > On 07/10/2010, at 11:11 AM, Robert Sturim wrote:
> >
> > > Thanks for your response. You are correct that my SQL query would pick
> up partial hits that Thinking Sphinx would not, but that's not the issue in
> this case -- first, because I have tried using wildcards in my test
> searches, and second, because I know of a number of cases in which my
> search term is a full word and is still missed.
> > >
> > > Two details I neglected to mention in my original post -- I'm
> using version 1.3.18. And the body of the comments I am searching
> against is HTML text. I'm not sure if that would make a difference.
> > >
> > > Thanks.
> > >
> > > Bob
> > >
> > > On Wed, Oct 6, 2010 at 6:31 PM, Pat Allan <[email protected]>
> wrote:
> > > Hi Bob
> > >
> > > Not entirely sure why this is happening, but your comparison with a SQL
> query isn't quite accurate - Sphinx matches full words by default (not
> prefixes/infixes) - so '%value%' is different to 'value'.
> > >
> > > If you do want partial word searching, this is covered in the docs:
> > > http://freelancing-god.github.com/ts/en/common_issues.html#wildcards
> > >
> > > Also, you may want to use :star => true to automatically add stars to
> each word in your search queries (so 'value' is treated as '*value*'). The
> other way to test this would be to modify the SQL query to match on word
> boundaries (perhaps using a regular expression?).
> > >
> > > Let us know if the numbers still aren't matching up then.
> > >
> > > Cheers
> > >
> > > --
> > > Pat
> > >
> > > On 06/10/2010, at 3:42 AM, Bob Sturim wrote:
> > >
> > > > Thinking Sphinx appears to not be finding all the results I would
> > > > expect it to find.
> > > >
> > > > > I have a comments table. The comments table uses single table
> > > > > inheritance to store comments for both Ideas and Topics...so I am
> > > > > searching on the entity TopicComment.
> > > >
> > > > > My define_index declaration in TopicComment.rb is:
> > > >
> > > > > define_index do
> > > > >   indexes body
> > > > >   has created_at, updated_at
> > > > >   set_property :delta => true
> > > > > end
> > > >
> > > > If I do the following:
> > > >
> > > > TopicComment.search('value')
> > > >
> > > > it will only find 6 hits...though if I go into sql and issue the
> > > > following query:
> > > >
> > > > SELECT * FROM comments
> > > > where type = 'TopicComment'
> > > > and body like '%value%'
> > > > order by topic_id
> > > >
> > > > it will retrieve 377 hits.
> > > >
> > > > > There are 4141 entries in the comments table that are of type
> > > > > TopicComment. If I rebuild my index using
> > > >
> > > > rake ts:rebuild
> > > >
> > > > it shows that 4141 documents were indexed:
> > > >
> > > >
> > > > indexing index 'topic_comment_core'...
> > > > collected 4141 docs, 2.9 MB
> > > > sorted 12.5 Mhits, 100.0% done
> > > > total 4141 docs, 2940976 bytes
> > > > total 8.233 sec, 357181 bytes/sec, 502.92 docs/sec
> > > > indexing index 'topic_comment_delta'...
> > > > collected 0 docs, 0.0 MB
> > > > total 0 docs, 0 bytes
> > > > total 0.001 sec, 0 bytes/sec, 0.00 docs/sec
> > > > skipping non-plain index 'topic_comment'...
> > > >
> > > > My sphinx.yml is defined as follows:
> > > >
> > > >
> > > > > production:
> > > > >   enable_star: 1
> > > > >   min_infix_len: 1
> > > > >   max_matches: 5000
> > > > >   morphology: stem_en
> > > >
> > > >
> > > > Am I missing something?
> > > >
> > > > Thanks very much.
> > > >
> > > > --
> > > > You received this message because you are subscribed to the Google
> Groups "Thinking Sphinx" group.
> > > > To post to this group, send email to
> [email protected].
> > > > To unsubscribe from this group, send email to
> [email protected]<thinking-sphinx%[email protected]>
> .
> > > > For more options, visit this group at
> http://groups.google.com/group/thinking-sphinx?hl=en.
> > > >
> > >
> > > --
> > > You received this message because you are subscribed to the Google
> Groups "Thinking Sphinx" group.
> > > To post to this group, send email to [email protected].
> > > To unsubscribe from this group, send email to
> [email protected]<thinking-sphinx%[email protected]>
> .
> > > For more options, visit this group at
> http://groups.google.com/group/thinking-sphinx?hl=en.
> > >
> > >
> > >
> > > --
> > > You received this message because you are subscribed to the Google
> Groups "Thinking Sphinx" group.
> > > To post to this group, send email to [email protected].
> > > To unsubscribe from this group, send email to
> [email protected]<thinking-sphinx%[email protected]>
> .
> > > For more options, visit this group at
> http://groups.google.com/group/thinking-sphinx?hl=en.
> >
> > --
> > You received this message because you are subscribed to the Google Groups
> "Thinking Sphinx" group.
> > To post to this group, send email to [email protected].
> > To unsubscribe from this group, send email to
> [email protected]<thinking-sphinx%[email protected]>
> .
> > For more options, visit this group at
> http://groups.google.com/group/thinking-sphinx?hl=en.
> >
> >
> >
> > --
> > You received this message because you are subscribed to the Google Groups
> "Thinking Sphinx" group.
> > To post to this group, send email to [email protected].
> > To unsubscribe from this group, send email to
> [email protected]<thinking-sphinx%[email protected]>
> .
> > For more options, visit this group at
> http://groups.google.com/group/thinking-sphinx?hl=en.
>
> --
> You received this message because you are subscribed to the Google Groups
> "Thinking Sphinx" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected]<thinking-sphinx%[email protected]>
> .
> For more options, visit this group at
> http://groups.google.com/group/thinking-sphinx?hl=en.
>
>
--
You received this message because you are subscribed to the Google Groups
"Thinking Sphinx" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to
[email protected].
For more options, visit this group at
http://groups.google.com/group/thinking-sphinx?hl=en.