Hi there,

Mario Lang wrote:
> Can anyone explain how Ht::dig was "embedded" in the
> midgard-project.org site? I've played with ht::dig and would like to
> integrate it into my project.

OK, here it comes. The search system is based on an article published on
DevShed (http://www.devshed.com/Server_Side/PHP/Search_This/). The
ht://Dig search engine is configured to index the lite version of the
Midgard web site (http://lite.midgard-project.org/) so that the indexed
pages are not full of unnecessary keywords. (On a related note, the lite
version is a great example of the possibilities of the style mechanism!)
We run the htdig indexer every 24 hours.

The search page emulates a CGI-like environment and runs the htsearch
program. The htsearch command is configured to produce output in a
format that is easy for the script to parse. The page extracts the
information from the htsearch output and displays it.
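
To make the parsing step concrete, here is a minimal sketch of how the
rows collected by exec() could be turned into a structure. It is not the
code running on the site: the function name parse_htsearch_rows() and
the array keys are made up for illustration, and it assumes the row
layout used by the page code at the end of this message (the first two
rows are skipped, row 2 holds a status word or the match count, and
each hit takes five rows).

```php
<?php
/* Hypothetical helper, not part of the original page. Assumes rows 0-1
   are skipped (as the page code does), row 2 is NOMATCH, SYNTAXERROR or
   the match count, rows 3-5 are pages/first/last, and every hit takes
   five rows: URL, TITLE, SIZE, PERCENT, EXCERPT. */
function parse_htsearch_rows(array $rows)
{
    $count = count($rows);
    if ($count < 3) {
        return array('status' => 'FAILED', 'hits' => array());
    }
    if ($rows[2] === 'NOMATCH' || $rows[2] === 'SYNTAXERROR') {
        return array('status' => $rows[2], 'hits' => array());
    }
    if ($count < 6) {
        return array('status' => 'FAILED', 'hits' => array());
    }

    $parsed = array(
        'status'  => 'OK',
        'matches' => (int) $rows[2],
        'pages'   => (int) $rows[3],
        'first'   => (int) $rows[4],
        'last'    => (int) $rows[5],
        'hits'    => array(),
    );

    /* Hits start at row 6; an incomplete trailing group is ignored. */
    for ($i = 6; $i + 4 < $count; $i += 5) {
        $parsed['hits'][] = array(
            'url'     => $rows[$i],
            'title'   => $rows[$i + 1],
            'size'    => (int) $rows[$i + 2],
            'percent' => (int) $rows[$i + 3],
            'excerpt' => $rows[$i + 4],
        );
    }
    return $parsed;
}
```

With something like this, the page only has to loop over
$parsed['hits'] and print each entry.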

This setup allows us to completely customize the search engine without
any changes to the htsearch program. The system cleanly separates three
parts: the underlying search engine, its user interface, and the look
and navigation structure of the site. These parts are implemented by
ht://Dig, a Midgard page, and a Midgard style respectively.
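
One way to picture the separation: the small presentation decisions
(how many stars a match gets, how long a title may be) can live in the
page as plain functions, with nothing changed in htsearch. The sketch
uses hypothetical helper names, relevance_stars() and shorten(), and
simply mirrors the logic of the page code at the end of this message.

```php
<?php
/* Hypothetical helpers for the user-interface layer; the names are
   illustrative and do not appear in the original page. */

/* One ' *' for every 20 percentage points above 10, so at most five. */
function relevance_stars($percent)
{
    $stars = '';
    for ($j = 10; $j < $percent; $j += 20) {
        $stars .= ' *';
    }
    return $stars;
}

/* Cut long strings to $max characters, ending in '...'. */
function shorten($text, $max = 50)
{
    return (strlen($text) > $max)
        ? substr($text, 0, $max - 3) . '...'
        : $text;
}
```

Changing the rating scale or the title length then touches only the
page, not the search engine or the site style.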

Jukka

The relevant code, with comments:

<? if ($search) {
     $HTSEARCH = '/opt/htdig/bin/htsearch';

     /* The CGI QUERY_STRING is built in the variable $qstring. */
     $qstring = sprintf("words=%s&method=%s",
                        UrlEncode($search), UrlEncode($method));
     if ($page)
       $qstring .= "&page=$page";
     else
       $page = 1;

     /* Emulate the CGI environment */
     putenv("QUERY_STRING=$qstring");
     putenv("REQUEST_METHOD=GET");

     /* Execute search and gather the results */
     exec($HTSEARCH, $result, $status);

     $rows = count($result);
     if ($status || $rows < 3) { ?>
<p>Search failed. Please file an error report.</p>
<[error-report]>
<?   } elseif ($result[2] == 'NOMATCH') { ?>
<p>No matches for <strong>&(search:T);</strong> found.</p>
<?   } elseif ($result[2] == 'SYNTAXERROR') { ?>
<p>There was an error in your search syntax. Please try again.</p>
<?   } else {
       $matches = $result[2];
       $pages   = $result[3];
       $first   = $result[4];
       $last    = $result[5]; ?>
<p>Matches <strong>&(first);</strong> - <strong>&(last);</strong> of
   <strong>&(matches);</strong> total.
   <? echo '*'; ?>'s indicate better matches.</p>
<?     for ($i = 6; $i < $rows; $i += 5) {
         /* The htsearch is configured to return matches in the
            following format, each part on its own row:
            URL
            TITLE
            SIZE
            PERCENT
            EXCERPT
         */
         $url     = $result[$i+0];
         $title   = $result[$i+1];
         $size    = $result[$i+2];
         $percent = $result[$i+3];
         $excerpt = $result[$i+4];
         for ($stars = '', $j = 10; $j < $percent; $j += 20) {
           $stars .= ' *';
         }
         printf('<dl><dt><strong><a href="%s">%s</a></strong>%s</dt>' . "\n",
                str_replace('http://lite.midgard-project.org', '', $url),
                (strlen($title) > 50) ? substr($title, 0, 47) . '...' : $title,
                $stars);
         printf('    <dd>%s<br>%s', $excerpt, "\n");
         printf('    <font color="gray">%s (%d%%, %skB)</font></dd></dl>%s',
                (strlen($url) > 50) ? substr($url, 0, 47) . '...' : $url,
                $percent, ceil($size/1024), "\n");
       }
   } ?>

--
This is The Midgard Project's mailing list. For more information,
please visit the project's web site at http://www.midgard-project.org
