On Fri, 2003-11-14 at 18:11, Neal Richter wrote:
>   OK, so I forgot the #2 Question...
> 
> > Questions:
> >
> > 1) If you run an htdump -w before and after the purge, do the db.docs
> > files differ?

 No.

>   2) If you temporarily replace your start_url with the one you want to
> re-add, and rerun 'htdig -v -c xxxx' does it add it properly?  Does for
> me.. and it shows up again in search results.

 Not that either, unfortunately.

> I've got this kind of thing working in libhtdig & libhtdigphp.  How
> attached are you to your current implementation?????

 Not terribly. I can change it if needed. Here's what I'm doing:

 I've got a couple of stored procedures in PostgreSQL that allow me to
query htdig databases and return indexes. Basically, htdig returns
URLs of the following type:

http://newfind.mcgill.ca/indexes/ads/?AdsID=1026194

where the integer at the end of the URL is the primary key of the Ads
table in our database. This allows me to do things like this:

http://newfind.mcgill.ca/ads/?words=jazz+guitar

which is a nifty way of doing full text indexing in PostgreSQL. The only
alternative at the moment in Postgres is to use GiST indexes and
tsearch, which is incredibly complex and so poorly documented that even
the core Postgres developers are unable to get it to work.
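
As a rough illustration of the URL convention described above, here is a
minimal sketch of pulling the primary key back out of a result URL. It is
written in Python only to keep the example short and is not part of the
actual setup; the function name extract_ads_id is made up for this sketch,
and the AdsID parameter name is taken from the example URL.

```python
from urllib.parse import urlparse, parse_qs


def extract_ads_id(url: str) -> int:
    """Return the integer AdsID query parameter of an htdig result URL.

    Hypothetical helper: the name and the error handling are assumptions,
    only the URL shape comes from the example above.
    """
    query = parse_qs(urlparse(url).query)
    values = query.get("AdsID")
    if not values:
        raise ValueError(f"no AdsID parameter in {url!r}")
    return int(values[0])


# Usage with the example URL from this message.
print(extract_ads_id("http://newfind.mcgill.ca/indexes/ads/?AdsID=1026194"))  # prints 1026194
```

Parsing the query string explicitly, rather than stripping every non-digit
from the line, keeps working if the host name or path ever contains digits.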

So, I've got a PL/Perl script (with a PL/pgSQL wrapper) in Postgres that
returns these integers (above) as rows. This means I can do the
following type of query:

SELECT * FROM htdig('"reasonable offer"', 'ads');

and it returns this:

 item_id    |  htdig_order
------------+---------------
  1014752   |  1 
  1026970   |  2 


All of this is working very nicely. The thing I want to do now is write
the stored procedures that get htdig to re-index a new item. If I can do
this in PHP, that's fine. Just tell me how to build PHP/HtDig with 
libhtdigphp and how to use it and I'm there.

At the bottom of this email are the two stored procedures and the data type
definition, if you are interested. The PL/pgSQL function is a required wrapper
because, at the moment, PL/Perl functions can't return sets, only simpler
data types. If anyone else is interested in this, I can send all the
code and documentation when it is completely built. I know that there
are folks on the Postgres list who are interested in this when it is
done.

Cheers,

Chris


-- 
Christopher Murtagh
Enterprise Systems Administrator
ISR / Web Communications Group 
McGill University
Montreal, Quebec
Canada

Tel.: (514) 398-3122
Fax:  (514) 398-2017



CREATE TYPE htdig AS (item_id int, htdig_order int);

CREATE OR REPLACE FUNCTION htdig(text, text) RETURNS SETOF htdig AS '
DECLARE
  result text[];
  low integer;
  high integer;
  item htdig%rowtype;
BEGIN
    result := htsearch($1,$2);
    low  := 1;
    high := array_upper(result, 1);

    FOR i IN low..high LOOP
      item.item_id     := result[i];
      item.htdig_order := i;
      RETURN NEXT item;
    END LOOP;
  RETURN;
END;
' LANGUAGE 'plpgsql' STABLE STRICT;

CREATE OR REPLACE FUNCTION htsearch(text, text)
RETURNS text[] AS '
 my $SearchTerms = $_[0];
 my $DBName = $_[1];
 my @Result;
 my $Line;
 $DBName =~ s/[^a-z]//g; # dbname may contain only lowercase letters
 $SearchTerms =~ s/['']/ /g; # strip single quotes so the terms cannot escape the shell quoting below
 
 # The command has to stay on one line: a line break inside it would split the shell command.
 open HTSEARCH, "/usr/local/htdig/bin/htsearch -c /usr/local/htdig/conf/${DBName}.conf " .
   "''config=${DBName};words=${SearchTerms};matchesperpage=1000;'' |";
 
 while(<HTSEARCH>) {
    $Line = $_;
    $Line =~ s/[^0-9-]//g;
    chomp($Line);
    push @Result, $Line;
 }
 close HTSEARCH;
 return qq/{/ . (join qq/,/, @Result) . qq/}/;
' LANGUAGE plperlu;
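
For anyone who wants to try the same idea outside Postgres, the two
functions above can be approximated by the sketch below. It is only an
illustration in Python, not the code in use here. The names build_command,
parse_ids and htdig_rows are invented for the sketch; the htsearch path,
the -c option and the query string are copied from the PL/Perl function.
Two deliberate differences: the command is passed as an argument list, so
no shell is involved and quotes in the search terms need no stripping, and
lines without any digits are skipped instead of becoming empty elements.

```python
import re
import subprocess
from typing import List, Tuple

HTSEARCH = "/usr/local/htdig/bin/htsearch"  # path copied from the PL/Perl function
CONF_DIR = "/usr/local/htdig/conf"          # likewise


def build_command(search_terms: str, db_name: str) -> List[str]:
    """Build the htsearch argument vector (hypothetical helper)."""
    db_name = re.sub(r"[^a-z]", "", db_name)  # same whitelist as the Perl code
    query = f"config={db_name};words={search_terms};matchesperpage=1000;"
    return [HTSEARCH, "-c", f"{CONF_DIR}/{db_name}.conf", query]


def parse_ids(output: str) -> List[Tuple[int, int]]:
    """Turn htsearch output into (item_id, htdig_order) pairs."""
    ids = []
    for line in output.splitlines():
        digits = re.sub(r"[^0-9]", "", line)  # keep only the digits of each line
        if digits:                             # skip lines that contain no digits
            ids.append(int(digits))
    # Number the ids from 1 in the order htsearch printed them.
    return [(item_id, order) for order, item_id in enumerate(ids, start=1)]


def htdig_rows(search_terms: str, db_name: str) -> List[Tuple[int, int]]:
    """Run htsearch and return ranked ids; needs a local htdig install."""
    result = subprocess.run(build_command(search_terms, db_name),
                            capture_output=True, text=True, check=True)
    return parse_ids(result.stdout)
```

Calling htdig_rows('"reasonable offer"', "ads") on a machine with the same
htdig layout should give pairs in the same shape as the item_id and
htdig_order columns shown earlier. The parsing step can be tried on its own
by passing sample text to parse_ids.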




_______________________________________________
ht://Dig Developer mailing list:
[EMAIL PROTECTED]
List information (subscribe/unsubscribe, etc.)
https://lists.sourceforge.net/lists/listinfo/htdig-dev
