Thanks, by the way, to Jantho on IRC, who plucked the name Harpoon from a
random dictionary search.
Eerie how appropriate that word is :)
> so you're running a Freenet spider too. Guess that means war ;-)
> Or just competition. Or mutual learning. Or collaboration.
Yesterday, exhausted from a full-on day's work the day before, I was grabbed
by the mad idea of doing this as an experiment.
It started out as a desire to pull content closer to my node so I wouldn't
have to wait ages for freesites to load.
I also saw it as a possible FreeWeb plugin to make content more visible.
For ideas, I tried a Windows app called 'NearSite', which crawls a selection
of start URLs and is intended for maintaining a fresh cache of frequently
updated stuff like stockmarket, news etc. But NearSite is $hareware, and it
has another major disadvantage: it stores Freenet content unencrypted on the
user's machine. No-one needs that liability or exposure.
Enter ht://dig, which discards content immediately after indexing.
I was blown away by how easily the whole thing fell into place with
ht://dig. It compiled easily on Windoze (thanks to Cygwin), and was an easy
hack to get working with Freenet URLs. Offering it as a public search engine
only occurred to me later.
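
The index-then-discard idea, together with boolean queries using 'and', 'or',
'not' and parentheses, can be sketched roughly as below. This is an
illustration only, not how ht://dig is implemented: the Index class, its
method names and the example URIs are all made up for the sketch. The point
is that only word-to-URI mappings are kept, never the page text.

```python
import re


class Index:
    """Toy inverted index: keeps word -> URIs, never the page text itself."""

    def __init__(self):
        self.postings = {}   # word -> set of URIs containing that word
        self.uris = set()    # every URI indexed so far (needed for 'not')

    def add(self, uri, text):
        """Index a page, then let the caller throw the text away."""
        self.uris.add(uri)
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            self.postings.setdefault(word, set()).add(uri)

    def search(self, query):
        """Evaluate a query; precedence is not > and > or, parentheses group."""
        tokens = re.findall(r"[()]|[a-z0-9]+", query.lower())
        pos = 0

        def peek():
            return tokens[pos] if pos < len(tokens) else None

        def take():
            nonlocal pos
            tok = peek()
            pos += 1
            return tok

        def parse_or():
            result = parse_and()
            while peek() == "or":
                take()
                result = result | parse_and()
            return result

        def parse_and():
            result = parse_not()
            while peek() == "and":
                take()
                result = result & parse_not()
            return result

        def parse_not():
            tok = take()
            if tok == "not":
                return self.uris - parse_not()
            if tok == "(":
                result = parse_or()
                if take() != ")":
                    raise ValueError("missing ')'")
                return result
            if tok is None or tok in (")", "and", "or"):
                raise ValueError("expected a search word")
            return set(self.postings.get(tok, ()))

        result = parse_or()
        if peek() is not None:
            raise ValueError("unexpected token: " + repr(peek()))
        return result
```

Usage would look like calling add() once per crawled page and then
search("freenet and not spider"). The result is a set of URIs; the original
page content is never held by the index.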
I don't have any desire for Harpoon to be the 'One True Search Engine'.
But I would like to have it established as *one* of the major Freenet search
facilities.
The mainstream Net is enriched by its diversity of search engines, from 3-6
major general engines through to millions of minor specialist engines.
I appreciate being able to choose from Yahoo, Altavista, Google etc. Stuff
that doesn't show on one engine shines up plain as day on another.
Freenet would be best served IMO by a handful of major engines, each based
on different methods of spidering, indexing etc. Some spider-based, some
based on keywords, some hand-edited, some automatic etc etc. What one engine
misses, another will find. We all serve the punters best by giving them a
healthy choice.
Stefan, there's no way I want to drive your or anyone else's portal out of
usage. Keep going with Freegle.
> Do you plan to add in-Freenet searching?
Sure do.
This will involve distributing a local prog (binaries for Windows, modified
ht://dig source distro for Linux), plus a freesite containing the latest
Harpoon database. Add a prog which automatically downloads the search
database periodically, and Bob's your uncle :)
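
A rough sketch of what such a periodic downloader might look like, assuming
the database is published under a single key and fetched through the local
FProxy. The key name DB_KEY is hypothetical, the FProxy address is the one
mentioned elsewhere in this thread, and the URL layout (address, slash, key)
is an assumption. None of this is actual Harpoon code.

```python
import os
import tempfile
import urllib.request

FPROXY = "http://127.0.0.1:8081"       # FProxy address mentioned in this thread
DB_KEY = "MSK@EXAMPLE/harpoon//db"     # hypothetical key, not a real one


def fetch_via_fproxy(key, timeout=300.0):
    """Fetch a key through the local FProxy (URL layout is an assumption)."""
    with urllib.request.urlopen(FPROXY + "/" + key, timeout=timeout) as resp:
        return resp.read()


def refresh_database(dest, fetch=fetch_via_fproxy):
    """Download the database and swap it into place.

    Returns True on success. On any fetch failure or empty reply the
    existing file at dest is left untouched and False is returned.
    """
    try:
        data = fetch(DB_KEY)
    except OSError:                     # urllib's URLError is an OSError
        return False
    if not data:
        return False
    directory = os.path.dirname(os.path.abspath(dest))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    os.replace(tmp, dest)               # rename over the old copy in one step
    return True
```

Something like cron, or a simple loop with a sleep between calls, could run
refresh_database() at whatever interval suits. Writing to a temporary file
first means a search in progress never sees a half-downloaded database.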
> And what about all the other nifty features that Freegle already
> incorporates? (SCNR...)
These are Freegle's claim to fame.
Harpoon on the other hand features:
1) boolean search
2) only displays results for retrievable pages
3) excludes the millions of defunct single-file keys (all those goddam KSKs,
CHKs, SSKs etc, 95+% of which fail)
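
Point 3 boils down to a key-type check on each URI before it is crawled or
listed. A minimal sketch of that rule follows, assuming URIs may carry an
optional 'freenet:' prefix and that the key type is whatever precedes the
first '@'. The function names are invented for illustration and are not
taken from Harpoon or ht://dig.

```python
def key_type(uri):
    """Return the upper-cased key type of a Freenet URI, or '' if none found.

    Assumes an optional 'freenet:' prefix and that the key type is the
    text before the first '@' (e.g. 'MSK' in 'MSK@...').
    """
    uri = uri.strip()
    if uri.lower().startswith("freenet:"):
        uri = uri[len("freenet:"):]
    head, sep, _rest = uri.partition("@")
    return head.upper() if sep else ""


def is_indexable(uri):
    """True only for freesite URIs, i.e. those beginning with MSK."""
    return key_type(uri) == "MSK"


def filter_indexable(uris):
    """Keep the MSK URIs and drop CHKs, KSKs, SVKs, SSKs and anything else."""
    return [uri for uri in uris if is_indexable(uri)]
```

A crawler would apply is_indexable() to every link it extracts, so stray
single-file keys never reach the index in the first place.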
Cheers
David
----- Original Message -----
From: "Stefan Reich" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Monday, July 02, 2001 01:52
Subject: Re: [freenet-chat] Harpoon Search - user notes
> Hi David,
>
> so you're running a Freenet spider too. Guess that means war ;-)
>
> Or just competition. Or mutual learning. Or collaboration.
>
> So... what I would like to know: Where do you want to take this project?
> Was this just a quick shot at hacking htdig to adapt it to Freenet? Or is
> it the beginning of something bigger?
>
> Do you plan to add in-Freenet searching?
>
> And what about all the other nifty features that Freegle already
> incorporates? (SCNR...)
>
> -Stefan
>
> ----- Original Message -----
> From: "David McNab" <[EMAIL PROTECTED]>
> To: <[EMAIL PROTECTED]>
> Sent: Sunday, July 01, 2001 3:31 PM
> Subject: [freenet-chat] Harpoon Search - user notes
>
>
> > Notes for Harpoon Search
> >
> > 1. Harpoon Search is a fully spidered Freenet search engine. This means
> > that you can search for text or html pages containing certain words in
> > their text, as opposed to just searching on URIs or titles.
> >
> > 2. Temporary URL for Harpoon is http://harpoonsearch.cjb.net
> >
> > 3. Search engine results are based on your node's FProxy being at
> > http://127.0.0.1:8081
> >
> > 4. Using Harpoon directly is non-anonymous. However, I do regularly
> > delete all access logs. If you feel wary, then please access Harpoon
> > through your favourite anonymising http proxy.
> >
> > 5. You can do boolean searches, using 'and', 'or', 'not' and
> > parentheses, like Altavista
> >
> > 6. You can sort by relevance, time or title.
> >
> > 7. Phrase searching not available - but will be added soon
> >
> > 8. Search results are confined to pages within standard freesites, or
> > URIs beginning with 'MSK'. For example, Snarfoo, Content of Evil etc.
> >
> > No stray CHKs, KSKs or other non-freesite URIs will appear. This is
> > because the spider is busy enough trying to track down the MSK pages as
> > it is, and such a low percentage of advertised CHKs are actually
> > retrievable.
> >
> > 9. A submission form will be added in the near future, so you can add
> > your own freesite (again, MSKs only. CHKs, KSKs, SVKs and SSKs will be
> > automatically rejected).
> >
> > 10. Fuzzy searching not available, but will be added if there is popular
> > demand.
> >
> > 11. The server running Harpoon does not necessarily contain the pages
> > returned in the search results. I have no way of determining whether any
> > results actually exist on the server. Also, the spidering is fully
> > automatic - there is no way for me to screen content. The search results
> > generated by Harpoon point to your own local node. The fact that a
> > hyperlink generated by Harpoon works for you does not imply the
> > existence of the page on the Harpoon server.
> >
> > 12. The search database is constructed by a recursive crawl starting at
> > the FProxy Gateway page.
> >
> > 13. Bugs, suggestions etc - please email me -
> > [EMAIL PROTECTED]
> >
> > Cheers
> > David
> >
> >
> > ----- Original Message -----
> > From: "David McNab" <[EMAIL PROTECTED]>
> > To: <[EMAIL PROTECTED]>
> > Sent: Monday, July 02, 2001 00:14
> > Subject: [freenet-chat] Harpoon - your new Freenet search engine
> >
> >
> > > Hi fellow Freenetters,
> > >
> > > It gives me pleasure to announce 'Harpoon', a new Freenet search
> > > engine residing on the WWW.
> > >
> > > Harpoon is a true crawler, which indexes all the currently viewable
> > > freesites.
> > > So you can search on words which occur within current freesite pages.
> > >
> > > It is my intention to keep an update daemon running regularly, so that
> > > most or all of the search results that come up will be live,
> > > retrievable pages.
> > >
> > > The temporary web address for Harpoon is:
> > > http://harpoonsearch.cjb.net
> > >
> > > Enjoy!
> > > David
> > >
> > >
> > >
> > > _______________________________________________
> > > Chat mailing list
> > > [EMAIL PROTECTED]
> > > http://lists.freenetproject.org/mailman/listinfo/chat
> > >