Re: Incremental Search experiment with Lucene, sort of like the new Google Suggestion page

2004-12-12 Thread David Spencer
Chris Lamprecht wrote:
Very cool, thanks for posting this!  

Google's feature doesn't seem to do a search on every keystroke
necessarily.  Instead, it waits until you haven't typed a character
for a short period (I'm guessing about 100 or 150 milliseconds).  So
Thx again for tip, I updated my experiment 
(http://www.searchmorph.com/kat/isearch.html, as per below) to use a 
150ms delay to avoid some needless searches...TBD is more intelligent 
guidance or suggestions for the user.

if you type fast, it doesn't hit the server until you pause.  There
are some more detailed postings on slashdot about how it works.
On Fri, 10 Dec 2004 16:36:27 -0800, David Spencer
[EMAIL PROTECTED] wrote:
Google just came out with a page that gives you feedback as to how many
pages will match your query and variations on it:
http://www.google.com/webhp?complete=1hl=en
I had an unexposed experiment I had done with Lucene a few months ago
that this has inspired me to expose - it's not the same, but it's
similar in that as you type in a query you're given *immediate* feedback
as to how many pages match.
Try it here: http://www.searchmorph.com/kat/isearch.html
This is my SearchMorph site which has an index of ~90k pages of open
source javadoc packages.
As you type in a query, on every keystroke it does at least one Lucene
search to show results in the bottom part of the page.
It also gives spelling corrections (using my NGramSpeller
contribution) and also suggests popular tokens that start the same way
as your search query.
For one way to see corrections in action, type in rollback character
by character (don't do a cut and paste).
Note that:
-- this is not how the Google page works - just similar to it
-- I do single word suggestions while google does the more useful whole
phrase suggestions (TBD I'll try to copy them)
-- They do lots of javascript magic, whereas I use old school frames mostly
-- this is relatively expensive, as it does 1 query per character, and
when it's doing spelling correction there is even more work going on
-- this is just an experiment and the page may be unstable as I fool w/ it
What's nice is when you get used to immediate results, going back to the
batch way of searching seems backward, slow, and old fashioned.
There are too many idle CPUs in the world - this is one way to keep them
busier :)
-- Dave
PS Weblog entry updated too:
http://www.searchmorph.com/weblog/index.php?id=26
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


Re: Incremental Search experiment with Lucene, sort of like the new Google Suggestion page

2004-12-11 Thread David Spencer
Chris Lamprecht wrote:
Very cool, thanks for posting this!  

Google's feature doesn't seem to do a search on every keystroke
necessarily.  Instead, it waits until you haven't typed a character
for a short period (I'm guessing about 100 or 150 milliseconds). 
Ohh, good point - I was wondering how to cancel a URL - this is the 
right way. I'll try to code that in.

I also realized they're prob not doing searches at all - instead they're 
going off a DB of query popularity - I wanted to code up something 
generic, based just on term frequency but I don't think it'll be useful 
e.g. let's say
in my index (index of javadoc-generated documentation) the user types in
hash - well a human might guess that they intend hash map or 
hashmap or hash tree but I'm sure other terms are more frequent in 
my index than map and tree...I'm sure hash java occurs more 
frequently than hash map - or any other freq, non-stop word, and it's 
dubious that hash java is a useful suggestion...

So
if you type fast, it doesn't hit the server until you pause.  There
are some more detailed postings on slashdot about how it works.
On Fri, 10 Dec 2004 16:36:27 -0800, David Spencer
[EMAIL PROTECTED] wrote:
Google just came out with a page that gives you feedback as to how many
pages will match your query and variations on it:
http://www.google.com/webhp?complete=1hl=en
I had an unexposed experiment I had done with Lucene a few months ago
that this has inspired me to expose - it's not the same, but it's
similar in that as you type in a query you're given *immediate* feedback
as to how many pages match.
Try it here: http://www.searchmorph.com/kat/isearch.html
This is my SearchMorph site which has an index of ~90k pages of open
source javadoc packages.
As you type in a query, on every keystroke it does at least one Lucene
search to show results in the bottom part of the page.
It also gives spelling corrections (using my NGramSpeller
contribution) and also suggests popular tokens that start the same way
as your search query.
For one way to see corrections in action, type in rollback character
by character (don't do a cut and paste).
Note that:
-- this is not how the Google page works - just similar to it
-- I do single word suggestions while google does the more useful whole
phrase suggestions (TBD I'll try to copy them)
-- They do lots of javascript magic, whereas I use old school frames mostly
-- this is relatively expensive, as it does 1 query per character, and
when it's doing spelling correction there is even more work going on
-- this is just an experiment and the page may be unstable as I fool w/ it
What's nice is when you get used to immediate results, going back to the
batch way of searching seems backward, slow, and old fashioned.
There are too many idle CPUs in the world - this is one way to keep them
busier :)
-- Dave
PS Weblog entry updated too:
http://www.searchmorph.com/weblog/index.php?id=26
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


Re: Incremental Search experiment with Lucene, sort of like the new Google Suggestion page

2004-12-11 Thread Chris Hostetter
: I also realized they're prob not doing searches at all - instead they're
: going off a DB of query popularity - I wanted to code up something

you are correct, hence the reason cnet banana doesn't appear in the list
of suggestions even though it has 41K results, but hossman trophy does
(with less then 1K results)

They're building up the list based on search frequency, not term
frequency.


-Hoss


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: Incremental Search experiment with Lucene, sort of like the new Google Suggestion page

2004-12-10 Thread Chris Lamprecht
Very cool, thanks for posting this!  

Google's feature doesn't seem to do a search on every keystroke
necessarily.  Instead, it waits until you haven't typed a character
for a short period (I'm guessing about 100 or 150 milliseconds).  So
if you type fast, it doesn't hit the server until you pause.  There
are some more detailed postings on slashdot about how it works.

On Fri, 10 Dec 2004 16:36:27 -0800, David Spencer
[EMAIL PROTECTED] wrote:
 
 Google just came out with a page that gives you feedback as to how many
 pages will match your query and variations on it:
 
 http://www.google.com/webhp?complete=1hl=en
 
 I had an unexposed experiment I had done with Lucene a few months ago
 that this has inspired me to expose - it's not the same, but it's
 similar in that as you type in a query you're given *immediate* feedback
 as to how many pages match.
 
 Try it here: http://www.searchmorph.com/kat/isearch.html
 
 This is my SearchMorph site which has an index of ~90k pages of open
 source javadoc packages.
 
 As you type in a query, on every keystroke it does at least one Lucene
 search to show results in the bottom part of the page.
 
 It also gives spelling corrections (using my NGramSpeller
 contribution) and also suggests popular tokens that start the same way
 as your search query.
 
 For one way to see corrections in action, type in rollback character
 by character (don't do a cut and paste).
 
 Note that:
 -- this is not how the Google page works - just similar to it
 -- I do single word suggestions while google does the more useful whole
 phrase suggestions (TBD I'll try to copy them)
 -- They do lots of javascript magic, whereas I use old school frames mostly
 -- this is relatively expensive, as it does 1 query per character, and
 when it's doing spelling correction there is even more work going on
 -- this is just an experiment and the page may be unstable as I fool w/ it
 
 What's nice is when you get used to immediate results, going back to the
 batch way of searching seems backward, slow, and old fashioned.
 
 There are too many idle CPUs in the world - this is one way to keep them
 busier :)
 
 -- Dave
 
 PS Weblog entry updated too:
 http://www.searchmorph.com/weblog/index.php?id=26
 
 -
 To unsubscribe, e-mail: [EMAIL PROTECTED]
 For additional commands, e-mail: [EMAIL PROTECTED]
 


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Incremental Search experiment with Lucene, sort of like the new Google Suggestion page

2004-12-10 Thread David Spencer
Google just came out with a page that gives you feedback as to how many 
pages will match your query and variations on it:

http://www.google.com/webhp?complete=1hl=en
I had an unexposed experiment I had done with Lucene a few months ago 
that this has inspired me to expose - it's not the same, but it's 
similar in that as you type in a query you're given *immediate* feedback 
as to how many pages match.

Try it here: http://www.searchmorph.com/kat/isearch.html
This is my SearchMorph site which has an index of ~90k pages of open 
source javadoc packages.

As you type in a query, on every keystroke it does at least one Lucene 
search to show results in the bottom part of the page.

It also gives spelling corrections (using my NGramSpeller 
contribution) and also suggests popular tokens that start the same way 
as your search query.

For one way to see corrections in action, type in rollback character 
by character (don't do a cut and paste).

Note that:
-- this is not how the Google page works - just similar to it
-- I do single word suggestions while google does the more useful whole 
phrase suggestions (TBD I'll try to copy them)
-- They do lots of javascript magic, whereas I use old school frames mostly
-- this is relatively expensive, as it does 1 query per character, and 
when it's doing spelling correction there is even more work going on
-- this is just an experiment and the page may be unstable as I fool w/ it

What's nice is when you get used to immediate results, going back to the 
batch way of searching seems backward, slow, and old fashioned.

There are too many idle CPUs in the world - this is one way to keep them 
busier :)

-- Dave
PS Weblog entry updated too: 
http://www.searchmorph.com/weblog/index.php?id=26



-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]