Re: suggest: make searches efficient

2008-01-09 Thread Mikkel Kamstrup Erlandsen
On 09/01/2008, Debajyoti Bera [EMAIL PROTECTED] wrote:
> > I.e. cluster/facet extraction on the result set? It is far from trivial
> > to do in an efficient and scalable way, but it can be done. The
> > website I linked to has 10M items in the index.

> It's easy to do such fancy data-mining tricks on a web server. On a desktop,
> such fancy things might cause annoying CPU spikes. Still, this needs to be
> implemented to see how well it can work.

The example I sent you uses an external structure we call the facet
map to look up clusters on the fly[1]. It is an addition to the
Lucene index that can be updated with reasonably little overhead,
which still might be too much for a desktop box, of course.
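
For illustration only, here is a minimal sketch of what such a side structure could look like. It is not the Java code mentioned in [1] and it is not tied to Lucene; the names FacetMap, add_document, remove_document and facet_counts are invented for this example, and documents are assumed to be identified by a plain id.

```python
from collections import Counter, defaultdict


class FacetMap:
    """Hypothetical side table kept next to the main index."""

    def __init__(self):
        # doc_id -> {facet_name: set of facet values}
        self._facets = defaultdict(lambda: defaultdict(set))

    def add_document(self, doc_id, facets):
        """Record facet values for one document (called at index time)."""
        for name, values in facets.items():
            self._facets[doc_id][name].update(values)

    def remove_document(self, doc_id):
        """Forget a document that was removed from the index."""
        self._facets.pop(doc_id, None)

    def facet_counts(self, result_ids):
        """Count (facet, value) pairs over a result set, on the fly."""
        counts = Counter()
        for doc_id in result_ids:
            # .get() avoids creating entries for unknown ids
            for name, values in self._facets.get(doc_id, {}).items():
                for value in values:
                    counts[(name, value)] += 1
        return counts
```

Updating the map is one dictionary write per indexed document, and a lookup costs time proportional to the size of the result set. Whether that is cheap enough on a desktop would still have to be measured.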

> Aside from the question of scalability, I am wondering how to display this in the
> beagle-search GUI without cluttering the interface.

Well, I suggest showing results like in that link :-)

Keep a flat list of all results that can then be refined by clicking
on the clusters.
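
A rough sketch of that refinement step, assuming each result is a dict with a "uri" and a "properties" dict (this shape and the function name refine are made up for the example, not Beagle's API):

```python
def refine(results, selected):
    """Keep results matching every selected (property, value) pair.

    The order of the flat list is preserved, so clicking a cluster only
    narrows the list and never reshuffles it.
    """
    return [
        r for r in results
        if all(r["properties"].get(name) == value for name, value in selected)
    ]


flat_list = [
    {"uri": "file:///a.pdf", "properties": {"type": "file", "ext": "pdf"}},
    {"uri": "email://1", "properties": {"type": "email"}},
    {"uri": "file:///b.txt", "properties": {"type": "file", "ext": "txt"}},
]

# Clicking the "type: file" cluster, then "ext: pdf"
only_files = refine(flat_list, [("type", "file")])
only_pdfs = refine(flat_list, [("type", "file"), ("ext", "pdf")])
```

Each click just adds one more pair to the selection, and clearing the selection gives back the full list.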

Cheers,
Mikkel

[1]: I can provide a pointer to the code if anybody wants to look at
it. It is Java though.
___
Dashboard-hackers mailing list
Dashboard-hackers@gnome.org
http://mail.gnome.org/mailman/listinfo/dashboard-hackers


Re: suggest: make searches efficient

2008-01-09 Thread Debajyoti Bera
> > Aside from the question of scalability, I am wondering how to display this in
> > the beagle-search GUI without cluttering the interface.

> Well, I suggest showing results like in that link :-)

> Keep a flat list of all results that can then be refined by clicking
> on the clusters.

Beagle-search does not show a flat list of results, and I don't see how to
implement a sidebar with all those extra links (buttons?) without making it
look cluttered.

Clustering/faceting should not be hard to implement; it's a widely known and
used idea. I am more worried about the user interaction part. OTOH, I am
planning on showing a cluster in the web interface. One major difference
there: users are used to seeing lists of clickable text (with scrollbars) in
a browser.

Thanks for your suggestion though,
- dBera

-- 
-
Debajyoti Bera @ http://dtecht.blogspot.com
beagle / KDE fan
Mandriva / Inspiron-1100 user


Two patches for Xesam adaptor

2008-01-09 Thread Anders Rune Jensen
Hello

Two patches for the xesam adaptor:

title: fixes the title field, which Beagle apparently stores as dc:title rather than title
snippet: implements snippet support

-- 
Anders Rune Jensen
http://people.iola.dk/anders/
--- /home/arj/source/beagle-xesam/src/Ontologies.cs	2007-12-18 15:30:05.0 +0100
+++ src/Ontologies.cs	2008-01-09 16:53:00.0 +0100
@@ -48,8 +48,8 @@
 			{
 fields_mapping = new Dictionary<string, string> ();
 
-fields_mapping.Add ("dc:title", "title");
-fields_mapping.Add ("xesam:title", "title");
+fields_mapping.Add ("dc:title", "dc:title");
+fields_mapping.Add ("xesam:title", "dc:title");
 
 fields_mapping.Add ("dc:author", "author");
 fields_mapping.Add ("xesam:author", "author");
--- /home/arj/source/beagle-xesam/src/Ontologies.cs	2007-12-18 15:30:05.0 +0100
+++ src/Ontologies.cs	2008-01-09 16:53:00.0 +0100
@@ -73,6 +73,8 @@
 
 fields_mapping.Add ("xesam:fileExtension", "beagle:FilenameExtension");
 fields_mapping.Add ("fileExtension", "beagle:FilenameExtension");
+
+fields_mapping.Add ("snippet", "snippet");
 			}
 
 			private static void InitializeSourcesMapping ()
--- /home/arj/source/beagle-xesam/src/Search.cs	2007-12-21 12:10:34.0 +0100
+++ src/Search.cs	2008-01-09 17:08:57.0 +0100
@@ -53,7 +53,7 @@
 get { return bHit; }
 			}
 
-			public Hit (uint id, Beagle.Hit hit, string[] fields)
+		public Hit (uint id, Beagle.Hit hit, string[] fields, Query query)
 			{
 this.id = id;
 bHit = hit;
@@ -77,7 +77,13 @@
 	case "date":
 		hitValue [i++] = hit.Timestamp.ToString ("s");
 		break;
-		
+
+	case "snippet":
+	SnippetRequest sreq = new SnippetRequest (query, hit);
+		SnippetResponse sresp = (SnippetResponse) sreq.Send ();
+		hitValue [i++] = sresp.Snippet != null ? sresp.Snippet : String.Empty;
+		break;
+	
 	default:
 		//FIXME: This *will* break since we don't know what the expected
 		//type here is
@@ -211,7 +217,7 @@
 mutex.WaitOne ();
 
 foreach (uint id in ids) {
-	Hit hit = new Hit (id, hits [id].BeagleHit, fields);
+Hit hit = new Hit (id, hits [id].BeagleHit, fields, query);
 	ret.Add (hit.Value);
 }
 
@@ -227,8 +233,8 @@
 // cache the hits and keep them nice and safe
 Console.Error.WriteLine ("{0}: Got some hits: {1}", id, response.Hits.Count);
 foreach (Beagle.Hit bHit in response.Hits) {
-//	Console.Error.WriteLine ("+Hit: {0}", bHit.Uri);
-	newHits.Add (hitCount++, new Xesam.Hit (hitCount, bHit, parentSession.HitFields));
+	Console.Error.WriteLine ("+Hit: {0}", bHit.Uri);
+	newHits.Add (hitCount++, new Xesam.Hit (hitCount, bHit, parentSession.HitFields, query));
 }
 
 if (newHits.Count > 0 && HitsAddedHandler != null) {


Re: suggest: make searches efficient

2008-01-09 Thread Joe Shaw
Hi,

On Jan 9, 2008 9:44 AM, Debajyoti Bera [EMAIL PROTECTED] wrote:
> Beagle-search does not show a flat list of results, and I don't see how to
> implement a sidebar with all those extra links (buttons?) without making it
> look cluttered.

Flatness might not be the worst thing in the world if you have rather
simple one-click filtering of certain types.

In general, I am not sure how much people like the grouping of results
by type.  Certainly the UI for paging them is a little awkward.

Joe


Re: suggest: make searches efficient

2008-01-09 Thread D Bera
> used idea. I am more worried about the user interaction part. OTOH, I am
> planning on showing a cluster in the web interface. One major difference

Will something like this be helpful?
http://cs-people.bu.edu/dbera/blogdata/beagle-refine-search.png

It's a very basic implementation that groups the search results based on
common properties and displays the groups with more than 10% presence. I
checked the test implementation into the beagle-webinterface-branch.
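
The grouping rule described above can be sketched roughly as follows. This is not the code in the branch, only an illustration: it assumes each result is a flat dict of property name to string value, and the function name refinement_groups and the min_share parameter are invented here.

```python
from collections import Counter


def refinement_groups(results, min_share=0.10):
    """Return (property, value, count) for values shared by more than
    min_share of the results, most common first."""
    total = len(results)
    if total == 0:
        return []

    # Count how many results carry each (property, value) pair
    counts = Counter()
    for props in results:
        for name, value in props.items():
            counts[(name, value)] += 1

    # "More than 10% presence" is read as a strict inequality
    groups = [
        (name, value, count)
        for (name, value), count in counts.items()
        if count / total > min_share
    ]
    groups.sort(key=lambda g: (-g[2], g[0], g[1]))
    return groups
```

With 20 results, a value found in 3 of them (15%) is shown as a group, while one found in exactly 2 (10%) is not.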

- dBera

-- 
-
Debajyoti Bera @ http://dtecht.blogspot.com
beagle / KDE fan
Mandriva / Inspiron-1100 user