Re: suggest: make searches efficient
On 09/01/2008, Debajyoti Bera [EMAIL PROTECTED] wrote:
>> I.e. cluster/facet extraction on the result set? It is far from trivial to do in an efficient and scalable way, but it can be done... The website I linked to has 10M items in the index.
>
> It's easy to do such fancy data-mining tricks for a webserver. On a desktop, such fancy things might cause annoying CPU spikes. But still, this needs to be implemented to see how far it can work.

The example I sent you uses an external structure we call the facet map to look up clusters on the fly[1]. It is an addition to the Lucene index that can be updated with reasonably little overhead. Which still might be too much for a desktop box, of course.

> Aside from the question of scalability, I am wondering how to display this in the beagle-search GUI without cluttering the interface.

Well, I suggest showing results like in that link :-) Keep a flat list of all results that can then be refined by clicking on the clusters.

Cheers,
Mikkel

[1]: I can provide a pointer to the code if anybody wants to look at it. It is Java though.

___
Dashboard-hackers mailing list
Dashboard-hackers@gnome.org
http://mail.gnome.org/mailman/listinfo/dashboard-hackers
Re: suggest: make searches efficient
>> Aside from the question of scalability, I am wondering how to display this in the beagle-search GUI without cluttering the interface.
>
> Well, I suggest showing results like in that link :-) Keep a flat list of all results that can then be refined by clicking on the clusters.

Beagle-search does not show a flat list of results, and I don't see how to implement a sidebar with all those extra links (buttons?) without making it look cluttered.

Clustering/faceting should not be hard to implement; it's a widely known and used idea. I am more worried about the user interaction part. OTOH, I am planning on showing a cluster in the web interface. One major difference there: users are used to seeing lists of clickable text (with scrollbars) in a browser.

Thanks for your suggestion though,
- dBera

--
Debajyoti Bera @ http://dtecht.blogspot.com
beagle / KDE fan
Mandriva / Inspiron-1100 user
Two patches for Xesam adaptor
Hello

Two patches for the xesam adaptor:

title: fixes title, as title is apparently dc:title in beagle and not title
snippet: implements snippet support

--
Anders Rune Jensen
http://people.iola.dk/anders/

--- /home/arj/source/beagle-xesam/src/Ontologies.cs 2007-12-18 15:30:05.0 +0100
+++ src/Ontologies.cs 2008-01-09 16:53:00.0 +0100
@@ -48,8 +48,8 @@
 		{
 			fields_mapping = new Dictionary<string, string> ();
-			fields_mapping.Add ("dc:title", "title");
-			fields_mapping.Add ("xesam:title", "title");
+			fields_mapping.Add ("dc:title", "dc:title");
+			fields_mapping.Add ("xesam:title", "dc:title");
 			fields_mapping.Add ("dc:author", "author");
 			fields_mapping.Add ("xesam:author", "author");

--- /home/arj/source/beagle-xesam/src/Ontologies.cs 2007-12-18 15:30:05.0 +0100
+++ src/Ontologies.cs 2008-01-09 16:53:00.0 +0100
@@ -73,6 +73,8 @@
 			fields_mapping.Add ("xesam:fileExtension", "beagle:FilenameExtension");
 			fields_mapping.Add ("fileExtension", "beagle:FilenameExtension");
+
+			fields_mapping.Add ("snippet", "snippet");
 		}
 
 		private static void InitializeSourcesMapping ()
--- /home/arj/source/beagle-xesam/src/Search.cs 2007-12-21 12:10:34.0 +0100
+++ src/Search.cs 2008-01-09 17:08:57.0 +0100
@@ -53,7 +53,7 @@
 			get { return bHit; }
 		}
 
-		public Hit (uint id, Beagle.Hit hit, string[] fields)
+		public Hit (uint id, Beagle.Hit hit, string[] fields, Query query)
 		{
 			this.id = id;
 			bHit = hit;
@@ -77,7 +77,13 @@
 				case "date":
 					hitValue [i++] = hit.Timestamp.ToString ("s");
 					break;
-
+
+				case "snippet":
+					SnippetRequest sreq = new SnippetRequest (query, hit);
+					SnippetResponse sresp = (SnippetResponse) sreq.Send ();
+					hitValue [i++] = sresp.Snippet != null ? sresp.Snippet : String.Empty;
+					break;
+
 				default:
 					//FIXME: This *will* break since we don't know what the expected
 					//type here is
@@ -211,7 +217,7 @@
 			mutex.WaitOne ();
 
 			foreach (uint id in ids) {
-				Hit hit = new Hit (id, hits [id].BeagleHit, fields);
+				Hit hit = new Hit (id, hits [id].BeagleHit, fields, query);
 				ret.Add (hit.Value);
 			}
 
@@ -227,8 +233,8 @@
 			// cache the hits and keep them nice and safe
 			Console.Error.WriteLine ("{0}: Got some hits: {1}", id, response.Hits.Count);
 			foreach (Beagle.Hit bHit in response.Hits) {
-//				Console.Error.WriteLine ("+Hit: {0}", bHit.Uri);
-				newHits.Add (hitCount++, new Xesam.Hit (hitCount, bHit, parentSession.HitFields));
+				Console.Error.WriteLine ("+Hit: {0}", bHit.Uri);
+				newHits.Add (hitCount++, new Xesam.Hit (hitCount, bHit, parentSession.HitFields, query));
 			}
 
 			if (newHits.Count > 0 && HitsAddedHandler != null) {
Re: suggest: make searches efficient
Hi,

On Jan 9, 2008 9:44 AM, Debajyoti Bera [EMAIL PROTECTED] wrote:
> Beagle-search does not show a flat list of results, and I don't see how to implement a sidebar with all those extra links (buttons?) without making it look cluttered.

Flatness might not be the worst thing in the world if you have rather simple one-click filtering of certain types. In general, I am not sure how much people like the grouping of results by type. Certainly the UI for paging them is a little awkward.

Joe
Re: suggest: make searches efficient
> used idea. I am more worried about the user interaction part. OTOH, I am planning on showing a cluster in the webinterface. One major difference

Will something like this be helpful?
http://cs-people.bu.edu/dbera/blogdata/beagle-refine-search.png

It's a very basic implementation that groups the search results based on common properties and displays the groups with more than 10% presence. I checked the test implementation in the beagle-webinterface-branch.

- dBera

--
Debajyoti Bera @ http://dtecht.blogspot.com
beagle / KDE fan
Mandriva / Inspiron-1100 user
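
For readers following the thread, the grouping described above (count the property values that results share, then show only groups present in more than 10% of the results) can be sketched roughly as below. This is an illustrative Python sketch, not the code in the beagle-webinterface-branch: the function name `refine_groups`, the `min_share` parameter, and the representation of a hit as a plain dict of property name to value are all assumptions made for the example.

```python
from collections import Counter


def refine_groups(hits, min_share=0.10):
    """Return (property, value, count) triples shared by more than
    min_share of the hits, most frequent first.

    `hits` is a hypothetical representation: one dict per search result,
    mapping a property name to a single string value.
    """
    if not hits:
        return []

    # Count how many hits carry each (property, value) pair.
    counts = Counter()
    for hit in hits:
        counts.update(hit.items())

    # Keep only pairs present in strictly more than min_share of all hits.
    threshold = min_share * len(hits)
    groups = [(prop, value, n)
              for (prop, value), n in counts.items()
              if n > threshold]

    # Most frequent first; ties broken by property name, then value.
    groups.sort(key=lambda g: (-g[2], g[0], g[1]))
    return groups


if __name__ == "__main__":
    # Hypothetical result set of 10 hits.
    sample = ([{"type": "File", "mimetype": "text/plain"}] * 6
              + [{"type": "MailMessage"}] * 3
              + [{"type": "IMLog"}])
    for prop, value, n in refine_groups(sample):
        print(f"{prop}={value}: {n} hits")
```

Each returned group could then back one clickable refinement link next to the flat result list. The single pass over the hits keeps the per-query cost linear in the number of results and properties, which matters for the desktop CPU concern raised earlier in the thread.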