On Thursday 25 June 2009 02:05:41 leo stone wrote:
> There are two considerations: how many typos are likely, and how the local
> filtering is done.
>
> If the local result filtering is not relaxed about typos of the sort "Woh",
> then it would make no sense at all to sort the consonants, since
> non-matching results would get filtered out anyway.
>
> If the local filtering can handle those typos, it is still a question of
> COST vs. GAIN, and this decision will be left to your gut.
We can certainly do some kind of local filtering in the end that can handle
typos; I am just not willing to decide for or against sorting based on gut
feeling alone, without any facts.

> One should consider, though, that most of the typos will probably happen
> during search input rather than when inserting a file. And I must say, if
> a program is smart enough to handle my search typos, I am likely to be
> very pleased. You have a much better idea about the impact on the net, so
> I can't really say anything about that.

Well, I wonder if a better alternative for dealing with typos would be to
simply run a spell-checker when the keyword is entered. That would take care
of permutations and eliminate the problem of transmitting results that then
need to be filtered later.

> regards leo
>
> ps: I am wondering if you have an opinion about the matters that I am
> trying to talk about in the forum.

I do; it usually takes me longer to get to the forum, but I've now added
some comments there.

Christian

> On Wed, Jun 24, 2009 at 8:15 PM, Christian Grothoff
> <[email protected]> wrote:
> > I like this idea (at least as an option that should likely be the
> > default) and have added it to the list of things to change for 0.9.x.
> > What I wonder is whether sorting the consonants should be omitted or
> > not. Some statistics on bad collisions with and without sorting would
> > probably be nice to have...
> >
> > Christian
> >
> > On Tuesday 23 June 2009 07:27:17 leo stone wrote:
> > > I believe the biggest factor in how we judge a system for future
> > > usability is how many results we get if we are looking for
> > > "something" like "something".
> > > Imagine a shoe shop with only two pairs of shoes in it, and one with
> > > a few hundred.
> > > The result in the end might be the same: you leave both shops without
> > > finding what you want. But most people will consider the shop with a
> > > hundred pairs more promising and worth spending time in the next time
> > > they try to find some shoes.
> > >
> > > So making sure people are getting results for their searches is
> > > probably one of the more important issues, after my doubts about how
> > > the routing is handled.
> > >
> > > Even though it might mean some significant overhead, I would consider
> > > doing something like normalizing keywords. If it must be, per
> > > language, but in the beginning English should be enough.
> > >
> > > So if I wanted to share the following file, and I would like it
> > > public so people can find it, why not store it like this:
> > >
> > > "Woh_the.fuck_is ALICe(2008).divx.avi.WMV" => { HW, HT, CFK, S, CL,
> > > 2008, DVX, V, MVW }
> > >
> > > Put the file under the hashes of those nine "key words".
> > >
> > > When I now search for "fuck alice" => { CFK, CL },
> > >
> > > the search h(CFK) AND h(CL) will return a lot of wrong, similar
> > > results, but those one can filter locally in a more elaborate way.
> > >
> > > It might even be more selective than a search for h(video/x-msvideo).
> > >
> > > At least it returns results, whereas "Woh_the.fuck_is
> > > ALICe(2008).divx.avi.WMV" as a keyword is something nobody would ever
> > > think to search for; the file would therefore never be found and
> > > never be spread ..., except by chance of course.
> > >
> > > regards leo

_______________________________________________
GNUnet-developers mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/gnunet-developers
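[For illustration, leo's consonant-key normalization could be sketched as
below. This is a sketch only: the token-splitting rule, the handling of
numeric tokens, and the use of SHA-512 as a stand-in for the network-side
keyword hash are assumptions of this example, not part of GNUnet.]

```python
import hashlib
import re

VOWELS = set("aeiou")

def normalize_keywords(filename, sort_consonants=False):
    """Split a filename into tokens and reduce each to its consonant
    skeleton; with sort_consonants=True, "Woh" and "Who" collide on
    purpose, at the cost of more accidental collisions."""
    tokens = re.split(r"[^a-zA-Z0-9]+", filename)
    keys = set()
    for token in tokens:
        if not token:
            continue
        if token.isdigit():
            keys.add(token)  # keep numbers such as years verbatim
            continue
        consonants = [c for c in token.lower()
                      if c.isalpha() and c not in VOWELS]
        if sort_consonants:
            consonants.sort()
        if consonants:
            keys.add("".join(consonants).upper())
    return keys

def keyword_hash(key):
    """Hypothetical stand-in for hashing a normalized keyword before
    publishing or querying it on the network."""
    return hashlib.sha512(key.encode()).hexdigest()
```

With sorting enabled, the example filename yields exactly the nine keys from
the proposal, and the query "fuck alice" yields { CFK, CL }, so both sides
derive the same hashes independently.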

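[The local filtering of the "wrong, similar results" discussed above could
look roughly like this. Again a sketch under assumptions: the tokenization
is crude, and the 0.6 similarity threshold is arbitrary; it only shows how
a client might tolerate typos such as "Woh" for "Who" using the standard
library's difflib.]

```python
import difflib

def filter_results(query_terms, result_names, threshold=0.6):
    """Keep a result only if every query term fuzzily matches at least
    one token of the result's name."""
    kept = []
    for name in result_names:
        # crude tokenization; real filtering would reuse the insert-side rules
        tokens = name.replace("_", " ").replace(".", " ").lower().split()
        ok = all(
            any(difflib.SequenceMatcher(None, term.lower(), tok).ratio()
                >= threshold for tok in tokens)
            for term in query_terms
        )
        if ok:
            kept.append(name)
    return kept
```

Whether this is worth it remains the COST vs. GAIN question from the thread:
the looser the network-side keys, the more such filtering (and wasted
transfer) happens on the client.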