I have tokenized it, but now I have to untokenize... :-)))))))

Thanks!
Massimiliano
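
For what it's worth, one way to avoid untokenizing at all is to keep the original string next to its token list, so the tokens are only ever used for matching. Below is a minimal sketch in plain Python, not tied to the datastore API. `Dish`, `tokenize`, and `search` are made-up names for illustration, and the list-membership test merely stands in for whatever equality filter the real storage layer offers on a list of words.

```python
import re
from dataclasses import dataclass, field
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


@dataclass
class Dish:
    # Hypothetical record: the original title is stored untouched,
    # and the tokens are derived from it only for matching.
    title: str
    tokens: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        self.tokens = tokenize(self.title)


def search(dishes: List[Dish], word: str) -> List[str]:
    """Return the original titles whose token list contains the word."""
    needle = word.lower()
    return [d.title for d in dishes if needle in d.tokens]


dishes = [
    Dish("Pizza with tomatoes"),
    Dish("Pizza with Mushrooms"),
    Dish("I want pizza man!"),
    Dish("abcpizza"),
]
print(search(dishes, "Pizza"))
```

Because the title is never rebuilt from the tokens, punctuation and capitalization survive, and "abcpizza" is not matched since it is a single token of its own.
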

2010/6/4 bsb <[email protected]>

> Purely for academic reasons, I'd like to point out that the
> computationally efficient way to find exact substrings (i.e. "cat" in
> "merecats") is a Suffix Tree: http://en.wikipedia.org/wiki/Suffix_tree
>
> You could probably (with a lot of time and mental strength) create a
> suffix tree system using the GAE data store. Not that you'd want that,
> I'm just saying, you could ;-)
>
> Ben
>
> On Jun 3, 7:53 pm, Rafael Sierra <[email protected]> wrote:
> > On Thu, Jun 3, 2010 at 2:51 PM, Rafael Sierra <[email protected]> wrote:
> > > On Thu, Jun 3, 2010 at 2:40 PM, Massimiliano
> > > <[email protected]> wrote:
> > >> Dear Ikai,
> > >> I want to search the datastore for "Pizza" and obtain "Pizza with tomatoes",
> > >> "Pizza with Mushrooms". I don't care about "abcpizza" or "123pizza".
> >
> > > Those you can obtain with the filter mentioned a few messages ago, but 'I want
> > > pizza man!' needs a full-text of the poor solution I gave in the last
> > > message.
> >
> > ...OR* the poor solution... (not "of")
> >
> > >> Massimiliano
> >
> > >> 2010/6/3 Ikai L (Google) <[email protected]>
> >
> > >>> The reason search engines work is that they do stemming on terms. For
> > >>> instance, when you search for "cats" you may want to match "cat" and "cats"
> > >>> for better results, and it's fairly unlikely you want to return "abcatst" or
> > >>> "youcats123" (sorry, I couldn't think of any words that had "cats" in the
> > >>> middle). Lucene, an open source Java project, ships with basic analyzers
> > >>> that will do things like this.
> > >>> Otherwise, there aren't easily scalable ways to index on substrings. The
> > >>> closest you can come is to try to intelligently determine which substrings
> > >>> in a string can be searched on, and index those terms.
> >
> > >>> On Thu, Jun 3, 2010 at 9:49 AM, Rafael Sierra <[email protected]> wrote:
> >
> > >>>> Massimiliano, the point is that there's no way to create an index that
> > >>>> can be used to avoid a full scan of the database for that kind of
> > >>>> query. So even if you can do "ILIKE '%anything%'" in PgSQL or "LIKE
> > >>>> '%anotherthing%'" in MySQL, it will result in a full scan of every
> > >>>> record in the database.
> >
> > >>>> But if you have a small set of records in your database, you can do
> > >>>> something like this: http://dpaste.com/hold/202782/. You can do this
> > >>>> on larger datasets too, but you may run into DeadlineExceededError
> > >>>> exceptions in your application.
> >
> > >>>> Note: This solution performs very poorly. You can even
> > >>>> do some kind of pagination (scanning the database only up to N
> > >>>> records and breaking after that), but if the scan happens to iterate
> > >>>> over thousands of records before finding N matching records, you will
> > >>>> still hit DeadlineExceededError exceptions.
> >
> > >>>> On Thu, Jun 3, 2010 at 11:04 AM, Massimiliano
> > >>>> <[email protected]> wrote:
> > >>>> > So there isn't a scalable solution!
> >
> > >>>> > 2010/6/3 Geoffrey Spear <[email protected]>
> >
> > >>>> >> On Jun 3, 4:58 am, Massimiliano <[email protected]> wrote:
> > >>>> >> > I need just something like *myvar*, so I will accept any characters
> > >>>> >> > before and after the var...
> > >>>> >> > Or I have to divide the strings into lists (each word an element of
> > >>>> >> > the list) and use the operator IN.
> > >>>> >> > Thinking
> >
> > >>>> >> Building a keyword index for each entity is fairly trivial.
> >
> > >>>> >> Indexing so that you can find "oob" in "foobar" isn't, and I don't
> > >>>> >> believe there's a scalable solution for this. An RDBMS will do
> > >>>> >> searches like this for you, but it won't scale well. I believe it's
> > >>>> >> only possible by doing a table scan, which App Engine won't let you do
> > >>>> >> (short of manually fetching every entity and checking if it contains
> > >>>> >> your substring, which obviously isn't going to be pretty.)
> >
> > >>>> >> --
> > >>>> >> You received this message because you are subscribed to the Google
> > >>>> >> Groups "Google App Engine" group.
> > >>>> >> To post to this group, send email to [email protected].
> > >>>> >> To unsubscribe from this group, send email to [email protected].
> > >>>> >> For more options, visit this group at
> > >>>> >> http://groups.google.com/group/google-appengine?hl=en.
> >
> > >>>> > --
> > >>>> > My email: [email protected]
> > >>>> > My Google Wave: [email protected]
> >
> > >>> --
> > >>> Ikai Lan
> > >>> Developer Programs Engineer, Google App Engine
> > >>> Blog: http://googleappengine.blogspot.com
> > >>> Twitter: http://twitter.com/app_engine
> > >>> Reddit: http://www.reddit.com/r/appengine
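
To make the "index those terms" suggestion quoted above a bit more concrete, here is a rough sketch, again in plain Python, of indexing a chosen set of substrings per word so that a lookup such as "oob" becomes an exact match on an index term instead of a scan. The names `substrings` and `build_index` and the `min_len` cutoff are assumptions made for illustration, not anything from the thread or from App Engine. The number of terms grows roughly quadratically with word length, which is presumably why the substrings have to be picked selectively.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def substrings(word: str, min_len: int = 3) -> Set[str]:
    """All substrings of `word` that are at least `min_len` characters long."""
    n = len(word)
    return {
        word[i:j]
        for i in range(n)
        for j in range(i + min_len, n + 1)
    }


def build_index(texts: Iterable[str], min_len: int = 3) -> Dict[str, Set[str]]:
    """Map each indexed substring to the set of texts that contain it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for text in texts:
        # Index each whitespace-separated word separately, lowercased.
        for word in text.lower().split():
            for term in substrings(word, min_len):
                index[term].add(text)
    return index


index = build_index(["foobar", "Pizza with tomatoes", "abcpizza"])
print(sorted(index["oob"]))    # exact lookup on an index term, no scan
print(sorted(index["pizza"]))
```

Note that this index also returns "abcpizza" for "pizza", so it answers the "oob" in "foobar" question rather than the word-level "Pizza" one; for the latter a plain keyword index is enough.
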
