I have tokenized it, but now I have to untokenize... :-)))))))
Thanks!

Massimiliano

2010/6/4 bsb <[email protected]>

> Purely for academic reasons, I'd like to point out that the
> computationally efficient way to find exact substrings (e.g. "cat" in
> "merecats") is a Suffix Tree: http://en.wikipedia.org/wiki/Suffix_tree
>
> You could probably (with a lot of time and mental strength) create a
> suffix tree system using the GAE data store. Not that you'd want that,
> I'm just saying, you could ;-)
>
>
> Ben
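For anyone curious what that could look like, here is a minimal sketch in plain Python. It is not a real suffix tree: it uses a sorted list of suffixes (a naive suffix array, a simpler relative) searched with `bisect`. The names `build_suffix_index` and `find_substring` are made up for illustration, and nothing here touches the GAE datastore.

```python
import bisect


def build_suffix_index(strings):
    """Return a sorted list of (suffix, string_id) pairs, one per suffix."""
    entries = []
    for string_id, text in enumerate(strings):
        lowered = text.lower()
        for start in range(len(lowered)):
            entries.append((lowered[start:], string_id))
    entries.sort()
    return entries


def find_substring(entries, needle):
    """Return the ids of all strings containing needle (case-insensitive)."""
    needle = needle.lower()
    # Suffixes that start with needle form one contiguous run in sorted
    # order, and that run begins at the first entry >= (needle, -1).
    position = bisect.bisect_left(entries, (needle, -1))
    matches = set()
    while position < len(entries) and entries[position][0].startswith(needle):
        matches.add(entries[position][1])
        position += 1
    return sorted(matches)


titles = ["merecats", "Pizza with tomatoes", "foobar"]
index = build_suffix_index(titles)
print(find_substring(index, "cat"))
```

Storing every suffix costs space quadratic in the length of each string, which is part of why this is more of an academic exercise than something you would want to run against a large dataset.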
>
> On Jun 3, 7:53 pm, Rafael Sierra <[email protected]> wrote:
> > On Thu, Jun 3, 2010 at 2:51 PM, Rafael Sierra <[email protected]> wrote:
> > > On Thu, Jun 3, 2010 at 2:40 PM, Massimiliano
> > > <[email protected]> wrote:
> > >> Dear Ikai,
> > >> I want to search the datastore for "Pizza" and get "Pizza with tomatoes"
> > >> and "Pizza with Mushrooms". I don't care about "abcpizza" or "123pizza".
> >
> > > Those you can obtain with the filter mentioned a few messages ago, but
> > > 'I want pizza man!' needs a full-text of the poor solution I gave in the
> > > last message.
> >
> > ...OR* the poor solution... (not "of")
> >
> >
> >
> >
> >
> > >> Massimiliano
> >
> > >> 2010/6/3 Ikai L (Google) <[email protected]>
> >
> > >>> The reason search engines work is that they do stemming on terms. For
> > >>> instance, when you search for "cats" you may want to match "cat" and
> > >>> "cats" for better results, and it's fairly unlikely you want to return
> > >>> "abcatst" or "youcats123" (sorry, I couldn't think of any words that had
> > >>> "cats" in the middle). Lucene, an open source Java search library, ships
> > >>> with basic analyzers that will do things like this.
> > >>> Otherwise, there aren't easily scalable ways to index on substrings. The
> > >>> closest you can come is to try to intelligently determine which
> > >>> substrings in a string can be searched on, and index those terms.
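To make the "index those terms" idea concrete, here is a rough sketch in plain Python. The plural stripping is only a crude stand-in for real stemming (it is not what Lucene's analyzers do), and `index_terms` and `matches` are hypothetical helper names, not part of any library.

```python
import re


def index_terms(text):
    """Return the set of searchable terms for one string."""
    terms = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        terms.add(word)
        # Crude stand-in for stemming: also index a singular form.
        if len(word) > 3 and word.endswith("s"):
            terms.add(word[:-1])
    return terms


def matches(text, query):
    """True if the text and the query share at least one indexed term."""
    return bool(index_terms(text) & index_terms(query))


print(matches("Two cats on a mat", "cat"))
print(matches("youcats123", "cats"))
```

Because only whole words (and their crude singular forms) are indexed, "cats" buried inside "youcats123" never becomes a term, which is the behaviour described above.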
> > >>> On Thu, Jun 3, 2010 at 9:49 AM, Rafael Sierra <[email protected]> wrote:
> >
> > >>>> Massimiliano, the point is that there's no way to create an index that
> > >>>> can be used to avoid a full scan of the database for that kind of
> > >>>> query. Even if you can do "ILIKE '%anything%'" in PgSQL or "LIKE
> > >>>> '%anotherthing%'" in MySQL, it will result in a full scan of every
> > >>>> record in the database.
> >
> > >>>> But if you have a small set of records in your database, you can do
> > >>>> something like this: http://dpaste.com/hold/202782/. You can do this
> > >>>> with larger sets of records too, but you may have to deal with some
> > >>>> DeadlineExceededErrors in your application.
> >
> > >>>> Note: This solution performs very poorly. You can even do some kind of
> > >>>> pagination (scanning the database only until N matching records are
> > >>>> found and breaking after that), but if the scan has to iterate over
> > >>>> thousands of records before it finds N matches, you will still get
> > >>>> DeadlineExceededErrors.
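The scan-and-break idea can be sketched roughly as below. This is not the code behind the dpaste link; it is a plain-Python illustration in which an in-memory list stands in for datastore entities, and `scan_for_substring`, `limit` and `budget_seconds` are made-up names. The time budget is an added assumption, meant to stop the loop before the request deadline instead of failing on it.

```python
import time


def scan_for_substring(records, needle, limit, budget_seconds):
    """Full scan that stops after `limit` matches or when time runs out.

    Returns (found, scanned) so a caller could resume from `scanned`.
    """
    deadline = time.monotonic() + budget_seconds
    needle = needle.lower()
    found = []
    scanned = 0
    for record in records:
        if len(found) >= limit or time.monotonic() >= deadline:
            break
        scanned += 1
        if needle in record.lower():
            found.append(record)
    return found, scanned


records = ["Pizza with tomatoes", "abcpizza", "Pasta", "I want pizza man!"]
found, scanned = scan_for_substring(records, "pizza", limit=2, budget_seconds=5.0)
print(found, scanned)
```

Every record still has to be read and checked, so this only bounds the work per request; it does not make the query any cheaper.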
> >
> > >>>> On Thu, Jun 3, 2010 at 11:04 AM, Massimiliano
> > >>>> <[email protected]> wrote:
> > >>>> > So there isn't a scalable solution!
> >
> > >>>> > 2010/6/3 Geoffrey Spear <[email protected]>
> >
> > >>>> >> On Jun 3, 4:58 am, Massimiliano <[email protected]> wrote:
> > >>>> >> > I need just something like *myvar*, so I will accept any characters
> > >>>> >> > before and after the var...
> > >>>> >> > Or I have to divide the strings into lists (each word an element of
> > >>>> >> > the list) and use the IN operator.
> > >>>> >> > Thinking
> >
> > >>>> >> Building a keyword index for each entity is fairly trivial.
> >
> > >>>> >> Indexing so that you can find "oob" in "foobar" isn't, and I don't
> > >>>> >> believe there's a scalable solution for this.  An RDBMS will do
> > >>>> >> searches like this for you, but it won't scale well.  I believe it's
> > >>>> >> only possible by doing a table scan, which App Engine won't let you do
> > >>>> >> (short of manually fetching every entity and checking if it contains
> > >>>> >> your substring, which obviously isn't going to be pretty.)
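A keyword index of the kind described here might look like the following sketch. It is plain Python with a dict standing in for the datastore; on App Engine the word list would presumably live in a list property on each entity, but that part is an assumption and is not shown. `build_keyword_index` and `search` are hypothetical names.

```python
import re
from collections import defaultdict


def build_keyword_index(entities):
    """Map each lowercased word to the set of entity ids containing it."""
    index = defaultdict(set)
    for entity_id, text in entities.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(entity_id)
    return index


def search(index, query):
    """Return ids of entities that contain every word of the query."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


entities = {
    1: "Pizza with tomatoes",
    2: "Pizza with Mushrooms",
    3: "abcpizza",
    4: "I want pizza man!",
}
index = build_keyword_index(entities)
print(search(index, "pizza"))
```

This finds whole words only: "pizza" matches entities 1, 2 and 4 but not "abcpizza", and a fragment such as "oob" matches nothing, which is the limitation discussed above.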
> >
> > >>>> >> --
> > >>>> >> You received this message because you are subscribed to the
> Google
> > >>>> >> Groups
> > >>>> >> "Google App Engine" group.
> > >>>> >> To post to this group, send email to
> > >>>> >> [email protected].
> > >>>> >> To unsubscribe from this group, send email to
> > >>>> >> [email protected]<google-appengine%[email protected]>
> .
> > >>>> >> For more options, visit this group at
> > >>>> >>http://groups.google.com/group/google-appengine?hl=en.
> >
> > >>>> > --
> >
> > >>>> > My email: [email protected]
> > >>>> > My Google Wave: [email protected]
> >
> > >>> --
> > >>> Ikai Lan
> > >>> Developer Programs Engineer, Google App Engine
> > >>> Blog: http://googleappengine.blogspot.com
> > >>> Twitter: http://twitter.com/app_engine
> > >>> Reddit: http://www.reddit.com/r/appengine
> >
>


-- 

My email: [email protected]
My Google Wave: [email protected]
