1.3 sec running over 150k words, not bad at all!
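
For reference, a sketch of the query with Danny's fix from the quoted thread applied (an empty start value as the second argument). It assumes the element QName and Dutch collation quoted below, and an element word lexicon configured on that element with that collation. Counting the lexicon words directly, rather than calling cts:words(), is a substitution made for this sketch, since cts:words() needs a database-wide word lexicon:

```xquery
xquery version "1.0-ml";

(: Assumption: an element word lexicon is configured on this element
   with this collation. :)
let $qname := fn:QName("http://grtjn.nl/twitter/utils", "text")
let $options := "collation=http://marklogic.com/collation/nl/S1/AS/T00BB"
(: The second argument is the start value; () means no start value. :)
let $words := cts:element-words($qname, (), $options)
let $map := map:map()
let $all :=
  for $x in $words
  (: cts:stem may return several stems for one word, so register each. :)
  for $stem in cts:stem($x)
  return map:put($map, $stem, $x)
return (
  fn:concat(fn:count(map:keys($map)), " unique stems in the lexicon"),
  fn:concat(fn:count($words), " unique words in the lexicon"),
  map:keys($map)
)
```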

> -----Original message-----
> From: Geert Josten [mailto:[email protected]]
> Sent: Sunday, May 13, 2012 10:32
> To: MarkLogic Developer Discussion
> Subject: RE: [MarkLogic Dev General] Bug in cts:element-words? (was: Term
> with same stem)
>
> Duh.. It just had to be something that obvious..
>
> Thnx Danny!
>
> > -----Original message-----
> > From: [email protected] [mailto:general-
> > [email protected]] On behalf of Danny Sokolsky
> > Sent: Sunday, May 13, 2012 0:42
> > To: MarkLogic Developer Discussion
> > Subject: Re: [MarkLogic Dev General] Bug in cts:element-words? (was: Term
> > with same stem)
> >
> > I hadn't had enough coffee yet when I made my last comment. The example in
> > the doc is correct; it just puts a start value in. Geert, your example would
> > use the "collation=..." string as the start value, and would pick up whatever
> > is the default collation in your environment (and you probably do not have an
> > element word lexicon on the default collation, so it probably throws an
> > exception).
> >
> > -Danny
> > ________________________________________
> > From: [email protected] [general-
> > [email protected]] On Behalf Of Danny Sokolsky
> > [[email protected]]
> > Sent: Saturday, May 12, 2012 10:38 AM
> > To: MarkLogic Developer Discussion
> > Subject: Re: [MarkLogic Dev General] Bug in cts:element-words? (was: Term
> > with same stem)
> >
> > I think your call to element-words is missing the second parameter; $options
> > is the 3rd parameter. So I think it should be:
> >
> > cts:element-words(fn:QName("http://grtjn.nl/twitter/utils", "text"), (),
> >   "collation=http://marklogic.com/collation/nl/S1/AS/T00BB")
> >
> > It looks like the example in the doc is missing that second arg too--I'll
> > see if I can get that fixed ;)
> >
> > -Danny
> >
> > ________________________________________
> > From: [email protected] [general-
> > [email protected]] On Behalf Of Geert Josten
> > [[email protected]]
> > Sent: Saturday, May 12, 2012 8:52 AM
> > To: MarkLogic Developer Discussion
> > Subject: [MarkLogic Dev General] Bug in cts:element-words? (was: Term with
> > same stem)
> >
> > Curious how well Danny's idea would perform, I thought I would apply it to
> > one of my test databases with a fair number of tweets (roughly 400K last time
> > I checked). I had to rewrite cts:words to cts:element-words since I have no
> > word lexicon. But it breaks for me. Did I hit a bug?
> >
> > let $map := map:map()
> > let $all :=
> >   for $x in cts:element-words(fn:QName("http://grtjn.nl/twitter/utils", "text"),
> >     "collation=http://marklogic.com/collation/nl/S1/AS/T00BB")
> >   return map:put($map, cts:stem($x), $x)
> > return (
> > fn:concat(xs:string(fn:count(map:keys($map))), " unique stems in the database"),
> > fn:concat(fn:count(cts:words()), " unique words in the database
> > "),
> > map:keys($map) )
> >
> > Note that I specify a specific collation, but that seems to get ignored. Can
> > anyone confirm this behavior?
> >
> > Kind regards,
> > Geert
> >
> > Van: [email protected]<mailto:general-
> > [email protected]> [mailto:general-
> > [email protected]<mailto:general-
> > [email protected]>] On behalf of Danny Sokolsky
> > Sent: Saturday, May 12, 2012 0:13
> > To: MarkLogic Developer Discussion
> > Subject: Re: [MarkLogic Dev General] Term with same stem
> >
> > If you have a word lexicon you can do something like this to get information
> > about your words and stems:
> >
> > let $map := map:map()
> > let $all :=
> >   for $x in cts:words()
> >   return map:put($map, cts:stem($x), $x)
> > return (
> > fn:concat(xs:string(fn:count(map:keys($map))), " unique stems in the database"),
> > fn:concat(fn:count(cts:words()), " unique words in the database
> > "),
> > map:keys($map) )
> >
> > -Danny
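> >
> > A possible variation on the snippet above, as a sketch: map:put replaces
> > the existing value for a key, so the map above ends up holding only one
> > word per stem. To list every word that shares a stem, one could append to
> > the stored value instead. This assumes a word lexicon, as above, and loops
> > over the result of cts:stem in case it returns more than one stem:
> >
> > ```xquery
> > xquery version "1.0-ml";
> >
> > let $map := map:map()
> > let $all :=
> >   for $x in cts:words()
> >   for $stem in cts:stem($x)
> >   (: Append $x to whatever is already stored under this stem. :)
> >   return map:put($map, $stem, (map:get($map, $stem), $x))
> > (: Print each stem followed by all lexicon words that share it. :)
> > for $stem in map:keys($map)
> > order by $stem
> > return fn:concat($stem, ": ", fn:string-join(map:get($map, $stem), ", "))
> > ```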
> >
> > From: [email protected]<mailto:general-
> > [email protected]> [mailto:general-
> > [email protected]]<mailto:[mailto:general-
> > [email protected]]> On Behalf Of Michael Blakeley
> > Sent: Friday, May 11, 2012 2:02 PM
> > To: MarkLogic Developer Discussion
> > Cc: MarkLogic Developer Discussion
> > Subject: Re: [MarkLogic Dev General] Term with same stem
> >
> > If stemming=advanced I think cts:stem will do that. With basic the best you
> > can do is to pass terms to cts:stem and see if they have the same stem.
> > -- Mike
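> >
> > A minimal sketch of that check, assuming two candidate terms are already
> > known. The helper name local:same-stem and the sample terms are
> > hypothetical, chosen only for illustration:
> >
> > ```xquery
> > xquery version "1.0-ml";
> >
> > (: Hypothetical helper: true if the two terms have at least one stem
> >    in common. :)
> > declare function local:same-stem($a as xs:string, $b as xs:string)
> >   as xs:boolean
> > {
> >   (: A general comparison (=) is true if any pair of items matches. :)
> >   cts:stem($a) = cts:stem($b)
> > };
> >
> > local:same-stem("running", "runs")
> > ```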
> >
> > On May 11, 2012, at 13:39, Abhishek53 S
> > <[email protected]<mailto:[email protected]>> wrote:
> > Hi Folks,
> >
> > Is it possible to get all the terms that have the same stem from a MarkLogic
> > database? I want to get all terms that belong to the same stem.
> >
> > Thanks & Regards
> > Abhishek Srivastav
> > Systems Engineer
> > Tata Consultancy Services
> > Cell:- +91-9883389968
> > Mailto: [email protected]<mailto:[email protected]>
> > Website: http://www.tcs.com<http://www.tcs.com/>
> > _______________________________________________
> > General mailing list
> > [email protected]<mailto:[email protected]>
> > http://developer.marklogic.com/mailman/listinfo/general