Hi Danny, Thanks for the response!
We are not introducing any spaces in the text, and I have confirmed that no space is present in the Title element. The text is:

<Title>Magnetic anisotropy data of C<Subscript>24</Subscript>H<Subscript>12</Subscript></Title>

It is just the XML representation of the text available in the Title element. However, we are using phrase-through and phrase-around to search for the complete phrase, which contains elements. Could you please provide a solution that uses phrase-through? I am not sure whether Shannon was referring to the same thing when mentioning word-through.

--- Debabrata ----

On Wed, Aug 25, 2010 at 5:18 AM, Danny Sokolsky <[email protected]> wrote:

> Just for clarification here, while Shannon's example makes fn:data of the
> Title element return the string that is desired, search tokenization does
> not.
>
> For search purposes, each text node is tokenized separately. A word
> will never cross a text node boundary. The following demonstrates how this
> is tokenized:
>
> let $x := <Title>Magnetic anisotropy data of C<Subscript>24</Subscript>H<Subscript>12</Subscript></Title>
> for $textnode in $x//text()
> return <tn>{$textnode}</tn>
>
> =>
> <tn>Magnetic anisotropy data of C</tn>
> <tn>24</tn>
> <tn>H</tn>
> <tn>12</tn>
>
> So there is no search term here for C24H12. If you want that to be a
> search term (that is, a term to be used by cts:query), then you will have to
> mark up the document somehow to extract that term. For example, you can
> rewrite this element as follows:
>
> let $x := <Title>Magnetic anisotropy data of C<Subscript>24</Subscript>H<Subscript>12</Subscript></Title>
> return
>   element Title { attribute text {fn:string($x)},
>                   $x/node()}
>
> =>
> <Title text="Magnetic anisotropy data of C24H12">Magnetic anisotropy data of C<Subscript>24</Subscript>H<Subscript>12</Subscript></Title>
>
> Then you could do a cts:element-attribute-word-query on Title/@text to
> search for your terms.
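The attribute-based search described above might look like the following. This is only a minimal sketch: it assumes every Title element has already been rewritten to carry the text attribute shown in the example, and the use of fn:collection() as the search scope is an assumption, not something stated in this thread.

```xquery
xquery version "1.0-ml";

(: Assumption: each stored Title has a text attribute holding
   fn:string() of the element, as in the rewrite shown above. :)
(: Assumption: searching every document via fn:collection();
   narrow this expression to suit the actual database. :)
cts:search(
  fn:collection()//Title,
  cts:element-attribute-word-query(
    xs:QName("Title"),   (: element that carries the attribute :)
    xs:QName("text"),    (: attribute holding the flattened string :)
    "Magnetic anisotropy data of C24H12"))
```

Because the attribute value is a single string rather than four separate text nodes, C24H12 should be tokenized there as one word, so the phrase can match without inserting spaces.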
>
> -Danny
>
> From: [email protected] [mailto:[email protected]] On Behalf Of Debabrata Jena
> Sent: Tuesday, August 24, 2010 11:01 AM
> To: General Mark Logic Developer Discussion
> Cc: [email protected]
> Subject: Re: [MarkLogic Dev General] Phase Through Search problem
>
> Hi Shannon,
>
> Thanks for the answer. The answer may solve my purpose. I think this might
> be the case.
>
> -- Debabrata --
>
> On Tue, Aug 24, 2010 at 11:18 PM, Shannon <[email protected]> wrote:
> The data is being tokenized on whitespace, and you're introducing
> whitespace. Wouldn't the following solve the problem?
>
> <Title>Magnetic anisotropy data of C<Subscript>24</Subscript>H<Subscript>12</Subscript></Title>
>
> Just a guess..
>
> On Aug 24, 2010, at 1:40 PM, Shannon wrote:
>
> > Hi Debabarata,
> >
> > If I'm not mistaken, you want a "Word-Through" which is not currently
> > supported. MarkLogic has filed an RFE (#5849, "Enable per-database
> > word-through specifications", as well as a Word-Around) for consideration
> > in a future release. We have requested that this be implemented in v4.3.
> > The only work-around I know of is to duplicate the data to index the word
> > token in its entirety.
> >
> > On Aug 24, 2010, at 1:07 PM, Debabrata Jena wrote:
> >
> >> Hi,
> >>
> >> This is regarding not being able to search in for a phrase/search term
> >> in an element in which phrase is combination of text and node . Please
> >> find the details below and sample data attached.
> >> Use Case : search for a phrase in which phrase is a combination of text
> >> and node. For ex. search for "Magnetic anisotropy data of C24H12"
> >> Following is the XML representation for the same phrase :
> >> <Title>
> >> Magnetic anisotropy data of C
> >> <Subscript>24</Subscript>
> >> H
> >> <Subscript>12</Subscript>
> >> </Title>
> >> Approach followed: Added Phrase Through element for Subscript element so
> >> that text inside the Subscript element can be search able.
> >>
> >> Current State : we are not able to search for a following text "Magnetic
> >> anisotropy data of C24H12" in Title element by using cts:element-query.
> >> However, we are able to search for the same text if we pass the search
> >> term with spaces as following "Magnetic anisotropy data of C 24 H 12" in
> >> the cts:element-query.
> >> Please advise what else needs to be done so that we can search
> >> successfully for the above scenario.
> >>
> >>
> >> Thanks,
> >> Debabarata
> >> _______________________________________________
> >> General mailing list
> >> [email protected]
> >> http://developer.marklogic.com/mailman/listinfo/general
