Attribute order is explicitly not significant in XML: it should not, and cannot, be relied on to carry any semantic difference. Going down the path of either trying to get the attributes in the 'right' order, or 'fixing' things that reorder attributes so that they order them differently, is futile.
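As a small illustration of that point (a minimal sketch, not part of the test case in this thread; the variable names $a and $b are just for the example), two elements that differ only in the order their attributes are written compare as equal in XQuery, and attributes are looked up by name rather than by position:

```xquery
(: Two elements that differ only in the order their attributes are written. :)
let $a := <bar type="abc" xml:lang="de"/>
let $b := <bar xml:lang="de" type="abc"/>
return (
  (: fn:deep-equal compares attributes without regard to their order :)
  fn:deep-equal($a, $b),
  (: attributes are selected by name, so written order does not matter :)
  fn:string($a/@xml:lang) eq fn:string($b/@xml:lang)
)
```

Both expressions should evaluate to true. This only shows that the data model treats the two forms as the same element; it says nothing about how the indexer reads them.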
If there is an attribute-related bug here, it would be that an attribute which depends on another attribute's value is applied before all the attributes of an element have been read. But I can't fully decipher the expected results, since you are using an unfiltered query and I don't know the index settings of the database. Furthermore, your elements have no content, you're not doing stemmed searching, you're using unfiltered searches, and you're searching on empty elements (that may or may not have significant whitespace). There are many variables here, and according to the docs

https://docs.marklogic.com/guide/search-dev/languages#id_91703

there is a constrained range of expected behavior: xml:lang applies only to 'the text children'. So this test case seems to fall into a somewhat undefined area:

"All of the text node children and text node descendants of an element with an xml:lang attribute are treated as the language specified in the xml:lang attribute, unless a child element has an xml:lang attribute with a different value. If so, any text node children and text node descendants are treated as the new language, and so on until no other xml:lang attributes are encountered."

"Any content within an element having an xml:lang attribute is indexed in that language. Additionally, the xml:lang value is inherited by all of the descendants of that element, until another xml:lang value is encountered."

-----------------------------------------------------------------------------
David Lee
Lead Engineer
MarkLogic Corporation
[email protected]
Phone: +1 812-482-5224
Cell: +1 812-630-7622
www.marklogic.com

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Geert Josten
Sent: Wednesday, June 24, 2015 3:40 AM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Indexing strategy for attributes when using xdmp:xlst-invoke

Hi Johan,

I will file a bug. Can you tell which version of MarkLogic you are running exactly?
It is not uncommon for XSLT transforms like the one below to reorder attributes. Does it make a difference if you try to get the xml:lang attribute first in the XSLT output?

Last but not least, is this related to a customer case? It will push up the priority if it is (you can let me know offline if necessary).

Cheers,
Geert

On 6/22/15, 5:10 PM, "Johan de Boer" <[email protected]> wrote:

>Hi,
>
>I have discovered that when you use a stylesheet with xdmp:xslt-invoke
>to transform your document content, in some circumstances attributes are
>not indexed as you might expect.
>
>- If within an element an attribute x appears before the xml:lang
>  attribute, then this attribute x is indexed based on the default
>  language of the database.
>- If within an element an attribute x appears after the xml:lang
>  attribute, then this attribute x is indexed based on the language in
>  this previous xml:lang attribute.
>
>Because the default language of the database can differ from the
>language in the xml:lang attribute, values for attribute x can be found
>within different languages.
>
>After reindexing the database, all these attributes x are indexed
>according to the xml:lang attribute that appears within the same
>element.
>
>This appears in both MarkLogic 7 and MarkLogic 8.
>
>Although this problem can easily be avoided, does anyone know if a
>certain option within the stylesheet should be used to avoid this? Or
>might this perhaps be a bug?
>
>An example is given below:
>
>xquery version "1.0-ml";
>declare namespace html = "http://www.w3.org/1999/xhtml";
>import module namespace search = "http://marklogic.com/appservices/search"
>  at "/MarkLogic/appservices/search/search.xqy";
>
>declare variable $SEARCH-OPTIONS :=
>  <options xmlns="http://marklogic.com/appservices/search">
>    <search-option>unfiltered</search-option>
>    <return-query>true</return-query>
>    <return-results>true</return-results>
>
>    <constraint name="type-de">
>      <word>
>        <attribute ns="" name="type"/>
>        <element ns="" name="bar"/>
>        <term-option>lang=de</term-option>
>      </word>
>    </constraint>
>    <constraint name="type-en">
>      <word>
>        <attribute ns="" name="type"/>
>        <element ns="" name="bar"/>
>        <term-option>lang=en</term-option>
>      </word>
>    </constraint>
>  </options>;
>
>let $content1 :=
><foo>
>  <bar type="abc" xml:lang="de">
>  </bar>
></foo>
>
>let $content2 :=
><foo>
>  <bar xml:lang="de" type="def">
>  </bar>
></foo>
>
>(: default database language is 'en' :)
>
>(: copy-and-paste.xsl is a stylesheet:
>
><xsl:stylesheet version="2.0"
>    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
>  <xsl:template match="@*|node()">
>    <xsl:copy>
>      <xsl:apply-templates select="@*|node()" />
>    </xsl:copy>
>  </xsl:template>
></xsl:stylesheet>
>:)
>
>(: Run 1: I add two documents :)
>
>(:
>let $_ := xdmp:document-insert("/test/foo1", $content1)
>let $_ := xdmp:document-insert("/test/foo2", $content2)
>return "inserted documents 1 and 2"
>:)
>
>(: Run 2: I check the number of documents found in each language after
>run 1 :)
>
>(:
>let $found-de-abc := search:search("type-de:abc", $SEARCH-OPTIONS)/@total
>let $found-en-abc := search:search("type-en:abc", $SEARCH-OPTIONS)/@total
>let $found-de-def := search:search("type-de:def", $SEARCH-OPTIONS)/@total
>let $found-en-def := search:search("type-en:def", $SEARCH-OPTIONS)/@total
>return fn:concat("Language 'de'/'abc' : ", $found-de-abc,
>  " and language 'en'/'abc' : ", $found-en-abc,
>  " and language 'de'/'def' : ", $found-de-def,
>  " and language 'en'/'def' : ", $found-en-def)
>:)
>
>(: Run 2 returns:
>Language 'de'/'abc' : 1 and language 'en'/'abc' : 0 and language
>'de'/'def' : 1 and language 'en'/'def' : 0
>:)
>
>(: Run 3: I add two more documents based on the previous documents
>using xdmp:xslt-invoke and the stylesheet :)
>
>(:
>let $content3 := xdmp:xslt-invoke("/app/xsl/copy-and-paste.xsl",
>  fn:doc("/test/foo1"))
>let $content4 := xdmp:xslt-invoke("/app/xsl/copy-and-paste.xsl",
>  fn:doc("/test/foo2"))
>let $_ := xdmp:document-insert("/test/foo3", $content3)
>let $_ := xdmp:document-insert("/test/foo4", $content4)
>return "inserted documents 3 and 4"
>:)
>
>(: Run 4: I check the number of documents found in each language after
>runs 1 and 3 :)
>
>(:
>let $found-de-abc := search:search("type-de:abc", $SEARCH-OPTIONS)/@total
>let $found-en-abc := search:search("type-en:abc", $SEARCH-OPTIONS)/@total
>let $found-de-def := search:search("type-de:def", $SEARCH-OPTIONS)/@total
>let $found-en-def := search:search("type-en:def", $SEARCH-OPTIONS)/@total
>return fn:concat("Language 'de'/'abc' : ", $found-de-abc,
>  " and language 'en'/'abc' : ", $found-en-abc,
>  " and language 'de'/'def' : ", $found-de-def,
>  " and language 'en'/'def' : ", $found-en-def)
>:)
>
>(: Run 4 returns:
>Language 'de'/'abc' : 1 and language 'en'/'abc' : 1 and language
>'de'/'def' : 2 and language 'en'/'def' : 0
>:)
>
>(: Then I reindex the database :)
>
>(: Run 5: I check the number of documents found in each language after
>reindex :)
>
>let $found-de-abc := search:search("type-de:abc", $SEARCH-OPTIONS)/@total
>let $found-en-abc := search:search("type-en:abc", $SEARCH-OPTIONS)/@total
>let $found-de-def := search:search("type-de:def", $SEARCH-OPTIONS)/@total
>let $found-en-def := search:search("type-en:def", $SEARCH-OPTIONS)/@total
>return fn:concat("Language 'de'/'abc' : ", $found-de-abc,
>  " and language 'en'/'abc' : ", $found-en-abc,
>  " and language 'de'/'def' : ", $found-de-def,
>  " and language 'en'/'def' : ", $found-en-def)
>
>(: Run 5 returns:
>Language 'de'/'abc' : 2 and language 'en'/'abc' : 0 and language
>'de'/'def' : 2 and language 'en'/'def' : 0
>:)
>
>
>Thanks,
>
>Johan de Boer
>_______________________________________________
>General mailing list
>[email protected]
>Manage your subscription at:
>http://developer.marklogic.com/mailman/listinfo/general
