Attribute order is explicitly not significant in XML: it should not, and 
cannot, be relied on to carry any semantic difference.
Trying to get the attributes into the 'right' order, or 'fixing' whatever 
reorders them so that it orders them differently, is futile.
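
As a small illustration (a sketch only, not part of the test case in this 
thread): the standard fn:deep-equal function compares the attributes of two 
elements without regard to the order in which they were written. The variable 
names $a and $b are made up for this example.

```xquery
xquery version "1.0-ml";

(: Two elements that differ only in the order their attributes are written. :)
let $a := <bar type="abc" xml:lang="de"/>
let $b := <bar xml:lang="de" type="abc"/>
(: fn:deep-equal matches attributes by name and value, not by position,
   so this expression is true. :)
return fn:deep-equal($a, $b)
```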



If there is an attribute-related bug here, it would be that an attribute whose 
handling depends on another attribute's value is being applied before all of 
the element's attributes have been read.

But I can't fully decipher the expected results, since you are using an 
unfiltered query and I don't know the index settings of the database.
Furthermore, your elements have no content, you're not doing stemmed 
searching, you're using unfiltered searches, and you're searching on empty 
elements (that may or may not have significant whitespace). There are many 
variables here, and according to the docs
https://docs.marklogic.com/guide/search-dev/languages#id_91703
there is a constrained range of expected behavior.


xml:lang applies only to 'the text children', so this test case seems to 
fall into a somewhat undefined area:


"All of the text node children and text node descendants of an element with an 
xml:lang attribute are treated as the language specified in the xml:lang 
attribute, unless a child element has an xml:lang attribute with a different 
value. If so, any text node children and text node descendants are treated as 
the new language, and so on until no other xml:lang attributes are encountered."

"Any content within an element having an xml:lang attribute is indexed in that 
language. Additionally, the xml:lang value is inherited by all of the 
descendants of that element, until another xml:lang value is encountered."
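
To take the unfiltered-search variable out of the picture, one option is to 
compare an unfiltered count with a filtered one for the same query. The sketch 
below assumes the element and attribute names from the test case (bar and 
type) and the German constraint; it is a hypothetical diagnostic, not a 
statement of what the results should be.

```xquery
xquery version "1.0-ml";

(: Hypothetical check: match type="abc" on <bar>, treated as German. :)
let $query := cts:element-attribute-word-query(
    xs:QName("bar"), xs:QName("type"), "abc", "lang=de")
return (
  (: Resolved from the indexes only. :)
  fn:count(cts:search(fn:doc(), $query, "unfiltered")),
  (: Each index candidate is re-checked against the document itself. :)
  fn:count(cts:search(fn:doc(), $query, "filtered"))
)
```

If the two counts differ, that would suggest the discrepancy lives in the 
index terms rather than in the documents themselves.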







-----------------------------------------------------------------------------
David Lee
Lead Engineer
MarkLogic Corporation
[email protected]
Phone: +1 812-482-5224
Cell:  +1 812-630-7622
www.marklogic.com

-----Original Message-----
From: [email protected] 
[mailto:[email protected]] On Behalf Of Geert Josten
Sent: Wednesday, June 24, 2015 3:40 AM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Indexing strategy for attributes when 
using xdmp:xlst-invoke

Hi Johan,

I will file a bug. Can you tell which version of MarkLogic you are running 
exactly?

Not uncommonly, XSLT transforms like below reorder attributes. Does it make a 
difference if you try to get the xml:lang attribute first in the XSLT output?

Last but not least, is this related to a customer case? It will push up 
priority if it is.. (you can let me know offline if necessary..)

Cheers,
Geert

On 6/22/15, 5:10 PM, "Johan de Boer" <[email protected]> wrote:

>Hi,
>
>I have discovered that when you use a stylesheet with xdmp:xslt-invoke 
>to transform your document content, in some circumstances attributes are 
>not indexed as you might expect.
>
>- If within an element an attribute x appears before the xml:lang 
>attribute then this attribute x is indexed based on the default 
>language of the database.
>- If within an element an attribute x appears after the xml:lang 
>attribute then this attribute x is indexed based on the language in 
>this previous xml:lang attribute.
>
>Because the default language of the database can differ from the 
>language in the xml:lang attribute, values for attribute x can end up 
>indexed under different languages.
>
>After reindexing the database all these attributes x are indexed 
>according to the xml:lang attribute that appears within the same 
>element.
>
>This appears in both MarkLogic 7 and MarkLogic 8.
>
>Although this problem can easily be avoided, does anyone know if a 
>certain option within the stylesheet should be used to avoid this? Or 
>might this perhaps be a bug?
>
>An example is given below:
>
>xquery version "1.0-ml";
>declare namespace html = "http://www.w3.org/1999/xhtml";
>import module namespace search = "http://marklogic.com/appservices/search"
>    at "/MarkLogic/appservices/search/search.xqy";
>
>declare variable $SEARCH-OPTIONS :=
>    <options xmlns="http://marklogic.com/appservices/search">
>        <search-option>unfiltered</search-option>
>        <return-query>true</return-query>
>        <return-results>true</return-results>
>
>        <constraint name="type-de">
>            <word>
>                <attribute ns="" name="type"/>
>                <element ns="" name="bar"/>
>                <term-option>lang=de</term-option>
>            </word>
>        </constraint>
>        <constraint name="type-en">
>            <word>
>                <attribute ns="" name="type"/>
>                <element ns="" name="bar"/>
>                <term-option>lang=en</term-option>
>            </word>
>        </constraint>
>    </options>;
>
>let $content1 :=
><foo>
>   <bar type="abc" xml:lang="de">
>   </bar>
></foo>
>
>let $content2 :=
><foo>
>   <bar xml:lang="de" type="def">
>   </bar>
></foo>
>
>(: default database language is 'en' :)
>
>(: copy-and-paste.xsl is a stylesheet:
>
><xsl:stylesheet version="2.0"
>xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
>    <xsl:template match="@*|node()">
>        <xsl:copy>
>            <xsl:apply-templates select="@*|node()" />
>        </xsl:copy>
>    </xsl:template>
></xsl:stylesheet>
>:)
>
>(: Run 1: I add two documents :)
>
>(:
>let $_ := xdmp:document-insert("/test/foo1",$content1)
>let $_ := xdmp:document-insert("/test/foo2",$content2)
>return "inserted documents 1 and 2"
>:)
>
>(: Run 2 : I check the number of documents found in each language after 
>run 1 :)
>
>(:
>let $found-de-abc := search:search("type-de:abc", $SEARCH-OPTIONS)/@total
>let $found-en-abc := search:search("type-en:abc", $SEARCH-OPTIONS)/@total
>let $found-de-def := search:search("type-de:def", $SEARCH-OPTIONS)/@total
>let $found-en-def := search:search("type-en:def", $SEARCH-OPTIONS)/@total
>return fn:concat("Language 'de'/'abc' : ", $found-de-abc,
>    " and language 'en'/'abc' : ", $found-en-abc,
>    " and language 'de'/'def' : ", $found-de-def,
>    " and language 'en'/'def' : ", $found-en-def)
>:)
>
>(: Run 2 returns:
>Language 'de'/'abc' : 1 and language 'en'/'abc' : 0 and language 
>'de'/'def' : 1 and language 'en'/'def' : 0
>:)
>
>(: Run 3 : I add two more documents based on the previous documents 
>using xdmp:xslt-invoke and the stylesheet :)
>
>(:
>let $content3 := xdmp:xslt-invoke("/app/xsl/copy-and-paste.xsl",
>fn:doc("/test/foo1"))
>let $content4 := xdmp:xslt-invoke("/app/xsl/copy-and-paste.xsl",
>fn:doc("/test/foo2"))
>let $_ := xdmp:document-insert("/test/foo3",$content3)
>let $_ := xdmp:document-insert("/test/foo4",$content4)
>return "inserted documents 3 and 4"
>:)
>
>(: Run 4 : I check the number of documents found in each language after 
>runs 1 and 3 :)
>
>(:
>let $found-de-abc := search:search("type-de:abc", $SEARCH-OPTIONS)/@total
>let $found-en-abc := search:search("type-en:abc", $SEARCH-OPTIONS)/@total
>let $found-de-def := search:search("type-de:def", $SEARCH-OPTIONS)/@total
>let $found-en-def := search:search("type-en:def", $SEARCH-OPTIONS)/@total
>return fn:concat("Language 'de'/'abc' : ", $found-de-abc,
>    " and language 'en'/'abc' : ", $found-en-abc,
>    " and language 'de'/'def' : ", $found-de-def,
>    " and language 'en'/'def' : ", $found-en-def)
>:)
>
>(: Run 4 returns:
>Language 'de'/'abc' : 1 and language 'en'/'abc' : 1 and language 
>'de'/'def' : 2 and language 'en'/'def' : 0
>:)
>
>(: Then I reindex the database :)
>
>(: Run 5 : I check the number of documents found in each language after 
>reindex :)
>
>let $found-de-abc := search:search("type-de:abc", $SEARCH-OPTIONS)/@total
>let $found-en-abc := search:search("type-en:abc", $SEARCH-OPTIONS)/@total
>let $found-de-def := search:search("type-de:def", $SEARCH-OPTIONS)/@total
>let $found-en-def := search:search("type-en:def", $SEARCH-OPTIONS)/@total
>return fn:concat("Language 'de'/'abc' : ", $found-de-abc,
>    " and language 'en'/'abc' : ", $found-en-abc,
>    " and language 'de'/'def' : ", $found-de-def,
>    " and language 'en'/'def' : ", $found-en-def)
>
>(: Run 5 returns:
>Language 'de'/'abc' : 2 and language 'en'/'abc' : 0 and language 
>'de'/'def' : 2 and language 'en'/'def' : 0
>:)
>
>
>Thanks,
>
>Johan de Boer
>_______________________________________________
>General mailing list
>[email protected]
>Manage your subscription at:
>http://developer.marklogic.com/mailman/listinfo/general

_______________________________________________
General mailing list
[email protected]
Manage your subscription at: 
http://developer.marklogic.com/mailman/listinfo/general