Re: [MarkLogic Dev General] Wrong Behavior of Term-Query vs. Value-Query

2018-03-19 Thread Will Thompson
Hi Hubertus,

I would not expect that behavior, so probably the best thing to do is contact 
ML support if you have not already done so. My understanding is that element 
value queries use the same index data structures under the hood as word/term 
queries, so I would not expect fewer results for a word query when compared to 
a (nearly) equivalent element value query.
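
One way to narrow this down might be to bypass the Search API and run the 
underlying cts queries directly with an explicit language option. This is 
only a sketch: it assumes the text lives in an element named "element" in no 
namespace (guessed from the /root/element paths in your results), and that 
your term-query and value-query correspond to cts:word-query and 
cts:element-value-query, which I have not verified.

```xquery
xquery version "1.0-ml";

(: Sketch only. Assumptions: the element is named "element" and is in no
   namespace; the phrase and the "lang=de" option come from the test case. :)
let $phrase := "long-term hydrostatic strength"
let $lang := "lang=de"

(: URIs matched by a plain word query evaluated as German :)
let $word-uris :=
  for $doc in cts:search(fn:doc(), cts:word-query($phrase, $lang))
  return xdmp:node-uri($doc)

(: URIs matched by an element value query evaluated as German :)
let $value-uris :=
  for $doc in cts:search(fn:doc(),
    cts:element-value-query(xs:QName("element"), $phrase, $lang))
  return xdmp:node-uri($doc)

return (
  fn:concat("word query: ", fn:string-join($word-uris, ", ")),
  fn:concat("element value query: ", fn:string-join($value-uris, ", "))
)
```

If both queries return the same URIs here, the difference is more likely in 
how the Search API applies the options than in the indexes themselves.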

-Will


> On Feb 13, 2018, at 5:48 AM, Willuhn, Hubertus wrote:
> 
> Hello Community,
>  
> I have a question regarding a strange behavior difference between term 
> queries and value queries.
>  
> During some tests for a new service, my team and I were trying to query 
> documents based on a value query and the German language (xml:lang="de").
> However, the query result was empty, although we know there are documents 
> that should match.
>  
> So I tested a bit and found that if you send a raw combined query (via the 
> Java API) to the MarkLogic REST service, the server simply ignores any 
> language set via the query options.
>  
> To make things clear for you, I created a simple test case that shows this 
> behavior. In a fresh new database with default settings I created 4 test 
> documents via the following XQuery:
>  
> xdmp:document-insert("/demo1.xml",  xml:lang="de">Zeitstand-Innendruckfestigkeit);
> xdmp:document-insert("/demo2.xml", long-term 
> hydrostatic strength);
> xdmp:document-insert("/demo3.xml", 
> Zeitstand-Innendruckfestigkeit);
> xdmp:document-insert("/demo4.xml", long-term hydrostatic 
> strength);
>  
> As you can see, the documents are tagged differently by language or have no 
> tagging at all (which should result in English by default).
>  
> Now I am searching for the document with the following command:
>  
> search:resolve(http://marklogic.com/appservices/search; 
> xmlns:search="http://marklogic.com/appservices/search;>
>
> 
> 
>  long-term hydrostatic strength
>  1.0
> 
>   
>   ,
> http://marklogic.com/appservices/search;>
> 300
> 
> lang=de
> 
> )
>  
> If you run the search for the English phrase "long-term hydrostatic 
> strength" with "lang=en" in the query options, all is fine: it finds 
> 2 documents (for both the value-query and the term-query).
>  
>  page-length="300" xmlns:search="http://marklogic.com/appservices/search;>
>path="fn:doc(/demo2.xml)" score="14336" confidence="0.5296452" 
> fitness="0.7490314">
> 
>path="fn:doc(/demo2.xml)/root/element">long-term
>  hydrostatic strength
> 
>   
>path="fn:doc(/demo4.xml)" score="14336" confidence="0.5296452" 
> fitness="0.7490314">
> 
>path="fn:doc(/demo4.xml)/root/element">long-term
>  hydrostatic strength
> 
>   
>   
> PT0.001277S
> 
> PT0.001016S
> PT0.003034S
>   
> 
>  
> The same applies if you run this with the German language and the German 
> equivalent term "Zeitstand-Innendruckfestigkeit". The third scenario also 
> works: searching for the German term with no language constraint finds one 
> document (with both the term-query and the value-query).
>  
> However, if you run the above query for the English term with the language 
> constraint "lang=de" in the query options, the term-query finds no 
> documents, but the value-query finds 2 documents!
> In other words, it seems that the value-query simply ignores the language 
> constraint set by the query options.
>  
> My question after this long story is: is this the intended behavior, or is 
> it something I should report to MarkLogic support as a bug?
> 
> Kind regards,
> 
> i.A. Hubertus Willuhn
> Computer Scientist
> Data Service | XML Technology
> 
> T +49 30 2601-2032 | F +49 30 2601-42032

[MarkLogic Dev General] Wrong Behavior of Term-Query vs. Value-Query

2018-02-13 Thread Willuhn, Hubertus
Hello Community,

I have a question regarding a strange behavior difference between term queries 
and value queries.

During some tests for a new service, my team and I were trying to query 
documents based on a value query and the German language (xml:lang="de").
However, the query result was empty, although we know there are documents 
that should match.

So I tested a bit and found that if you send a raw combined query (via the 
Java API) to the MarkLogic REST service, the server simply ignores any 
language set via the query options.

To make things clear for you, I created a simple test case that shows this 
behavior. In a fresh new database with default settings I created 4 test 
documents via the following XQuery:

xdmp:document-insert("/demo1.xml", Zeitstand-Innendruckfestigkeit);
xdmp:document-insert("/demo2.xml", long-term 
hydrostatic strength);
xdmp:document-insert("/demo3.xml", 
Zeitstand-Innendruckfestigkeit);
xdmp:document-insert("/demo4.xml", long-term hydrostatic 
strength);

As you can see, the documents are tagged differently by language or have no 
tagging at all (which should result in English by default).

Now I am searching for the document with the following command:

search:resolve(http://marklogic.com/appservices/search; 
xmlns:search="http://marklogic.com/appservices/search;>
   


 long-term hydrostatic strength
 1.0

  
  ,
http://marklogic.com/appservices/search;>
300

lang=de

)

If you run the search for the English phrase "long-term hydrostatic 
strength" with "lang=en" in the query options, all is fine: it finds 
2 documents (for both the value-query and the term-query).

http://marklogic.com/appservices/search;>
  

  long-term 
hydrostatic strength

  
  

  long-term 
hydrostatic strength

  
  
PT0.001277S
PT0.001016S
PT0.003034S
  


The same applies if you run this with the German language and the German 
equivalent term "Zeitstand-Innendruckfestigkeit". The third scenario also 
works: searching for the German term with no language constraint finds one 
document (with both the term-query and the value-query).

However, if you run the above query for the English term with the language 
constraint "lang=de" in the query options, the term-query finds no 
documents, but the value-query finds 2 documents!
In other words, it seems that the value-query simply ignores the language 
constraint set by the query options.

My question after this long story is: is this the intended behavior, or is 
it something I should report to MarkLogic support as a bug?

Kind regards,

i.A. Hubertus Willuhn
Computer Scientist
Data Service | XML Technology
T +49 30 2601-2032 | F +49 30 2601-42032



DIN Software GmbH, Am DIN-Platz, Burggrafenstraße 6, 10787 Berlin; 
http://www.dinsoftware.de; Register court: AG Berlin-Charlottenburg, HRB 
28484; Managing Director: Dr.-Ing. Mario Schacht

The contents of this e-mail (including attachments) are confidential. If you 
received this e-mail in error, please delete it and notify the sender. DIN 
Software GmbH provides data and information in accordance with its statement 
on the limitation of liability.


___
General mailing list
General@developer.marklogic.com
Manage your subscription at: 
http://developer.marklogic.com/mailman/listinfo/general