Re: [MarkLogic Dev General] Community Newsletter

2017-09-01 Thread Dave Cassel
Good catch, and thanks for pointing it out. Fixed.

--
Dave Cassel, @dmcassel
Technical Community Manager
MarkLogic Corporation
http://developer.marklogic.com/

From: on behalf of "Donohoe, Paul, Macmillan"
Reply-To: MarkLogic Developer Discussion
Date: Friday, September 1, 2017 at 11:25 AM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Community Newsletter

Thanks Dave.

I was reading the post "How to Find and Control Access to PII", and I 
think I found a typo.  In the code under "Scoping and Sampling — Elegant and 
Better Than Brute Force", line 3 is missing "Employee" from fn:collection().  
Without it, it's possible that the sample size is greater than the size of the 
"Employee" collection.

Otherwise, a good read!

Best regards,
Paul


---
Paul Donohoe
Technical Lead
Process & Content Management

Springer Nature
The Campus, 4 Crinan Street, London N1 9XW, UK
T +44 (0)20 7843 4783

paul.dono...@springernature.com
www.springernature.com
---
Visitor address: Porters Gate Reception, Wharfdale Road, London, UK
---
Springer Nature is one of the world’s leading global research, educational
and professional publishers, created in May 2015 through the combination
of Nature Publishing Group, Palgrave Macmillan, Macmillan Education and
Springer Science+Business Media.
---




From: general-boun...@developer.marklogic.com
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Dave Cassel
Sent: 01 September 2017 16:06
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] Community Newsletter

Hello Community,

Our presence on Stack Overflow continues to grow. We have over 1,700 questions 
there with nearly 400 subscribers watching the feed. It's a great place to get 
help. Don't forget to upvote the good answers!

Content
New blog posts on developer.marklogic.com:

  *   Michael Malgeri demonstrates Building a Semantic Recommendation Engine
  *   Caio Milani explains How to Find and Control Access to PII
  *   Scott Brooks addresses a common security problem with CSRF Attack
      Application Protection
We also have a few new recipes:

  *   Cache data with timeout (Justin Donnelly)
  *   Get permissions with role names (Paxton Hare)
  *   Anchor Dates for Finding Recent Documents (Dave Cassel)
Want to contribute or request a recipe? Contact us at rec...@marklogic.com

Software Releases
MarkLogic company releases:

  *   MLCP 8.0-7 for MarkLogic 8.0-7.
  *   Java Client API 3.0.8 for MarkLogic 8.0-7.
  *   MarkLogic-rdf4j 1.0.0 for use with MarkLogic 9.0-2.
Community project releases:

  *   Paxton Hare has released version 1.1.5 of the Data Hub Framework.
  *   Rob Rudin released version 2.9.0 of ml-gradle and 2.9.0 of
      ml-app-deployer.
  *   Rob Szkutak and Geert Josten released version 1.7.7 of Roxy.
  *   Rob Rudin (busy guy) also released version 2.14.1 of ml-javaclient-util.
  *   Scott Stafford released version 1.1.0 of marklogic-spring-batch.

MarkLogic Jobs

  *   Seeking MarkLogic Developers for prestigious firm in NYC. Contract or
      Contract to Hire opportunities. Will be developing XQuery, JS and REST
      modules in the MarkLogic Technology Stack and utilizing the MarkLogic
      Library to support FWCL initiatives. labr...@consultnet.com.
  *   We need a 

Re: [MarkLogic Dev General] Community Newsletter

2017-09-01 Thread Donohoe, Paul, Macmillan
Thanks Dave.

I was reading the post "How to Find and Control Access to PII", and I 
think I found a typo.  In the code under "Scoping and Sampling - Elegant and 
Better Than Brute Force", line 3 is missing "Employee" from fn:collection().  
Without it, it's possible that the sample size is greater than the size of the 
"Employee" collection.

Otherwise, a good read!
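
The scoping point above can be illustrated with a minimal plain-JavaScript
sketch. This is not MarkLogic code and it does not reproduce the blog post's
query. The document counts, the one-in-ten sampling rule and the helper name
countIn are all assumptions made up for illustration. It only shows that a
sample size derived from the whole database can exceed the size of the
collection it is meant to sample.

```javascript
// Hypothetical database: 1,000 documents, of which 50 are in "Employee".
const docs = Array.from({ length: 1000 }, (_, i) => ({
  uri: "/doc" + i + ".xml",
  collections: i < 50 ? ["Employee"] : ["Other"],
}));

// Count the documents in one collection, or every document when no
// collection is given (loosely modelling an unscoped collection call).
function countIn(collection) {
  return collection
    ? docs.filter((d) => d.collections.includes(collection)).length
    : docs.length;
}

// Assumed sampling rule: one document in ten.
const sampleSize = (count) => Math.ceil(count / 10);

const employeeCount = countIn("Employee");
const unscopedSample = sampleSize(countIn());      // based on all documents
const scopedSample = sampleSize(employeeCount);    // based on the collection

console.log({ employeeCount, unscopedSample, scopedSample });
```

With these made-up numbers the unscoped sample size is larger than the
collection itself, while the scoped one stays within it.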

Best regards,
Paul


---
Paul Donohoe
Technical Lead
Process & Content Management

Springer Nature
The Campus, 4 Crinan Street, London N1 9XW, UK
T +44 (0)20 7843 4783

paul.dono...@springernature.com
www.springernature.com
---
Visitor address: Porters Gate Reception, Wharfdale Road, London, UK
---
Springer Nature is one of the world's leading global research, educational
and professional publishers, created in May 2015 through the combination
of Nature Publishing Group, Palgrave Macmillan, Macmillan Education and
Springer Science+Business Media.
---




From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Dave Cassel
Sent: 01 September 2017 16:06
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] Community Newsletter

Hello Community,

Our presence on Stack Overflow continues to grow. We have over 1,700 questions 
there with nearly 400 subscribers watching the feed. It's a great place to get 
help. Don't forget to upvote the good answers!

Content
New blog posts on developer.marklogic.com:

  *   Michael Malgeri demonstrates Building a Semantic Recommendation Engine
  *   Caio Milani explains How to Find and Control Access to PII
  *   Scott Brooks addresses a common security problem with CSRF Attack
      Application Protection
We also have a few new recipes:

  *   Cache data with timeout (Justin Donnelly)
  *   Get permissions with role names (Paxton Hare)
  *   Anchor Dates for Finding Recent Documents (Dave Cassel)
Want to contribute or request a recipe? Contact us at rec...@marklogic.com

Software Releases
MarkLogic company releases:

  *   MLCP 8.0-7 for MarkLogic 8.0-7.
  *   Java Client API 3.0.8 for MarkLogic 8.0-7.
  *   MarkLogic-rdf4j 1.0.0 for use with MarkLogic 9.0-2.
Community project releases:

  *   Paxton Hare has released version 1.1.5 of the Data Hub Framework.
  *   Rob Rudin released version 2.9.0 of ml-gradle and 2.9.0 of
      ml-app-deployer.
  *   Rob Szkutak and Geert Josten released version 1.7.7 of Roxy.
  *   Rob Rudin (busy guy) also released version 2.14.1 of ml-javaclient-util.
  *   Scott Stafford released version 1.1.0 of marklogic-spring-batch.

MarkLogic Jobs

  *   Seeking MarkLogic Developers for prestigious firm in NYC. Contract or
      Contract to Hire opportunities. Will be developing XQuery, JS and REST
      modules in the MarkLogic Technology Stack and utilizing the MarkLogic
      Library to support FWCL initiatives. labr...@consultnet.com.
  *   We need a MarkLogic Programmer in Columbia, MD. Exp: 10 yrs, Rate:$60/hr. 
Send resume to h...@arealtech.in.
Recruiters - if you'd like me to include your MarkLogic-related openings in 
this newsletter, send them to me before the first of each month.

Dave.

--
Dave Cassel, @dmcassel
Technical Community Manager
MarkLogic Corporation
http://developer.marklogic.com/


Re: [MarkLogic Dev General] Apparent Memory Leak in Profiler

2017-09-01 Thread Eliot Kimber
I can verify that ML 8.0-7 resolves the memory leak in the profiler. I can now 
profile hundreds of thousands of tasks with no problem.

Cheers,

E.

--
Eliot Kimber
http://contrext.com
 


On 8/28/17, 1:41 PM, "general-boun...@developer.marklogic.com on behalf of 
Eliot Kimber"  wrote:

Thanks—I should be able to test with latest ML 8 in a couple of days.

Cheers,

E.

--
Eliot Kimber
http://contrext.com
 


On 8/28/17, 12:37 PM, "general-boun...@developer.marklogic.com on behalf of 
Christopher Hamlin"  wrote:

There was a bug where, under certain circumstances, the profiler will
result in a query deadlock &/or a resource leak (#45569).  It could be
that this is what you are seeing.

It was noticed in 8.0-2 and is fixed in the latest release (8.0-7).

On Mon, Aug 28, 2017 at 1:11 PM, Eliot Kimber  
wrote:
> I reported earlier that my profiling application was causing MarkLogic to
> restart after handling about 20,000 tasks. It turns out it was an
> out-of-memory issue on the server itself (currently configured with 256GB
> of RAM). We could see a distinct spike in memory usage, at which point the
> server restarted MarkLogic. I tried different input data sets, so it
> doesn’t appear to be an issue with a particular input document (my data set
> has a few outliers that are much larger than typical, but only a few).
>
> Subsequent testing determined that it was the use of the MarkLogic profiler
> that was causing the memory spike: if I turned off the profiler, memory
> usage was flat and all the tasks completed as expected.
>
> This is ML 8.0-3. I’m still working on getting my server upgraded to a
> newer version of MarkLogic so I can see if this is an issue that has
> already been fixed.
>
> So it looks like there’s some kind of memory leak related to the profiler.
> I’d like to understand what that issue is and either learn how to avoid it
> or report it formally.
>
> If it’s a general potential problem with large-scale processing, I would
> like to understand how to avoid it or plan for it. If it’s a problem
> specific to the profiler, then I need to report it formally and provide
> appropriate diagnostics.
>
> So my questions:
>
> 1. Is this a known issue with profiling? I’m guessing not, in that I’m
> probably doing something out of the ordinary vis-à-vis profiling, something
> that nobody would see in typical single-instance ad-hoc profiling.
> 2. What types of MarkLogic processing would cause this kind of memory spike
> that lasts across the execution of multiple tasks? I would expect the
> memory required for a given task to be released as soon as the task is
> complete, so I’m guessing it must be an issue with caches or something?
>
> Thanks,
>
> Eliot
> --
> Eliot Kimber
> http://contrext.com
>
>
>
>
> ___
> General mailing list
> General@developer.marklogic.com
> Manage your subscription at:
> http://developer.marklogic.com/mailman/listinfo/general
___
General mailing list
General@developer.marklogic.com
Manage your subscription at: 
http://developer.marklogic.com/mailman/listinfo/general


[MarkLogic Dev General] Community Newsletter

2017-09-01 Thread Dave Cassel
Hello Community,

Our presence on Stack Overflow continues to grow. We have over 1,700 questions 
there with nearly 400 subscribers watching the feed. It's a great place to get 
help. Don't forget to upvote the good answers!

Content
New blog posts on developer.marklogic.com:

  *   Michael Malgeri demonstrates Building a Semantic Recommendation Engine
  *   Caio Milani explains How to Find and Control Access to PII
  *   Scott Brooks addresses a common security problem with CSRF Attack
      Application Protection

We also have a few new recipes:

  *   Cache data with timeout (Justin Donnelly)
  *   Get permissions with role names (Paxton Hare)
  *   Anchor Dates for Finding Recent Documents (Dave Cassel)

Want to contribute or request a recipe? Contact us at rec...@marklogic.com

Software Releases
MarkLogic company releases:

  *   MLCP 8.0-7 for MarkLogic 8.0-7.
  *   Java Client API 3.0.8 for MarkLogic 8.0-7.
  *   MarkLogic-rdf4j 1.0.0 for use with MarkLogic 9.0-2.

Community project releases:

  *   Paxton Hare has released version 1.1.5 of the Data Hub Framework.
  *   Rob Rudin released version 2.9.0 of ml-gradle and 2.9.0 of
      ml-app-deployer.
  *   Rob Szkutak and Geert Josten released version 1.7.7 of Roxy.
  *   Rob Rudin (busy guy) also released version 2.14.1 of ml-javaclient-util.
  *   Scott Stafford released version 1.1.0 of marklogic-spring-batch.

MarkLogic Jobs

  *   Seeking MarkLogic Developers for prestigious firm in NYC. Contract or
      Contract to Hire opportunities. Will be developing XQuery, JS and REST
      modules in the MarkLogic Technology Stack and utilizing the MarkLogic
      Library to support FWCL initiatives. labr...@consultnet.com.
  *   We need a MarkLogic Programmer in Columbia, MD. Exp: 10 yrs, Rate:$60/hr. 
Send resume to h...@arealtech.in.

Recruiters — if you'd like me to include your MarkLogic-related openings in 
this newsletter, send them to me before the first of each month.

Dave.

--
Dave Cassel, @dmcassel
Technical Community Manager
MarkLogic Corporation
http://developer.marklogic.com/


Re: [MarkLogic Dev General] xray tests

2017-09-01 Thread Oleksii Segeda
Geert,

There is nothing unusual about these tests; they call different submodules of 
our application.
I can run xray when I cut the number of tests roughly in half. It doesn't 
matter which tests I include; what matters is how many there are.
My colleague, who has a slightly more powerful machine with more RAM, can 
execute all these tests without any issues.
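
Since the failure seems to depend on how many tests run in one go rather than
on which ones, one possible workaround is to run the suite in several smaller
batches. Below is a minimal plain-JavaScript sketch of the batching step only.
The module names and the runBatch callback are hypothetical, and how a batch
is actually handed to xray is not shown here and would need to be checked
against the xray documentation.

```javascript
// Split a list into consecutive batches of at most `size` items.
function chunk(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Hypothetical test module names standing in for the real suite.
const testModules = Array.from({ length: 25 }, (_, i) => "test-" + (i + 1) + ".xqy");

// Run every batch through a caller-supplied function, one batch at a time.
// runBatch is a placeholder for whatever invokes the test runner.
function runAll(batches, runBatch) {
  return batches.map((batch, index) => runBatch(batch, index));
}

const batches = chunk(testModules, 10);
const summary = runAll(batches, (batch, index) => ({ batch: index + 1, tests: batch.length }));
console.log(summary);
```

A batch size of roughly half the suite would match the observation that half
of the tests run fine together on the smaller machine.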

Regards,
Oleksii


From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Geert Josten
Sent: Thursday, August 31, 2017 11:32 AM
To: MarkLogic Developer Discussion 
Subject: Re: [MarkLogic Dev General] xray tests

Hi,

Could you share some more detail on what is happening inside those tests? Would 
you be able to isolate which test is the culprit by commenting out each one by 
one?

Cheers,
Geert

From: on behalf of Oleksii Segeda
Reply-To: MarkLogic Developer Discussion
Date: Thursday, August 31, 2017 at 5:19 PM
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] xray tests

Hi everyone,

I have around 20-30 xray unit tests (https://github.com/robwhitby/xray). I want 
to run a full set of tests locally, before I deploy my code somewhere else.
Unfortunately, ML dies with an out-of-memory error. If I run each test 
individually it works perfectly fine, but it takes forever to go through all of 
them manually.
I've tried to increase swap, limit the number of debug threads, limit cache 
sizes, etc. - nothing helps.

What else can be done here?

Regards,

Oleksii Segeda

IT Analyst

Information and Technology Solutions






Re: [MarkLogic Dev General] Regarding cts:element-value-query

2017-09-01 Thread Geert Josten
Hi Siva,

cts:not-query(cts:element-value-query(xs:QName("myelem"), "")) would exclude 
empty myelem elements.
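
To make the boolean logic concrete, here is a minimal plain-JavaScript model.
It is not the cts API: documents are simplified to flat objects, and the
helper names (elementExists, elementValueIs, not, and) are made up for
illustration. It shows that a "not empty" condition on its own also matches
documents that lack the element entirely, so it is combined here with an
existence check.

```javascript
// Simplified documents: one property per element, value is the text content.
const docs = [
  { uri: "/1.xml", pii: "S56789", cp: "ES" },
  { uri: "/2.xml", pii: "", cp: "ES" }, // empty pii element
  { uri: "/3.xml", cp: "ES" },          // no pii element at all
];

// Tiny query combinators; each query is a predicate over a document.
const elementExists = (name) => (doc) => name in doc;
const elementValueIs = (name, value) => (doc) => doc[name] === value;
const not = (query) => (doc) => !query(doc);
const and = (...queries) => (doc) => queries.every((q) => q(doc));

// "Element is present and is not empty".
const nonEmpty = (name) => and(elementExists(name), not(elementValueIs(name, "")));

const notEmptyOnly = docs.filter(not(elementValueIs("pii", ""))).map((d) => d.uri);
const hits = docs.filter(and(nonEmpty("pii"), nonEmpty("cp"))).map((d) => d.uri);

console.log(notEmptyOnly); // documents without an empty pii
console.log(hits);         // documents with non-empty pii and cp
```

In this model only "/1.xml" satisfies the combined condition, whereas the
negation alone also lets "/3.xml" through.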

Kind regards,
Geert

From: on behalf of "Mani, Sivasubramani (ELS)"
Reply-To: MarkLogic Developer Discussion
Date: Friday, September 1, 2017 at 1:32 PM
To: "general@developer.marklogic.com"
Cc: ConSyn-Infosys-Support
Subject: [MarkLogic Dev General] Regarding cts:element-value-query

Hi Team,

I use cts:element-query() and cts:element-value-query() to filter documents 
based on their elements and element values. I need to match only documents 
whose elements have values, but the queries above also match empty 
elements.

S56789
ES






S56789
ES


These are my queries:

  cts:and-query((
    cts:element-query(xs:QName("pii"), "*"),
    cts:element-query(xs:QName("cp"), "*")
  ))

or

  cts:and-query((
    cts:element-value-query(xs:QName("pii"), "*"),
    cts:element-value-query(xs:QName("cp"), "*")
  ))

Both queries include empty elements in the results. I need to filter out the 
empty elements from the results. Kindly do the needful.

Thanks & Regards,
Siva



[MarkLogic Dev General] Regarding cts:element-value-query

2017-09-01 Thread Mani, Sivasubramani (ELS)
Hi Team,

I use cts:element-query() and cts:element-value-query() to filter documents 
based on their elements and element values. I need to match only documents 
whose elements have values, but the queries above also match empty 
elements.

S56789
ES






S56789
ES


These are my queries:

  cts:and-query((
    cts:element-query(xs:QName("pii"), "*"),
    cts:element-query(xs:QName("cp"), "*")
  ))

or

  cts:and-query((
    cts:element-value-query(xs:QName("pii"), "*"),
    cts:element-value-query(xs:QName("cp"), "*")
  ))

Both queries include empty elements in the results. I need to filter out the 
empty elements from the results. Kindly do the needful.

Thanks & Regards,
Siva



Re: [MarkLogic Dev General] Search result estimation: issue with json array structure

2017-09-01 Thread APEL Holger
Thank you, James, for pointing me in the right direction. I knew there would 
be an index option to eliminate my false positives. With positions turned on, 
the counts are all good now.

Regards,
Holger

From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of James Kerr
Sent: 2017-08-31 18:18
To: MarkLogic Developer Discussion 
Subject: Re: [MarkLogic Dev General] Search result estimation: issue with json 
array structure

I’m not sure if you figured this out yet but you are likely running into a 
filtered vs. unfiltered issue now. The Java search client runs queries 
unfiltered by default as it is a best practice to be able to resolve queries 
from the indexes without filtering.

Since you are using container queries though, you will need positions turned on 
for your indexes so it can resolve the nested structure without filtering.

You will want to turn on “word positions”, “element word positions” and 
“element value positions” to support resolving these types of queries 
unfiltered. See this knowledgebase article 
https://help.marklogic.com/knowledgebase/article/View/245/0/queries-constrained-to-elements
 as well as the “Usage Notes” for https://docs.marklogic.com/cts.elementQuery 
and https://docs.marklogic.com/cts:json-property-scope-query for details.
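
As a rough illustration of why the counts differ, here is a plain-JavaScript
model of the two ways the nested query can be resolved. It is a deliberate
simplification and not how MarkLogic's indexes are implemented. The data is
shaped like the test set from the original question in this thread, but the
stageId value 9999 is a hypothetical stand-in, because the archived message
preserves only 9998.

```javascript
// Twenty documents, each with a two-item "stages" array.
const docs = [];
for (let i = 0; i < 10; i++) {
  docs.push({
    uri: "/a" + i + ".json",
    stages: [
      { stageId: 9999, status: "CURRENT" },
      { stageId: 9998, status: "CLOSED" },
    ],
  });
  docs.push({
    uri: "/b" + i + ".json",
    stages: [
      { stageId: 9998, status: "CURRENT" },
      { stageId: 9999, status: "CLOSED" },
    ],
  });
}

// Conditions that should all hold inside one and the same stage.
const wanted = { status: "CURRENT", stageId: 9999 };
const conditions = Object.entries(wanted);

// Document-level resolution: each condition may be satisfied by a
// different stage, which produces false positives.
const matchesLoosely = (doc) =>
  conditions.every(([key, value]) => doc.stages.some((s) => s[key] === value));

// Container-level resolution: a single stage must satisfy every condition.
const matchesStrictly = (doc) =>
  doc.stages.some((s) => conditions.every(([key, value]) => s[key] === value));

const looseCount = docs.filter(matchesLoosely).length;
const strictCount = docs.filter(matchesStrictly).length;
console.log({ looseCount, strictCount });
```

In this model the loose check counts all 20 documents, while the strict check
counts only the 10 "/a" documents, which mirrors the false positives described
above.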

-James

From: on behalf of APEL Holger
Reply-To: MarkLogic Developer Discussion
Date: Tuesday, August 22, 2017 at 4:05 AM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Search result estimation: issue with json 
array structure

Ah yes, I got the cts.andQuery parameter wrong. But what I really want to do is 
use the Java API to query my POJOs:

StructuredQueryBuilder qb = new StructuredQueryBuilder();
StructuredQueryDefinition q = qb.containerQuery(qb.jsonProperty("stages"),
    qb.and(
        qb.value(qb.jsonProperty("status"), "CURRENT"),
        qb.value(qb.jsonProperty("stageId"), )
    ));

DatabaseClient client = DatabaseClientFactory.newClient("localhost", 8082,
    new DigestAuthContext("admin", "admin"));

SearchHandle result = client.newQueryManager().search(q, new SearchHandle());
logger.info("query: {}", q.serialize());
logger.info("returned: {}", result.getTotalResults());

The serialized query is:
http://marklogic.com/appservices/search;>
  
stages

  
status
CURRENT
  
  
stageId

  

  


And totalResults: 20

From: general-boun...@developer.marklogic.com
[mailto:general-boun...@developer.marklogic.com] On Behalf Of James Kerr
Sent: 2017-08-19 05:58
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Search result estimation: issue with json 
array structure

The function signature for cts.andQuery accepts an array. You are using () 
instead of [] around your sub-queries. This should work:

fn.count(
  cts.search(
    cts.jsonPropertyScopeQuery("stages",
      cts.andQuery([
        cts.jsonPropertyValueQuery("status", "CURRENT"),
        cts.jsonPropertyValueQuery("stageId", )
      ])
    ),
    'filtered')
);
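
The underlying JavaScript behaviour can be shown in isolation. In JavaScript,
parentheses around a comma-separated list do not build a sequence as they do
in XQuery. They invoke the comma operator, which evaluates each operand and
yields only the last one. The sketch below uses a made-up andQuery stand-in
and dummy query objects (not the cts API) to show what the callee actually
receives in each case.

```javascript
// Stand-in for a function that, like cts.andQuery, expects an array of
// sub-queries. It simply records what it was given.
function andQuery(queries) {
  return { received: queries };
}

// Dummy sub-queries for illustration.
const statusQuery = { property: "status" };
const stageIdQuery = { property: "stageId" };

// Comma operator: statusQuery is evaluated and discarded, and only
// stageIdQuery is passed on.
const wrong = andQuery((statusQuery, stageIdQuery));

// Array literal: both sub-queries are passed on.
const right = andQuery([statusQuery, stageIdQuery]);

console.log(wrong.received === stageIdQuery); // true
console.log(right.received.length);           // 2
```

If only the last sub-query reaches the and-query, the status condition is
silently dropped, which may explain a count that is higher than expected.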


From: on behalf of APEL Holger
Reply-To: MarkLogic Developer Discussion
Date: Friday, August 18, 2017 at 5:05 AM
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] Search result estimation: issue with json 
array structure

Hello community,

I stumbled over a use case where the total result count of a query is wrong.

Here is a set of test data:

declareUpdate();
for (let i = 0; i < 10; i++) {
  xdmp.documentInsert(
    "/a" + i + ".json",
    {
      "project": {
        "stages": [{"stageId": , "status": "CURRENT"},
                   {"stageId": , "status": "CLOSED"}]
      }
    }
  );
  xdmp.documentInsert(
    "/b" + i + ".json",
    {
      "project": {
        "stages": [{"stageId": 9998, "status": "CURRENT"},
                   {"stageId": , "status": "CLOSED"}]
      }
    }
  );
}

fn.count(
  cts.search(
    cts.jsonPropertyScopeQuery("stages",
      cts.andQuery(
        (cts.jsonPropertyValueQuery("status", "CURRENT"),
         cts.jsonPropertyValueQuery("stageId", ))
      )
    ),
    'filtered')
);

Returns 20 but in xquery

fn:count(
  cts:search(/,
  cts:json-property-scope-query("stages",