Re: [MarkLogic Dev General] fragments filter large difference

2018-05-24 Thread Geert Josten
Hi Paul,

Optimizing XPath is always tricky. I think the optimizer didn’t recognize that 
`collection($mycollection)/myelem` and `collection($mycollection)[./myelem]` 
are (in terms of index resolution) effectively the same. And if the optimizer 
didn’t, it is likely that MarkLogic has to filter out many more false 
positives in the filtering stage.
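
To compare how each variant is resolved against the indexes, something like the 
following might help. It is only a sketch: the collection name is a placeholder, 
and the path expressions are copied from your message.

```xquery
xquery version "1.0-ml";

(: placeholder collection name, not taken from the original message :)
let $mycollection := "my-collection"
return (
  (: plan for the variant that performs well :)
  xdmp:plan(
    collection($mycollection)/myelem[.//myA[@myattr = "myval"]//myB = "val"]
  ),
  (: plan for the variant that filters a million+ fragments :)
  xdmp:plan(
    collection($mycollection)[./myelem//myA[@myattr = "myval"]//myB = "val"]
  )
)
```

xdmp:plan reports how each expression will be processed by the indexes, so the 
difference in gathered constraints should show up side by side.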

Cheers,
Geert

From:  on behalf of Paul M 

Reply-To: MarkLogic Developer Discussion 
Date: Wednesday, May 23, 2018 at 6:06 PM
To: "general@developer.marklogic.com" 
Subject: [MarkLogic Dev General] fragments filter large difference

collection($mycollection)/myelem[.//myA[@myattr="myval"]//myB="val"]
vs
collection($mycollection)[./myelem//myA[@myattr="myval"]//myB="val"]

The first iteration performs markedly better than the second.
The second attempts to filter a million + fragments.

xdmp:query-trace shows only two constraints on the second iteration, namely 
fn:collection, when gathering constraints.

Any insight appreciated on why so different.


___
General mailing list
General@developer.marklogic.com
Manage your subscription at: 
http://developer.marklogic.com/mailman/listinfo/general


Re: [MarkLogic Dev General] Not able to write in ErrorLog.txt (Markloglogc 9)

2018-05-03 Thread Geert Josten
ErrorLog.txt is used for system-wide messages only since MarkLogic 9, and 
app(-server) specific messages are written to <port>_ErrorLog.txt. When running 
xdmp:log from QConsole on port 8000, look for 8000_ErrorLog.txt. You should 
find your messages there.
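
If you are unsure which file to look in, a small sketch like this could tell 
you. It assumes the code runs on an HTTP app server (such as QConsole), so that 
xdmp:get-request-port has a request to inspect; the message text is just an 
example.

```xquery
xquery version "1.0-ml";

(: port of the app server handling this request, e.g. 8000 for QConsole :)
let $port := xdmp:get-request-port()
(: file name pattern as described above: <port>_ErrorLog.txt :)
let $log-file := fn:concat($port, "_ErrorLog.txt")
return (
  xdmp:log(fn:concat("###Hello, look for me in ", $log-file)),
  $log-file
)
```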

Messages can get a little scattered that way, particularly when also spawning 
tasks on the TaskServer, but it also allows protecting particular parts of 
logging in case there could be sensitive data inside. I typically find myself 
using something like this to follow all logs at once:

tail -f /var/opt/MarkLogic/Logs/*.txt

Cheers,
Geert

From:  on behalf of DK Singh 

Reply-To: MarkLogic Developer Discussion 
Date: Thursday, May 3, 2018 at 3:21 PM
To: MarkLogic Developer Discussion 
Subject: [MarkLogic Dev General] Not able to write in ErrorLog.txt (Markloglogc 
9)

Hi, I am using MarkLogic 9 and running a simple query with the xdmp:log function 
from Query Console to write to ErrorLog.txt, but nothing gets written. Does 
anything need to be configured in MarkLogic 9 for writing to the log? 
I also tried restarting the server, with the same result.


Query Example: xdmp:log("###Hello")

Regards
Dharmendra Kuarm Singh




Re: [MarkLogic Dev General] Importing temporal documents with MLCP

2018-03-07 Thread Geert Josten
Hi,

Slight adjustment to this message. There is a flaw in the example below. The 
risk of sending untested code..

It turns out that you cannot override systemStart and systemEnd this way. 
MarkLogic silently ignores those values in the metadata, hence the reason it 
slipped through unnoticed. To make this work, make sure LSQT has been enabled 
on your temporal collection. For instance with:

xquery version "1.0-ml";

import module namespace temporal = "http://marklogic.com/xdmp/temporal"
  at "/MarkLogic/temporal.xqy";

temporal:set-use-lsqt("uni-temporal", true())

It will initialize LSQT to 1601-01-01T00:00:00Z. After that, you have to use 
temporal:statement-set-system-time to override the systemStart. Try to avoid 
enabling LSQT automation until you have loaded all historic data; MarkLogic 
will not allow you to set the system time before LSQT.
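
To check where LSQT currently stands before loading, something along these 
lines should do. It is a sketch that reuses the "uni-temporal" collection name 
from the snippet above.

```xquery
xquery version "1.0-ml";

import module namespace temporal = "http://marklogic.com/xdmp/temporal"
  at "/MarkLogic/temporal.xqy";

(: temporal collection name taken from the example above :)
let $collection := "uni-temporal"
return
  if (temporal:get-use-lsqt($collection)) then
    (: current LSQT; system times before this value are not allowed :)
    temporal:get-lsqt($collection)
  else
    "LSQT is not enabled on this temporal collection"
```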

The following MLCP transform worked at my end without LSQT automation enabled:

xquery version "1.0-ml";
module namespace example = "http://marklogic.com/example";

import module namespace temporal = "http://marklogic.com/xdmp/temporal"
  at "/MarkLogic/temporal.xqy";

declare default function namespace "http://www.w3.org/2005/xpath-functions";

declare option xdmp:mapping "false";

declare function example:transform(
  $content as map:map,
  $context as map:map
) as map:map*
{
  (: grab uri and content from $content :)
  let $uri := map:get($content, "uri")
  let $doc := map:get($content, "value")

  (: grab other argument values passed in via cmd-line from $context :)
  let $transform_param := map:get($context, "transform_param")
  let $collections := map:get($context, "collections")
  let $permissions := map:get($context, "permissions")
  let $quality := head((map:get($context, "quality"), 0))
  let $temporalCollection := map:get($context, "temporalCollection")

  (: determine the start you want to apply, or use an expression to grab the
     value from $doc or $transform_param :)
  let $systemStart := current-dateTime() - xs:yearMonthDuration("P1Y")

  (: apply systemStart to all subsequent temporal inserts :)
  let $_ := temporal:statement-set-system-time($systemStart)

  (: do a temporal insert as desired :)
  let $_ :=
    temporal:document-insert(
      $temporalCollection,
      $uri,
      $doc,
      map:new((
        map:entry("collections", $collections),
        map:entry("permissions", $permissions),
        map:entry("quality", $quality)
      ))
    )

  (: return empty-sequence to let MLCP know it doesn't need to insert docs
     itself :)
  return ()
};

Cheers,
Geert


Re: [MarkLogic Dev General] Importing temporal documents with MLCP

2018-02-24 Thread Geert Josten
Hi Hans,

Sorry for being late with this reply, hopefully it is still useful to you. This 
was a non-trivial question though, so I had to poke around in the docs to 
verify various things..

I think it is possible with MLCP. You can override the document-insert 
mechanism of MLCP using an MLCP transform. It would roughly look like:

xquery version "1.0-ml";
module namespace example = "http://marklogic.com/example";

import module namespace temporal = "http://marklogic.com/xdmp/temporal"
  at "/MarkLogic/temporal.xqy";

declare function example:transform(
  $content as map:map,
  $context as map:map
) as map:map*
{
  (: grab uri and content from $content :)
  let $uri := map:get($content, "uri")
  let $doc := map:get($content, "value")

  (: grab other argument values passed in via cmd-line from $context :)
  let $transform_param := map:get($context, "transform_param")
  let $collections := map:get($context, "collections")
  let $permissions := map:get($context, "permissions")
  let $quality := map:get($context, "quality")
  let $temporalCollection := map:get($context, "temporalCollection")

  (: determine the start/end you want to apply, or use expressions to grab the
     values from $doc or $transform_param :)
  let $systemStart := current-dateTime()
  let $systemEnd := $temporal:MAX_TIME

  (: do a temporal insert as desired :)
  let $_ :=
    temporal:document-insert(
      $temporalCollection,
      $uri,
      $doc,
      map:new((
        map:entry("collections", $collections),
        map:entry("permissions", $permissions),
        map:entry("quality", $quality),
        map:entry("metadata", map:new((
          map:entry("systemStart", $systemStart),
          map:entry("systemEnd", $systemEnd)
        )))
      ))
    )

  (: return empty-sequence to let MLCP know it doesn't need to insert docs
     itself :)
  return ()
};

Note: I haven’t actually tested above code, but I have overridden doc-insert in 
an MLCP transform before, and am pretty sure the temporal-insert should work 
like this too.

Regarding alternatives, you could take a look at doing REST calls to 
/v1/documents (multipart if possible), or at using DMSDK for large volume bulk 
loading with transforms.

Cheers,
Geert

From: on behalf of Hans Hübner
Reply-To: MarkLogic Developer Discussion
Date: Friday, February 9, 2018 at 6:27 AM
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] Importing temporal documents with MLCP

Hello,

when importing documents into a temporal collection with MLCP, is there a way 
to override the system time stamp?  We need to import large amounts of historic 
data into a database and want to set the system time to the time when those 
documents were created.  We anticipate needing the valid axis as well, so we do 
not want to abuse the valid axis to represent the document creation time.

If MLCP cannot do it, what would be a good alternative to bulk insert documents 
with a specified system timestamp?

Any pointers would be appreciated.

Thanks,
Hans

--
LambdaWerk GmbH
Oranienburger Straße 87/89
10178 Berlin
Phone: +49 30 555 7335 0
Fax: +49 30 555 7335 99

HRB 169991 B Amtsgericht Charlottenburg
USt-ID: DE301399951
Geschäftsführer:  Hans Hübner

http://lambdawerk.com/




Re: [MarkLogic Dev General] short question about result of sem:sparql-values

2018-02-20 Thread Geert Josten
Hi Erik,

Correct. The function returns a sequence of sem:binding objects. A sem:binding 
is a special type of map:map, so you can use map functions on them. Here is a 
working example that returns concrete values:

xquery version "1.0-ml";

import module namespace sem = "http://marklogic.com/semantics"
  at "/MarkLogic/semantics.xqy";

sem:sparql-values("select * { ?s ?p ?o } limit 50", map:map()) ! map:get(., "o")

Note: the ! is the simple map operator from XQuery 3. You could also use a more 
old-fashioned FLWOR if that would be clearer to you.
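
For reference, the FLWOR form of the same expression would look roughly like 
this. It is the same query and the same variable name "o" as above; for your 
case you would presumably ask for "label" instead.

```xquery
xquery version "1.0-ml";

import module namespace sem = "http://marklogic.com/semantics"
  at "/MarkLogic/semantics.xqy";

(: each $binding is a sem:binding, which can be read like a map:map :)
for $binding in sem:sparql-values("select * { ?s ?p ?o } limit 50", map:map())
return map:get($binding, "o")
```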

Cheers,
Geert

From: on behalf of Erik Zander
Reply-To: MarkLogic Developer Discussion
Date: Tuesday, February 20, 2018 at 9:02 AM
To: "general@developer.marklogic.com"
Subject: [MarkLogic Dev General] short question about result of 
sem:sparql-values

Hi,

I have what I feel should be a simple question, but I have yet to understand 
how it’s done.

When calling sem:sparql-values, how do I get the result?
The documentation says it’s a sequence 
(http://docs.marklogic.com/guide/semantics/semantic-searches#id_90139), 
but as I understand it, that is a sequence of sem:binding.
Output of xdmp:describe

sem:binding(http://www.w3.org/2001/XMLSchema; 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance; 
xmlns:json="http://marklogic.com/xdmp/json;>)

Raw Output

[{"notEmpty":"","label":"\"Friedrich Hegel\"@sv"}]

In this case I just want to get the value Friedrich Hegel.
Doing sem:sparql-values($sparql,$bindings)/label doesn’t work as it says not a 
node

Any pointers on how I should go about would be welcome.

Regards
Erik



Re: [MarkLogic Dev General] Unique GUID generation in MarkLogic

2018-02-13 Thread Geert Josten
Hi Abhinav,

sem:uuid-string() generates a random 128-bit UUID string, which makes the 
chances of collisions extremely small. If you prefer being paranoid, and want to 
check anyhow (there are various ways to check if an id or uri is taken or not), 
you’d typically do that in one database only. You could also do it across 
multiple databases, but the more you check and include, the slower it obviously 
gets.
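
A paranoid check within a single database could be sketched as follows. The 
"/docs/" prefix and ".xml" suffix are made up for illustration, and the check 
only looks at the database the request runs against.

```xquery
xquery version "1.0-ml";

import module namespace sem = "http://marklogic.com/semantics"
  at "/MarkLogic/semantics.xqy";

(: Returns a document URI that is not yet taken in the current database.
   The "/docs/" prefix and ".xml" suffix are hypothetical. :)
declare function local:new-uri() as xs:string
{
  let $uri := fn:concat("/docs/", sem:uuid-string(), ".xml")
  return
    if (fn:doc-available($uri))
    then local:new-uri() (: retry on a collision, which is extremely unlikely :)
    else $uri
};

local:new-uri()
```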

There are some brief notes on this topic in the README of my ml-unique library: 
https://github.com/grtjn/ml-unique

Cheers,
Geert

From: on behalf of "abhinav.mish...@cognizant.com"
Reply-To: MarkLogic Developer Discussion
Date: Tuesday, February 13, 2018 at 9:39 AM
To: "general@developer.marklogic.com"
Subject: Re: [MarkLogic Dev General] Unique GUID generation in MarkLogic


Hi Geert,


Thank you for your response.


I guess it is just something we are wondering about. If we changed servers or 
hardware, would there be a chance of duplicates? There is no other specific 
reason, though.


Regards,

Abhinav




Re: [MarkLogic Dev General] Unique GUID generation in MarkLogic

2018-02-13 Thread Geert Josten
Hi Abhinav,

Can you elaborate on what you mean with ‘unique across environments’?

Cheers,
Geert

From: on behalf of abhinav mishra
Reply-To: MarkLogic Developer Discussion
Date: Tuesday, February 13, 2018 at 8:40 AM
To: "general-requ...@developer.marklogic.com", 
"general@developer.marklogic.com"
Subject: [MarkLogic Dev General] Unique GUID generation in MarkLogic

Hi All,

We are exploring ways to generate GUIDs. We want to be sure that the GUIDs 
are always unique, and we have a requirement that these GUIDs should even be 
unique across environments: the same GUID should not occur in two MarkLogic 
environments.

We found the sem:uuid-string() function in the documentation, and it seems like 
a good starting point. However, we are not sure whether this function returns 
values that are unique across environments.

Can someone guide us or provide more information on GUID generation, or suggest 
any third-party open source library which we can use?

Regards,
Abhinav


Re: [MarkLogic Dev General] ML 9 Issues

2018-02-07 Thread Geert Josten
Hi Praveen,

I have not run into segfaults myself for a while, running various patch versions 
of ML9, including 9.0-4. Are you doing anything special? How are you loading 
the content, and are you applying any transformations along the way?

Cheers,
Geert

From: on behalf of Praveen Gontla
Reply-To: MarkLogic Developer Discussion
Date: Wednesday, February 7, 2018 at 3:27 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] ML 9 Issues

I am using 9.0-4.

Yes, I already have a ticket filed with my LexisNexis account login. I am 
checking here with this group whether anyone else has seen similar issues and 
found any solutions.

Thanks,
Praveen.



Re: [MarkLogic Dev General] ML 9 Issues

2018-02-07 Thread Geert Josten
Hi Praveen,

Which version are you using specifically? If you are not yet using 9.0-4, could 
you rerun it with that as well?

Cheers,
Geert

From: on behalf of Praveen Gontla
Reply-To: MarkLogic Developer Discussion
Date: Wednesday, February 7, 2018 at 1:45 PM
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] ML 9 Issues

Team,

Has anyone heard of segmentation faults in ML 9 when running a load test?

Thanks,
Praveen.


Re: [MarkLogic Dev General] Strange issue with xdmp:node-delete

2018-02-01 Thread Geert Josten
I concur that the fact that ML seems to stop responding does sound like a 
deadlock. Looking at the cluster status and inspecting the execution queues 
might reveal a request that doesn’t seem to return.

Your code is not using eval or invoke, though, so I don’t think you can create 
a deadlock with just that code. Could it be, though, that you are invoking the 
code below from elsewhere, where you might be reading or updating the same kind 
of content?

Cheers,
Geert

From: on behalf of David Ennis
Reply-To: MarkLogic Developer Discussion
Date: Thursday, February 1, 2018 at 7:30 AM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Strange issue with xdmp:node-delete

I think that you are getting a deadlock by referencing the element via the read 
and subsequent XPath into the document while trying to update the document in 
the same transaction.

Isolating the read of the document should work. Likewise, you can isolate the 
node-delete as well, but that could have an impact on the way you are expecting 
your transactions to work.


Please try to isolate this line into a separate transaction using 
xdmp:invoke-function()
‘let $doc := fn:doc(db:get-uri-by-rid($rid))/l:expression’
- isolation: different transaction
- update-auto-commit

You may also want to use the ‘prevent-deadlocks’ option to throw an error if 
your code has, in fact, created a deadlock situation.


My theory can be tested by removing the following lines:
let $doc := fn:doc(db:get-uri-by-rid($rid))/l:expression
$doc/l:exportChannels/l:exportChannel[@href = 
$exportlib:channel-rid-i2])/l:precomputedValues

-David


--


From: on behalf of Lanz
Reply-To: MarkLogic Developer Discussion
Date: Wednesday, January 31, 2018 at 7:08 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Strange issue with xdmp:node-delete

yes Florent, it returns the doc uri

On Wed, Jan 31, 2018 at 6:53 PM, Florent Georges wrote:
Just in case, you can confirm that the node you are about to delete is indeed 
from a stored document with something like the following :

xdmp:log(fn:document-uri(fn:root($node)))
--
Florent Georges
H2O Consulting
http://h2o.consulting/

On 31 Jan 2018 18:47, "Lanz" wrote:
Hi Mister Florent,

Thanks for your answer.
There is no symptom at all: no exception, no specific log, no deletion. When I 
run the 'real' script in QConsole (or Oxygen connected through XDBC to ML), it 
returns the focus after a while (without SVC-EXTIME) without any deletion.
I'm trying with a colleague to simplify the script and recreate a readable case 
for everyone.
For now it seems that sem:sparql combined with sem:store has some 
influence on this behavior.

So I will come back soon with a simplified script.
Lanz


On Wed, Jan 31, 2018 at 6:26 PM, Florent Georges wrote:
>
> Hi Lancelot,
>
> I might have missed it, but what is the exact symptom? Any error message? 
> Have you tried the "real" script through QConsole? Any chance you simplify it 
> and post it here? More pairs of eyes might catch something you missed...
>
> Regards,
>
> --
> Florent Georges
> H2O Consulting
> http://h2o.consulting/
>
> On 30 Jan 2018 22:29, "Lanz" wrote:
>>
>> Hi all,
>>
>> I've got this strange issue with xdmp:node-delete on a node in the 
>> database with MarkLogic 8.0-6.3.
>> First, when I use xdmp:node-delete on the target node with the basic 
>> script below, it works:
>> 
>> xquery version "1.0-ml";
>> declare namespace html = "http://www.w3.org/1999/xhtml";
>> declare namespace l = "http://www.oecd.org/ns/lambda/schema/";
>> import module namespace db = "http://www.oecd.org/ns/lambda/app/lib/db" at 
>> "/app/lib/db/db.xqm";
>> import module namespace exportlib = 
>> "http://www.oecd.org/ns/lambda/app/lib/export" at 
>> "/app/lib/export/export.xqm";
>> declare variable $keyName-parent-toc-info as xs:string := 'parent-toc-info';
>> let $rid := 'urn:oecd.org:publications:id:expression:g2g12781'
>> let $doc := fn:doc(db:get-uri-by-rid($rid))/l:expression
>> return
>> (
>> (: display parent node before child node deletion :)
>> 

Re: [MarkLogic Dev General] archival strategies in bitemporal data

2018-02-01 Thread Geert Josten
Hi Swayam,

Can you elaborate a little more? There is temporal:document-protect, which 
takes archiving properties. Is that what you are after?

Cheers,
Geert

From: on behalf of Serious Guy
Reply-To: MarkLogic Developer Discussion
Date: Thursday, February 1, 2018 at 1:13 PM
To: "general@developer.marklogic.com"
Cc: "sebile1...@gmail.com"
Subject: [MarkLogic Dev General] mlcp for multiple host

Hi All,

Can anyone help me with archival strategies in bitemporal data. Any kind of 
help is appreciated!

Thanks and regards,
Swayam Kartikey Sinha


Re: [MarkLogic Dev General] mlcp for multiple host

2018-02-01 Thread Geert Josten
Hi Vikas,

You don’t need to specify multiple hosts. MLCP will read out the list of hosts 
of the cluster automatically through the connect host, and will distribute the 
load among them. It is essential, though, that the host names as listed inside 
MarkLogic also resolve on the network.
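
To see which host names MarkLogic itself has registered for the cluster, you 
could run something like this in QConsole and then verify that each name 
resolves from the machine running MLCP:

```xquery
xquery version "1.0-ml";

(: list the host names as registered inside the cluster :)
for $host in xdmp:hosts()
return xdmp:host-name($host)
```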

Cheers,
Geert

From: on behalf of "vikas.sin...@cognizant.com"
Reply-To: MarkLogic Developer Discussion
Date: Wednesday, January 31, 2018 at 8:12 PM
To: "general@developer.marklogic.com"
Subject: [MarkLogic Dev General] mlcp for multiple host

Hi All,

I am customizing the mlcp bean to run Content Pump through Java code. I am able 
to set the host name where I have a single host. In a cluster environment, I 
changed my code to set the host as a comma-separated list, but I am getting an 
illegal argument exception. Is there any other way to connect mlcp to multiple 
hosts?

Regards,
Vikas Singh


Re: [MarkLogic Dev General] How to define relative path for the URL rewriter setting?

2018-01-26 Thread Geert Josten
Hi Evgeny,

Keep in mind that relative means relative to the modules-root. Check the 
modules-root setting of the app-server you are looking at.
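
A quick way to inspect those settings together is sketched below. The group and 
app server names are placeholders you would need to replace with your own.

```xquery
xquery version "1.0-ml";

import module namespace admin = "http://marklogic.com/xdmp/admin"
  at "/MarkLogic/admin.xqy";

let $config := admin:get-configuration()
(: "Default" and "my-app-server" are placeholder names :)
let $group := admin:group-get-id($config, "Default")
let $appserver := admin:appserver-get-id($config, $group, "my-app-server")
return (
  (: modules database id, or 0 when modules are read from the filesystem :)
  admin:appserver-get-modules-database($config, $appserver),
  (: the root that relative paths are resolved against :)
  admin:appserver-get-root($config, $appserver),
  (: the rewriter path as currently configured :)
  admin:appserver-get-url-rewriter($config, $appserver)
)
```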

Cheers,
Geert

From: on behalf of Evgeny Degtyarev
Reply-To: MarkLogic Developer Discussion
Date: Friday, January 26, 2018 at 1:50 PM
To: "general@developer.marklogic.com"
Subject: [MarkLogic Dev General] How to define relative path for the URL 
rewriter setting?

Hello,

We have Marklogic 9 server, and if I define relative path for URL rewriter - it 
doesn't work (404 error), although absolute path works fine.
According to the documentation - 
https://docs.marklogic.com/admin:appserver-set-url-rewriter - it should be 
possible.
> The path should specify a relative or absolute path to either an XQuery 
> module used as the interpretive rewriter or the XML file used by the 
> declarative rewriter.
I use declarative (XML) file as a rewriter.

Regards,
Evgeny


Re: [MarkLogic Dev General] question about transactions

2018-01-24 Thread Geert Josten
The outer query runs in query mode, so runs against the timestamp of initial 
invocation, causing it to never see the result of sem:rdf-insert. You’d have to 
put the sem:sparql in an xdmp:eval with different-transaction as well.
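
Roughly like the sketch below. The triple and the SPARQL query are made-up 
placeholders, not your actual data; the point is only that both the insert and 
the read run in their own transaction.

```xquery
xquery version "1.0-ml";

let $options :=
  <options xmlns="xdmp:eval">
    <isolation>different-transaction</isolation>
  </options>

(: 1. insert in its own transaction, committed when the eval returns :)
let $_ :=
  xdmp:eval('
    xquery version "1.0-ml";
    import module namespace sem = "http://marklogic.com/semantics"
      at "/MarkLogic/semantics.xqy";
    sem:rdf-insert(
      sem:triple(
        sem:iri("http://example.org/s"),
        sem:iri("http://example.org/p"),
        "o"
      )
    )
  ', (), $options)

(: 2. read in another transaction, which runs at a later timestamp :)
return
  xdmp:eval('
    xquery version "1.0-ml";
    import module namespace sem = "http://marklogic.com/semantics"
      at "/MarkLogic/semantics.xqy";
    sem:sparql("select * { <http://example.org/s> ?p ?o }")
  ', (), $options)
```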

I also wonder, though: what are you trying to do, and why try to squeeze the 
insert and the read into one request?

Cheers,
Geert

From: on behalf of David Ennis
Reply-To: MarkLogic Developer Discussion
Date: Wednesday, January 24, 2018 at 7:34 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] question about transactions

Please look up the options for xdmp:eval and note the following options 
explained there:
- transaction-mode
- isolation

Then change your eval to have the following options:
- transaction-mode=update-auto-commit
- isolation = different transaction

Then move the sem:sparql statement below the eval in your main code.

What are you doing here?

You are telling the insert to run as a separate transaction and auto-commit. 
This makes the triples available immediately after the eval is done. Therefore, 
you should run the select in the main code and not the isolated transaction.

Careful with the use of different transactions via eval and invoke. The wrong 
combination can get you into a deadlock.

Regards,
David Ennis

--


From: on behalf of Erik Zander
Reply-To: MarkLogic Developer Discussion
Date: Wednesday, January 24, 2018 at 5:35 PM
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] question about transactions

Hi All,

I have a question about, I think, transactions.

I want to insert some RDF and then query the database, and I want to do this in 
a function, so that I can call the function and, depending on whether I already 
have the data in MarkLogic or not, get the data as RDF and insert it.

But my problem is that the following code only returns a result the second time 
I call it.
I’m thankful for pointers here.
Regards
Erik
Code below
==

xquery version "1.0-ml" encoding "utf-8";


import module namespace sem = "http://marklogic.com/semantics"
  at "/MarkLogic/semantics.xqy";
declare namespace rdf = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

let $wDataRdf:=
http://www.w3.org/1999/02/22-rdf-syntax-ns#;
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#;

  xmlns:wikibase="http://wikiba.se/ontology#;
  xmlns:wd="http://www.wikidata.org/entity/;
  >

Re: [MarkLogic Dev General] Good Way to Automatical

2018-01-22 Thread Geert Josten
I just wanted to mention that there is also the Management REST API to set up CPF:
http://docs.marklogic.com/REST/management/content-processing-framework-(cpf)

Which is leveraged by ml-gradle for deployment of CPF:
https://github.com/marklogic-community/ml-gradle/tree/625e3aaadeb58dfa9f04046a31234b33724b4134/examples/cpf-project


Another option is the code library that Roxy uses to deploy CPF:
https://github.com/marklogic-community/roxy/blob/master/deploy/lib/xquery/cpf.xqy

It comes with an XML configuration for pipelines, and such:
https://github.com/marklogic-community/roxy/blob/master/deploy/sample/pipeline-config.sample.xml

Cheers

On 1/23/18, 12:49 AM, "general-boun...@developer.marklogic.com on behalf
of Eliot Kimber"  wrote:

>I've taken the Admin install CPF script and reworked it as a function
>library and removed the code related to default domains (I don't want
>default domains in any case).
>
>What's left includes code to set up the pipelines and triggers, as
>described in the CPF Configuration chapter.
>
>*But*
>
>It also includes the loading of schemas that are used by the pipelines
>and I didn't see anything that does that (or mentions it) in the CPF API.
>
>So unless I'm missing something (which is quite possible), I still need
>to do the schema loading.
>
>What I've extracted from the Admin code seems like a convenient way to
>just get CPF in place so that you can then set up your custom domains.
>
>It could be optimized for the needs of FlexRep, namely only bothering to
>even install the change and FlexRep pipelines but seems likely that other
>servers that will need automatic conversion might use other pipelines and
>there's no particular harm in having unused pipelines lying about.
>
>Note my requirement is simply to have CPF available so that I can then
>configure FlexRep. The lack of a quick-and-easy way to programmatically
>install CPF is simply a roadblock to the real configuration I need to do,
> namely configure FlexRep, which is otherwise easy enough to do (once one
>has understood how all the FlexRep parts fit together, which was a little
>harder than it should have been, but I think I've already commented on
>the FlexRep docs...).
>
>So I was really looking for a "call this one function to get CPF
>installed so you can continue on with your real task of getting FlexRep
>configured via a script" and I'm not seeing that out of the box.
>
>Or said more directly: there's a one-button task in the Admin UI to get
>CPF installed for a database. There should be a corresponding single-call
>function to do it programmatically and the FlexRep docs should make
>reference to that function at the same time they refer to the manual CPF
>installation process.
>
>Cheers,
>
>E.
>
>--
>Eliot Kimber
>http://contrext.com
> 
>
>
>On 1/22/18, 5:32 PM, "general-boun...@developer.marklogic.com on behalf
>of Mary Holstege" <mary.holst...@marklogic.com> wrote:
>
>
>There isn't a single API that orchestrates all the pieces, but there are
>APIs to do all the necessary parts in the pipelines and domains modules.
>These should be executed against your triggers database. If you share a
>triggers database, you don't need to do it all over again.
>
>p:insert to put a pipeline XML into the right collections etc.
>dom:configuration-create to create the overall configuration object that
>defines your restart user etc.
>You need to do this before you create domains or things will go
>horribly wrong.
>dom:create to define your domains
>dom:add-pipeline to attach pipelines if you didn't put them in the domain
>in dom:create
>
>All default pipelines are in the Installer directory.
>
>The thing in the admin GUI makes some default assumptions about some of
>this that aren't always the appropriate thing to do.
>
>I'd suggest making a script that creates the domains you want and loads
>and attaches the appropriate pipelines.
>
>//Mary
>
>On Mon, 22 Jan 2018 14:09:23 -0800, Eliot Kimber wrote:
>
>> I'm putting together a script that will do all the configuration for a
>> server all the way through defining a FlexRep app server, domains, and
>> targets. The requirement is to avoid the need for any manual intervention
>> once the configuration is started.
>>
>> The one fly in this ointment is the CPF--since I'm creating new
>> databases they of course won't have CPF installed, so I need to install
>> the CPF into those that are involved in FlexRep.
>>
>> As far as I can tell there is no API for doing this (there should
>> be), so I'm going to attempt to simply call the
>> Admin/database-cpf-admin-go.xqy module, which seems simple enough (I
>> only need to specify the 

Re: [MarkLogic Dev General] Dynamic Faceting in Marklogic

2018-01-22 Thread Geert Josten
Hi Arvind,

You can define indexes before, during, or after uploading data. MarkLogic 
will automatically start reindexing if necessary. Defining indexes before 
ingestion is more efficient, though. So, if you have a chance to ingest a 
sample, decide on indexes based on that, and then bulk load all your data, 
that would be most efficient.
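
As a rough sketch of defining an index up front with the Admin API: the database name "Documents" and the element name "category" below are placeholders, not something taken from this thread.

```xquery
xquery version "1.0-ml";

import module namespace admin = "http://marklogic.com/xdmp/admin"
  at "/MarkLogic/admin.xqy";

let $config := admin:get-configuration()
let $db := xdmp:database("Documents")   (: hypothetical target database :)
let $index := admin:database-range-element-index(
  "string",                             (: scalar type :)
  "",                                   (: element namespace (none) :)
  "category",                           (: hypothetical element name :)
  "http://marklogic.com/collation/",    (: collation for string values :)
  fn:false())                           (: range value positions :)
let $config := admin:database-add-range-element-index($config, $db, $index)
return admin:save-configuration($config)
```

Running something like this before the bulk load avoids reindexing content that is already in the database; running it afterwards still works, but triggers a reindex.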

Kind regards,
Geert

From: on behalf of Arvind Kumar
Reply-To: MarkLogic Developer Discussion
Date: Monday, January 22, 2018 at 3:36 PM
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] Dynamic Faceting in Marklogic

Hi All,

We have a requirement for dynamic faceting in a search application.
It should enable a facet on each field that is added during ingestion or update.
As I understand it, we would need to change the index configuration each time 
we ingest.
But I am not sure how feasible that would be.
Is MarkLogic capable of setting indexes during ingestion?

Please suggest.


Regards,
Arvind Kr.


Re: [MarkLogic Dev General] Regarding Spawn function not working

2018-01-10 Thread Geert Josten
Hi Siva,

The xdmp:node-* functions only work on nodes persisted in the database. Make 
sure $userPersonalInfo is a reference to something from the database, or use an 
in-memory update library: 
https://github.com/ryanjdew/XQuery-XML-Memory-Operations
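
A minimal sketch of the first option, fetching the node from the database inside the spawned function. The document URI and the "us" namespace URI are made up for illustration, and the update option uses the MarkLogic 9 name.

```xquery
xquery version "1.0-ml";

(: Hypothetical namespace URI for the us: prefix used in the question. :)
declare namespace us = "http://example.com/user";

(: Hypothetical URI of the stored user document. :)
let $uri := "/users/user-1.xml"
return
  xdmp:spawn-function(
    function() {
      (: Look the node up inside the task so it is a persisted node. :)
      let $is-active := fn:doc($uri)//us:isActive
      return xdmp:node-replace($is-active, <us:isActive>false</us:isActive>)
    },
    <options xmlns="xdmp:eval">
      <update>true</update>
    </options>)
```

With auto-commit (the default) the spawned update commits when the function returns, so no explicit xdmp:commit() should be needed.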

Cheers,
Geert

From: on behalf of "Mani, Sivasubramani (ELS)"
Reply-To: MarkLogic Developer Discussion
Date: Wednesday, January 10, 2018 at 7:17 AM
To: "general@developer.marklogic.com"
Cc: "Sodihardjo, Aiwen (ELS-AMS)"
Subject: [MarkLogic Dev General] Regarding Spawn function not working

Hi Team,

I am trying to update XML nodes more than once using xdmp:spawn-function, 
but the update does not happen inside the spawned function. I have included 
the sample code below. Please advise.

let $updateact_deact := xdmp:spawn-function(function(){
  ( xdmp:node-replace($userPersonalInfo/us:isActive,
false)>),
xdmp:commit())
  },
  
  update
  )

let $updatedeact_timestamp := xdmp:spawn-function(function(){
  
(xdmp:node-insert-child($userPersonalInfo/us:lastdeactAcntTimestamp,
 
{current-dateTime()}),
xdmp:commit())
  },
  
  update
  )


Thanks & Regards,
Siva



Re: [MarkLogic Dev General] Best Approach to Manage "Flags" That Might Change Within a Single Transaction

2017-12-07 Thread Geert Josten
You typically avoid these kinds of issues by using a schedule that gets a
fresh, latest view on the data each round, or by orchestrating things from
outside of MarkLogic.

You could also consider making HTTP calls to localhost instead of eval.
Probably not quicker, but perhaps it feels more natural to your mind. ;-)
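
For reference, a sketch of the on-demand read discussed in this thread, where each check of the flag runs in its own transaction and therefore sees the latest committed job doc. The job document URI is hypothetical; the @paused attribute is the one described in the original question.

```xquery
xquery version "1.0-ml";

(: Reads the pause flag in a separate transaction, so every call gets a
   fresh view instead of the snapshot of the calling statement. :)
declare function local:is-paused($job-uri as xs:string) as xs:boolean {
  xdmp:invoke-function(
    function() { fn:exists(fn:doc($job-uri)/*/@paused) },
    <options xmlns="xdmp:eval">
      <isolation>different-transaction</isolation>
    </options>)
};

(: Hypothetical job document URI. :)
local:is-paused("/jobs/current-job.xml")
```

A submission loop could call local:is-paused() before each task and stop as soon as it returns true.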

Cheers

On 12/7/17, 6:19 PM, "general-boun...@developer.marklogic.com on behalf of
Eliot Kimber"  wrote:

>I think I've solved my problem by once again being more careful about
>holding elements in memory. By replacing global reads of my job doc with
>on-demand reads through xdmp:eval() I seem to have resolved my issue with
>changes to the job doc not being seen within the same separate
>transaction (e.g., my read loop). I seem to be unable to let go of my
>procedural language brain damage.
>
>Still, it seems like having a general, cross-application field or shared
>memory mechanism would be useful for this type of application where one
>app (e.g., my Web UI) spawns tasks that do the work and need a way to
>dynamically communicate within the scope of a single long-running
>transaction. At least that's the way I would go about building this type
>of application in a different environment.
>
>Cheers,
>
>E.
>--
>Eliot Kimber
>http://contrext.com
> 
>
>On 12/7/17, 10:48 AM, "general-boun...@developer.marklogic.com on behalf
>of Eliot Kimber" <ekim...@contrext.com> wrote:
>
>I don't think server fields are going to work because they are per
>application server and I have different application servers at work.
>
>There is an HTTP server that gets the pause/resume request and then
>spawned tasks running on the TaskServer that need to read the field.
>
>My experiments show that, per the docs, a field changed by one app is
>not seen by a different app.
>
>Cheers,
>
>Eliot
>--
>Eliot Kimber
>http://contrext.com
> 
>
>On 12/7/17, 10:13 AM, "general-boun...@developer.marklogic.com on
>behalf of Eliot Kimber" <ekim...@contrext.com> wrote:
>
>I had not considered server fields--I'll check it out.
>
>Cheers,
>
>E.
>
>--
>Eliot Kimber
>http://contrext.com
> 
>
>On 12/7/17, 10:11 AM, "general-boun...@developer.marklogic.com on
>behalf of Erik Hennum" <erik.hen...@marklogic.com> wrote:
>
>Hi, Eliot:
>
>Have you considered a server field -- where any code that
>changes the status also updates the server field and the iterator checks
>the server field?
>
>The server fields are local to the host, so there's no
>concern about a separate iterator running on a different host.
>
>If multiple iterators run on the same host, each would need
>to distinguish its status by an id, which the iterator could generate
>from a random id when it starts.
>
>
>Hoping that helps,
>
>
>Erik Hennum
>
>
>
>
>From: general-boun...@developer.marklogic.com
> on behalf of Eliot Kimber
>
>Sent: Thursday, December 7, 2017 7:48:44 AM
>To: MarkLogic Developer Discussion
>Subject: [MarkLogic Dev General] Best Approach to Manage
>"Flags" That Might Change Within a Single Transaction
>
>In the context of my remote processing management system,
>where my client server is sending many tasks to a set of remote servers
>through a set of spawned tasks running in parallel, I need to be able to
>pause the client so that it stops sending new tasks to the remote servers.
>
>So far I've been using a single document stored in ML as my
>mechanism for indicating that a job is in progress and capturing the job
>details (job ID, start time, servers in use, etc.). This works fine
>because it was only updated at the start and end of the job.
>
>But for the pause/resume use case I need to have a flag that
>indicates that the job is paused and have other processes (e.g., my
>task-submission code) immediately respond to a change. For example, if
>I'm looping over 100 tasks to load up a remote task queue and the job is
>paused, I want that loop to end immediately.
>
>So basically, in this loop, for every iteration, check the
>"is paused" status, which requires reading the job doc to see if a
>@paused attribute is present (the @paused attribute captures the time the
>pause was requested and serves as the "is paused" flag). 

Re: [MarkLogic Dev General] How to query metadata

2017-12-07 Thread Geert Josten
Hi Florent,

Hidden metadata is part of the document fragment, but not part of ‘full-text’. 
If you want to do things with it, you need to add a metadata-field for each of 
them, and then you can also use a field-range index for range queries on them.

So yes, it is a bit like a non-intrusive envelope (it works on binaries too) 
that does not require a separate fragment (as doc properties do).
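
Once such a field exists, the search from the original question could look roughly like this. The sketch assumes a metadata field named "foo" has already been defined on the database for the metadata key "foo", with word searches enabled on it.

```xquery
xquery version "1.0-ml";

(: Assumes a metadata field named "foo" is already configured on the
   database; without it the query has nothing to resolve against. :)
cts:search(fn:doc(), cts:field-word-query("foo", "bar"))
```

For range comparisons on the same value, a field range index plus cts:field-range-query would be the corresponding route.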

Cheers

From: on behalf of Florent Georges
Reply-To: MarkLogic Developer Discussion
Date: Thursday, December 7, 2017 at 2:38 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] How to query metadata

Hi Erik,

Sorry, I don't have access to this email box from my client's, so it's not easy 
to follow discussions here.  But I saw your response and it was indeed the 
solution.

It seems I missed the point of the new metadata then, since I thought it was a 
way to attach indexed values to a document.

Or is it more like a non-intrusive envelope pattern (it does not modify the 
content itself), not requiring new fragments (like doc properties), but having 
to be indexed one by one explicitly?

Anyway, thank you for your answer!

Regards,

--
Florent Georges
H2O Consulting
http://h2o.consulting/


On 4 December 2017 at 19:01, Erik Hennum wrote:
Hi, Florent:

Document metadata values are not indexed by default.  First, define a field for 
the metadata value that you want to search:

http://docs.marklogic.com/guide/search-dev/query-options#id_50523


Hoping that helps,


Erik Hennum



From: general-boun...@developer.marklogic.com on behalf of Florent Georges
Sent: Monday, December 4, 2017 9:45:11 AM
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] How to query metadata

Hi,

I have documents with metadata (the metadata values you can retrieve with 
xdmp:document-get-metadata()).

I would like to search documents with a metadata with a specific value, but I 
cannot find any CTS query function for that purpose.  Something like:

cts:search(/, cts:metadata('foo', 'bar'))

Have I missed something?

Regards,

--
Florent Georges
http://fgeorges.org/
http://h2o.consulting/ - New website!





Re: [MarkLogic Dev General] Bug in XSLT and XQuery Reference Guide

2017-11-29 Thread Geert Josten
Thanks, I filed a bug and will add this as a suggested extra example.
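
For reference, a small self-contained sketch of an order-by clause with two sort expressions. The three maps and their values are made up for illustration; only the 'active' and 'queued' keys come from the code in this thread.

```xquery
xquery version "1.0-ml";

(: Hypothetical sample data: three maps with 'active' and 'queued' counts. :)
let $details := (
  map:new((map:entry("name", "a"), map:entry("active", 1), map:entry("queued", 5))),
  map:new((map:entry("name", "b"), map:entry("active", 2), map:entry("queued", 1))),
  map:new((map:entry("name", "c"), map:entry("active", 1), map:entry("queued", 2)))
)
for $map in $details
(: One "order by" keyword, followed by a comma-separated list of keys. :)
order by map:get($map, "active") ascending,
         map:get($map, "queued") ascending
return map:get($map, "name")
(: returns "c", "a", "b" :)
```

The second key only decides the order among items that tie on the first one, which is why "c" (active 1, queued 2) comes before "a" (active 1, queued 5).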

On 11/29/17, 7:18 PM, "general-boun...@developer.marklogic.com on behalf
of Eliot Kimber" <general-boun...@developer.marklogic.com on behalf of
ekim...@contrext.com> wrote:

>Here’s the code I have:
>
>for $map in $details
>order by map:get($map, 'active') ascending,
> map:get($map, 'queued') ascending
>return $map
>
>Cheers,
>
>E.
>
>--
>Eliot Kimber
>http://contrext.com
> 
>
>
>On 11/29/17, 11:46 AM, "general-boun...@developer.marklogic.com on behalf
>of Geert Josten" <general-boun...@developer.marklogic.com on behalf of
>geert.jos...@marklogic.com> wrote:
>
>Thanks, looks like you are right.
>
>Can you elaborate on the multiple expressions?
>
>Cheers,
>Geert
>
>On 11/29/17, 5:30 PM, "general-boun...@developer.marklogic.com on behalf
>of Eliot Kimber" <general-boun...@developer.marklogic.com on behalf of
>ekim...@contrext.com> wrote:
>
>>I didn't see a place to submit comments in the guide like you can in the
>>reference topics, so I'm posting here.
>>
>>In http://docs.marklogic.com/guide/xquery/langoverview#id_11626, in the
>>section on the order-by clause, the syntax diagram shows the repeat
>>returning to before the "order by" keyword.
>>
>>The correct syntax should have the repeat returning *after* the "order
>>by" keyword and before the $varExpr.
>>
>>That is, order by is:
>>
>>order by expression1, expression2
>>
>>not order by expression1, order by expression2
>>
>>I also didn't see any examples of order-by clauses with multiple
>>expressions; that would be useful to have.
>>
>>Cheers,
>>
>>E.
>>
>>--
>>Eliot Kimber
>>http://contrext.com
>> 
>>
>>
>>


Re: [MarkLogic Dev General] Bug in XSLT and XQuery Reference Guide

2017-11-29 Thread Geert Josten
Thanks, looks like you are right.

Can you elaborate on the multiple expressions?

Cheers,
Geert

On 11/29/17, 5:30 PM, "general-boun...@developer.marklogic.com on behalf
of Eliot Kimber"  wrote:

>I didn't see a place to submit comments in the guide like you can in the
>reference topics, so I'm posting here.
>
>In http://docs.marklogic.com/guide/xquery/langoverview#id_11626, in the
>section on the order-by clause, the syntax diagram shows the repeat
>returning to before the "order by" keyword.
>
>The correct syntax should have the repeat returning *after* the "order
>by" keyword and before the $varExpr.
>
>That is, order by is:
>
>order by expression1, expression2
>
>not order by expression1, order by expression2
>
>I also didn't see any examples of order-by clauses with multiple
>expressions; that would be useful to have.
>
>Cheers,
>
>E.
>
>--
>Eliot Kimber
>http://contrext.com
> 
>
>
>


Re: [MarkLogic Dev General] Multi-Database Architecture

2017-11-29 Thread Geert Josten
We usually look at sizing questions from a timing and load perspective first. 
How many queries per second, on average and at peak, and how many inserts per 
second, on average and at peak?

With a given sample set, you can often get estimates of read/write IO, which is 
one of the biggest bottlenecks in most cases, particularly for inserts. The 
expected IO bandwidth versus the available IO bandwidth per host typically 
gives an indication of how many hosts you need to reach the ingest speed you 
are after.

Querying, however, should be less IO bound, because ideally you try to run from 
indexes as much as possible. More forests help speed up querying because index 
lookups can be parallelized. The number of forests is linked to the number of 
cores though, like you suggest. It is not a one-to-one relation, though. A 
rough rule of thumb is 1 or 2 cores per forest: 1 if it is mostly querying or 
inserting only, 2 if both happen at the same time a lot.

That is for bigger forests though. You can probably push it a bit if the 
forests are tiny, and/or used only to a limited extent during the day. I think 
I currently have almost 150 forests on my 16-core laptop, 3 to 5 for each demo 
that I happen to have installed. That only works because I rarely use more than 
one demo at the same time.

In the end I think IO bandwidth is more important than the number of forests. 
Also keep in mind that scaling up and down is relatively easy with MarkLogic. 
If you start collecting performance metrics, you should get a good feel for how 
your system would hold up if you start increasing the load.
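
To make the rule of thumb concrete, here is a small worked sketch. The function name and the 16-core host are illustrative only, and the result is a starting point for measurement, not a hard limit.

```xquery
xquery version "1.0-ml";

(: Rough upper bound on busy forests per host: 1 core per forest for
   mostly-query or mostly-insert loads, 2 cores per forest for mixed loads. :)
declare function local:forests-per-host(
  $cores as xs:integer,
  $mixed-load as xs:boolean
) as xs:integer {
  let $cores-per-forest := if ($mixed-load) then 2 else 1
  return $cores idiv $cores-per-forest
};

(
  local:forests-per-host(16, fn:false()),  (: 16 :)
  local:forests-per-host(16, fn:true())    (: 8 :)
)
```

Tiny or rarely used forests can exceed these numbers, as the laptop example shows.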

Cheers,
Geert

From: <general-boun...@developer.marklogic.com> on behalf of Andreas Hubmer 
<andreas.hub...@ebcont.com>
Reply-To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Date: Wednesday, November 29, 2017 at 1:19 PM
To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Subject: Re: [MarkLogic Dev General] Multi-Database Architecture

> Actually, it is the other way around. MarkLogic prefers multiple forests over 
> a single forest...
> Don’t put too many forests on a single host though, or they will just compete 
> for resources.

Where would you draw the line between preferring many small forests and not 
creating too many forests on a host?
Would you use the expected forest size as an indicator (e.g. no forest < 1 GB)?
Or would you try to create no more forests than CPU cores / 2 per host?

Thanks,
Andreas


2017-11-28 12:38 GMT+01:00 Geert Josten <geert.jos...@marklogic.com>:
Actually, it is the other way around. MarkLogic prefers multiple forests over 
a single forest. Each forest has its own in-memory stand, and MarkLogic prefers 
multiple smaller ones over one big one. The idea is that it allows 
parallelizing the workload to resolve from indexes, and also being able to pull 
content from disk in parallel (particularly if multiple hosts, or 
disks/controllers are involved).

Don’t put too many forests on a single host though, or they will just compete 
for resources.

Also note that a forest is not the same as a database. Each database will have 
at least one forest, but could have many more, potentially spread out over 
multiple hosts. So, one big database, or multiple small ones could end up 
resulting in the same in-memory stand sizes. It all depends on how many forests 
each database has, and how much data is inside them.

Whether it makes most sense to use one shared db or multiple small ones is 
primarily a functional/business question. I’d add though, that I’d 
personally prefer built-in backup over MLCP for backups.

Cheers,
Geert

From: <general-boun...@developer.marklogic.com> on behalf of Andreas Hubmer 
<andreas.hub...@ebcont.com>
Reply-To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Date: Tuesday, November 28, 2017 at 10:59 AM
To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Subject: Re: [MarkLogic Dev General] Multi-Database Architecture

Hi,

The clients are different services in a larger micro-service landscape. Some of 
them will store small amounts of data (less than 1GB, maybe even less than 
100MB), others large amounts.
The services with small amounts of data make me worry about efficient usage of 
memory and in-memory-stands. If they share a database, the shared database 
could have larger in-memory stands (in contrast to many small in-memory stands 
of the individual databases). I assume that larger in-memory stands perform 
much better in peak moments?! Additionally, it is easier to tune 

Re: [MarkLogic Dev General] Cannot install 9.3.1 on CentOS 7

2017-11-28 Thread Geert Josten
[vagrant@ml9-ml1 ~]$ yum list glibc
Loaded plugins: fastestmirror
Repodata is over 2 weeks old. Install yum-cron? Or run: yum makecache fast
Loading mirror speeds from cached hostfile
 * base: centos.mirror.triple-it.nl
 * epel: ftp.nluug.nl
 * extras: centos.mirror.transip.nl
 * updates: mirror.ams1.nl.leaseweb.net
Installed Packages
glibc.i686      2.17-106.el7_2.8      @updates
glibc.x86_64    2.17-106.el7_2.8      @updates

The ml9-ml1 box is based on a CentOS 7.2 base box.

Try running the yum install line I provided below, and see if that helps.

Cheers,
Geert

From: <general-boun...@developer.marklogic.com> on behalf of Florent Georges 
<li...@fgeorges.org>
Reply-To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Date: Tuesday, November 28, 2017 at 10:14 PM
To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Subject: Re: [MarkLogic Dev General] Cannot install 9.3.1 on CentOS 7

Hi Geert,

Thank you.  When I try to install it, it tells me that...

Package glibc-2.17-196.el7.x86_64 already installed and latest version

Do you have a version above 2.14, or is it exactly 2.14?

Regards,

--
Florent Georges
H2O Consulting
http://h2o.consulting/


On 28 November 2017 at 06:42, Geert Josten wrote:
Hi Florent,

I think you need glibc.x86_64 as well. I use this in mlvagrant:

yum -y install glibc.i686 gdb.x86_64 redhat-lsb.x86_64 cyrus-sasl cyrus-sasl-lib cyrus-sasl-md5

Cheers,
Geert

From: <general-boun...@developer.marklogic.com> on behalf of Florent Georges 
<li...@fgeorges.org>
Reply-To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Date: Tuesday, November 28, 2017 at 1:13 AM
To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Subject: [MarkLogic Dev General] Cannot install 9.3.1 on CentOS 7

Hi,

I've just tried to install the latest version, 9.3.1, on CentOS 7.  A 
dependency seems to be broken.  I used CentOS-7-x86_64-Minimal-1708.  Then I 
did (all details at http://h2o.consulting/blog/vbox-marklogic-centos-7):

# yum update
# yum install gcc kernel-devel
# yum groupinstall 'Development Tools'
# yum install gdb glibc glibc.i686 lsb cyrus-sasl

When I try to install the MarkLogic RPM, I get the following error:

# rpm -i MarkLogic-9.0-3.1.x86_64.rpm
error: Failed dependencies:
lsb-core-amd64 is needed by MarkLogic-9.0-3.1.x86_64
libc.so.6(GLIBC_2.14) is needed by MarkLogic-9.0-3.1.x86_64

I can fix the first one with:

# yum install lsb-core-amd64

For the second one, I do have glibc installed.  But it is version 2.17 (as 
shown by "yum info glibc").  Is it possible the dependency on glibc is too 
restrictive (to require 2.14 exactly and not accept 2.17)?

Has anyone experienced this before?

Regards,

--
Florent Georges
H2O Consulting
http://h2o.consulting/





Re: [MarkLogic Dev General] Multi-Database Architecture

2017-11-28 Thread Geert Josten
Actually, it is the other way around. MarkLogic prefers multiple forests over 
a single forest. Each forest has its own in-memory stand, and MarkLogic prefers 
multiple smaller ones over one big one. The idea is that it allows 
parallelizing the workload to resolve from indexes, and also being able to pull 
content from disk in parallel (particularly if multiple hosts, or 
disks/controllers are involved).

Don’t put too many forests on a single host though, or they will just compete 
for resources.

Also note that a forest is not the same as a database. Each database will have 
at least one forest, but could have many more, potentially spread out over 
multiple hosts. So, one big database, or multiple small ones could end up 
resulting in the same in-memory stand sizes. It all depends on how many forests 
each database has, and how much data is inside them.

Whether it makes most sense to use one shared db or multiple small ones is 
primarily a functional/business question. I’d add though, that I’d 
personally prefer built-in backup over MLCP for backups.

Cheers,
Geert

From: <general-boun...@developer.marklogic.com> on behalf of Andreas Hubmer 
<andreas.hub...@ebcont.com>
Reply-To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Date: Tuesday, November 28, 2017 at 10:59 AM
To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Subject: Re: [MarkLogic Dev General] Multi-Database Architecture

Hi,

The clients are different services in a larger micro-service landscape. Some of 
them will store small amounts of data (less than 1GB, maybe even less than 
100MB), others large amounts.
The services with small amounts of data make me worry about efficient usage of 
memory and in-memory-stands. If they share a database, the shared database 
could have larger in-memory stands (in contrast to many small in-memory stands 
of the individual databases). I assume that larger in-memory stands perform 
much better in peak moments?! Additionally, it is easier to tune the 
configuration of one database vs. to tune the configuration of many databases.

On the other hand, we want to have an easy backup & restore process. Do you 
have any suggestions or experience on how this could be done in a shared 
database on a directory level?
The backup could be done with the MLCP (export, point-in-time). The restore 
with MLCP would be a step-process: remove all content from the directory, then 
import the backup. This is not as straight-forward as the builtin backup 
features.

Security, SLAs and data sharing are relevant topics which I feel comfortable 
with.
Maybe we'll go with a mix of shared and individual databases, even though this 
means a more complex architecture.

Thanks,
Andreas


2017-11-23 21:18 GMT+01:00 David Gorbet <david.gor...@marklogic.com>:
If these are completely separate use cases please consider completely separate 
clusters. You can use virtualization to make the hardware work out.

On Nov 23, 2017, at 12:04 PM, Geert Josten <geert.jos...@marklogic.com> wrote:

Hi Andreas,

I think each forest has its own in-memory stand, so if each client has a 
reasonable amount of data, you’ll need several forests per client anyhow. One 
or multiple databases wouldn’t matter much in that case. I wouldn’t worry too 
much about in-memory stands though. Memory is much faster than disk, so worth 
using. And you’ll want spare resources anyhow to handle peak moments, so not 
fully utilizing resources all the time isn’t necessarily bad. An average use of 
30% of CPU and memory is pretty typical, I’d say.

I would suggest looking at it more from a business or functional perspective. 
For instance:

  *   Do you need to guarantee clients won’t be able to see each other’s data? 
That would be a strong argument for keeping things separate, without doubt.
  *   Could different clients have different SLA terms? Another vote for 
keeping things separate.
  *   What if one client wants to step out, and you need to purge its data? 
Dead simple with separate databases.
  *   Is there any chance one of the clients would like to run it on-site, 
rather than hosted?
  *   Or for the opposite: would there be any need to mix datasets from 
different clients? Any kind of sharing for instance, even if just of 
statistics, or some anonymous cross-validation?

And you can probably think of many more yourself.

Cheers,
Geert

From: <general-boun...@developer.marklogic.com> on behalf of Andreas Hubmer 
<andreas.hub...@ebcont.com>
Reply-To: MarkLogic Developer Discussion 
<general@develo

Re: [MarkLogic Dev General] Cannot install 9.3.1 on CentOS 7

2017-11-27 Thread Geert Josten
Hi Florent,

I think you need glibc.x86_64 as well. I use this in mlvagrant:

yum -y install glibc.i686 gdb.x86_64 redhat-lsb.x86_64 cyrus-sasl cyrus-sasl-lib cyrus-sasl-md5

Cheers,
Geert

From: on behalf of Florent Georges
Reply-To: MarkLogic Developer Discussion
Date: Tuesday, November 28, 2017 at 1:13 AM
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] Cannot install 9.3.1 on CentOS 7

Hi,

I've just tried to install the latest version, 9.3.1, on CentOS 7.  A 
dependency seems to be broken.  I used CentOS-7-x86_64-Minimal-1708.  Then I 
did (all details at http://h2o.consulting/blog/vbox-marklogic-centos-7):

# yum update
# yum install gcc kernel-devel
# yum groupinstall 'Development Tools'
# yum install gdb glibc glibc.i686 lsb cyrus-sasl

When I try to install the MarkLogic RPM, I get the following error:

# rpm -i MarkLogic-9.0-3.1.x86_64.rpm
error: Failed dependencies:
lsb-core-amd64 is needed by MarkLogic-9.0-3.1.x86_64
libc.so.6(GLIBC_2.14) is needed by MarkLogic-9.0-3.1.x86_64

I can fix the first one with:

# yum install lsb-core-amd64

For the second one, I do have glibc installed.  But it is version 2.17 (as 
shown by "yum info glibc").  Is it possible the dependency on glibc is too 
restrictive (to require 2.14 exactly and not accept 2.17)?

Has anyone experienced this before?

Regards,

--
Florent Georges
H2O Consulting
http://h2o.consulting/


___
General mailing list
General@developer.marklogic.com
Manage your subscription at: 
http://developer.marklogic.com/mailman/listinfo/general


Re: [MarkLogic Dev General] How to Do Equivalent of While true() Loop In ML?

2017-11-27 Thread Geert Josten
I think ML does not allow endlessly re-spawning a task. That is probably linked 
to trigger depth, and is there to prevent things from running wild. I would 
definitely recommend running a schedule. I had to do it that way years ago when 
I was playing around with a custom queue mechanism: https://github.com/grtjn/ml-queue
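
As a rough illustration of the scheduled approach, a module run by a scheduled task could handle one bounded batch per run instead of looping forever. This is only a sketch: the "pending-tasks" collection, the batch size, and local:submit-task are hypothetical names, not something from the thread or from ml-queue.

```xquery
xquery version "1.0-ml";

(: Sketch of a module meant to be run by a scheduled task.
   The "pending-tasks" collection and local:submit-task are hypothetical. :)
declare variable $BATCH-SIZE as xs:integer := 100;

declare function local:submit-task($task as element()) as empty-sequence()
{
  (: Placeholder: a real implementation would hand the task to a server. :)
  xdmp:log(fn:concat("submitting ", xdmp:node-uri($task)))
};

(: Each scheduled run handles one bounded batch, so no endless loop is needed. :)
for $task in fn:subsequence(fn:collection("pending-tasks")/*, 1, $BATCH-SIZE)
return local:submit-task($task)
```

Because every run starts fresh, there is no growing call stack and no long-lived transaction to worry about.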

Cheers

From: on behalf of William Sawyer
Reply-To: MarkLogic Developer Discussion
Date: Monday, November 27, 2017 at 5:59 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] How to Do Equivalent of While true() Loop 
In ML?

You could recursively spawn, or set up a scheduled task to run every minute or 
faster if needed.

-Will

On Mon, Nov 27, 2017 at 9:56 AM, Eliot Kimber wrote:
I have a client-server system where the client is spawning 100s of 1000s of 
jobs on the client. The client polls the servers to see when each server’s task 
queue is ready for more jobs. This all works fine.

Logically this polling is a while-true() loop that will continue until either 
all the servers are offline or all the tasks to be submitted are consumed.

In a procedural language this is trivial, but in XQuery 2 I’m not finding a way 
to do it that works. In XQuery 3 I could use the new iterate operator but that 
doesn’t seem to be available in MarkLogic 9.

My first attempt was to use a recursive process, relying on tail recursion 
optimization to avoid blowing the stack buffer. That worked logically but I 
still ran into out-of-memory on the server at some point (around 200K jobs 
submitted) and it seems likely that it was runaway recursion doing it.

So I tried using a simple loop with xdmp:set() to iterate over the tasks and 
use an exception to break out when all the tasks are done:

try {
  for $i in 1 to 100 (: i.e., loop forever :)
  return (
    if (empty($tasks))
    then error()
    else submit-task(head($tasks)),
    xdmp:set($tasks, tail($tasks))
  )
} catch ($e) {
  (: We’re done. :)
  ()
}

Is there a better way to do this kind of looping forever?

I’m also having a very strange behavior where in my new looping code I’m 
getting what I think must be a pending commit deadlock that I didn’t get in my 
recursive version of the code. I can trace the code to the xdmp:eval() that 
would commit an update to the task and that code never returns.

Each task is a document that I update to reflect the details of the task’s 
status (start and end times, current processing status, etc.). Those updates 
are all done either in separately-run modules or via xdmp:eval(), so as far as 
I can tell there shouldn’t be any issues with uncommitted updates. I didn’t 
change anything in the logic that updates the task documents, only the loop 
that iterates over the tasks.

Could it be that the use of xdmp:set() to modify the $tasks variable (a 
sequence of  elements) would be causing some kind of commit lock?

Thanks,

Eliot

--
Eliot Kimber
http://contrext.com




___
General mailing list
General@developer.marklogic.com
Manage your subscription at:
http://developer.marklogic.com/mailman/listinfo/general

___
General mailing list
General@developer.marklogic.com
Manage your subscription at: 
http://developer.marklogic.com/mailman/listinfo/general


Re: [MarkLogic Dev General] Multi-Database Architecture

2017-11-23 Thread Geert Josten
Hi Andreas,

I think each forest has its own in-memory stand, so if each client has a 
reasonable amount of data, you’ll need several forests per client anyhow. One 
or multiple databases wouldn’t matter much in that case. I wouldn’t worry too 
much about in-memory stands though. Memory is much faster than disk, so worth 
using. And you’ll want spare resources anyhow to handle peak moments, so not 
fully utilizing resources all the time isn’t bad necessarily. An average use of 
30% of cpu and mem is pretty typical I’d say.

I would suggest looking at it more from a business or functional perspective. 
For instance:

  *   Do you need to guarantee clients won’t be able to see each other’s data? 
That would be a strong argument to want to keep things separate without doubt.
  *   Could different clients have different SLA terms? Another vote for 
keeping things separate.
  *   What if one client wants to step out, and you need to purge its data? 
Dead simple with separate databases.
  *   Is there any chance one of the clients would like to run it on-site, 
rather than hosted?
  *   Or for the opposite: would there be any need to mix datasets from 
different clients? Any kind of sharing for instance, even if just of 
statistics, or some anonymous cross-validation?

And you can probably think of many more yourself.
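
To make the purge point concrete: with separate databases you simply drop the client's database, while in a shared database you would have to delete by some grouping. A minimal sketch for the shared case, assuming each client's documents live under their own base directory (the URI below is a hypothetical example):

```xquery
xquery version "1.0-ml";

(: Hypothetical base directory for one client in a shared database.
   Deletes that directory together with its descendant documents. :)
xdmp:directory-delete("/clients/client-a/")
```

On a large dataset this would likely need to be batched, which is part of why the separate-database route is simpler here.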

Cheers,
Geert

From: on behalf of Andreas Hubmer
Reply-To: MarkLogic Developer Discussion
Date: Thursday, November 23, 2017 at 4:53 PM
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] Multi-Database Architecture

Hi,

I am planning the architecture of an application with dozens of individual 
clients. I think of using either one database for all data or a separate 
database per client.

The main pros and cons for me are efficient memory usage and the possibility of 
individual backup. I tend to prefer the first and accept more 
complicated restore scenarios.

These are my considerations.

one-db:
* each client would use a different base directory (security: uri-privileges)
* 1 in-memory-stand -> more efficient memory usage. Do you agree that this is 
relevant?
* individual backup & restore of data of one client => complicated (MLCP?)

many-dbs (one db per client):
* many in-memory-stands -> less efficient memory usage / more smaller stands / 
more merging. Do you agree?
* builtin backup & restore of data of one client is possible
* very flexible configuration (individual indexes, ...)
* deployment more complex

For configuration we will use Roxy.

Thanks,
Andreas

--
Andreas Hubmer
Senior IT Consultant

EBCONT enterprise technologies GmbH
Millennium Tower
Handelskai 94-96
A-1200 Vienna

Mobile: +43 664 60651861
Fax: +43 2772 512 69-9
Email: andreas.hub...@ebcont.com
Web: http://www.ebcont.com

OUR TEAM IS YOUR SUCCESS

UID-Nr. ATU68135644
HG St.Pölten - FN 399978 d

CONFIDENTIALITY/DISCLAIMER:
This email and any files transmitted with it are confidential, are legally 
protected before publication and are intended solely for the use of the 
individual or entity to whom they are addressed. If you have received this 
email in error, please notify the sender immediately and destroy this e-mail 
together with all attachments. The content of this e-mail may not be 
disseminated, published, copied or stored on third parties. We assume no 
liability for any damage that may result from this e-mail or its annexes.
___
General mailing list
General@developer.marklogic.com
Manage your subscription at: 
http://developer.marklogic.com/mailman/listinfo/general


Re: [MarkLogic Dev General] Spawned Task Appears to Block Other Threads

2017-11-09 Thread Geert Josten
Hi Eliot,

I think you kicked off your watcher job with an HTTP request, and it keeps
the port open until it finishes. Only one thread can use the port at the
same time. Use a different port for task response traffic, or consider
running your watcher as a scheduled task.

Not super robust, and probably not used in production, but I did write an
alternative queue for MarkLogic. It might give you some ideas..

https://github.com/grtjn/ml-queue


Cheers,
Geert

On 11/10/17, 1:06 AM, "general-boun...@developer.marklogic.com on behalf
of Eliot Kimber"  wrote:

>I have a system where I have a “client” ML server that submits jobs to a
>set of remote ML servers, checking their task queues and keeping each
>server’s queue at a max of 100 queued items (the remote servers could go
>away without notice so the client needs to be able to restart tasks and
>not have too many things queued up that would just have to be resubmitted).
>
>The remote tasks then talk back to the client to report status and return
>their final results.
>
>My job submission code uses recursive functions to iterate over the set of
>tasks to be submitted, checking for free remote queue slots via the ML
>REST API and submitting jobs as the queues empty. This code is spawned
>into a separate task in the task server. It uses xdmp:sleep(1000) to
>pause between checking the job queues.
>
>This all works fine, in that my jobs are submitted correctly and the
>remote queues fill up.
>
>However, as long as the job-submission task in the task server is
>running, the HTTP app that handles the REST calls from the remote servers
>is blocked (which blocks the remote jobs, which are of course waiting for
>responses from the client).
>
>If I kill the task server task, then the remote responses are handled as
>I would expect.
>
>My question: Why would the task server task block the other app? There
>must be something I’m doing or not doing but I have no idea what it might
>be.
>
>Thanks,
>
>Eliot
>--
>Eliot Kimber
>http://contrext.com
> 
>
>
>
>___
>General mailing list
>General@developer.marklogic.com
>Manage your subscription at:
>http://developer.marklogic.com/mailman/listinfo/general

___
General mailing list
General@developer.marklogic.com
Manage your subscription at: 
http://developer.marklogic.com/mailman/listinfo/general


Re: [MarkLogic Dev General] Full-text search and JSON

2017-11-07 Thread Geert Josten
Well, you could give your json doc a root property.. :)

{ "root": { "id": 1234, "text": "brown fox" } }
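
If restructuring the document is not an option, a property-scoped query is another route. A minimal sketch, assuming a JSON document with a "text" property like the /test.json example from this thread:

```xquery
xquery version "1.0-ml";

(: Assumes a JSON document with a "text" property, like /test.json in this thread. :)
cts:search(fn:doc(), cts:json-property-word-query("text", "brown fox"))
```

This does require knowing the property name up front, so it does not replace a general word query.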

Cheers

On 11/8/17, 4:03 AM, "general-boun...@developer.marklogic.com on behalf of
Will Thompson"  wrote:

>Hi Rob,
>
>Likewise! I think I just figured it out. I am excluding root in the word
>query index settings. The root of the document I was searching for is an
>object-node though, so I suspect there's no way to include it.
>
>-Will
>
>
>> On Nov 7, 2017, at 8:36 PM, Rob Szkutak 
>>wrote:
>> 
>> Hi Will,
>> 
>> I hope you are doing well. It's nice to see your name pop up.
>> 
>> I tested out your example on a new database in ML 9.0-3 and it worked
>>just fine for me. Can you try to fn:doc() the document and make sure you
>>can see it? If you can, check to make sure your document is not a binary
>>node with xdmp:node-kind(fn:doc("/test.json")/node()) .
>> 
>> Best,
>> Rob
>> 
>> Rob Szkutak
>> Senior Consultant
>> MarkLogic Corporation
>> www.marklogic.com
>> 
>> From: general-boun...@developer.marklogic.com
>> on behalf of Will Thompson
>>
>> Sent: Tuesday, November 7, 2017 5:36:31 PM
>> To: MarkLogic Developer Discussion
>> Subject: [MarkLogic Dev General] Full-text search and JSON
>>  
>> Is it possible to search generally against text tokens in JSON
>>documents? All of the JSON-specific cts:queries require property names,
>>and cts:word-query doesn't appear to match JSON documents. For example,
>>if I have a document with URI "/test.json":
>> 
>> {
>>   "id" : 1234,
>>   "text" : "The quick brown fox jumps over the lazy dog."
>> }
>> 
>> cts:search(doc(), "brown fox") returns empty. Is there another way to
>>do this?
>> 
>> -Will
>> ___
>> General mailing list
>> General@developer.marklogic.com
>> Manage your subscription at:
>> http://developer.marklogic.com/mailman/listinfo/general
>> ___
>> General mailing list
>> General@developer.marklogic.com
>> Manage your subscription at:
>> http://developer.marklogic.com/mailman/listinfo/general
>
>___
>General mailing list
>General@developer.marklogic.com
>Manage your subscription at:
>http://developer.marklogic.com/mailman/listinfo/general

___
General mailing list
General@developer.marklogic.com
Manage your subscription at: 
http://developer.marklogic.com/mailman/listinfo/general


Re: [MarkLogic Dev General] Document access based on field value

2017-11-07 Thread Geert Josten
Hi Richard,

It is usually easiest to build up a few layers of roles. Most flexibility is 
gained when you create separate read and update roles for each group of 
documents to which you want to control access separately. You can then use role 
inheritance to give a user or usergroup-specific role access to particular 
groups of documents. You could also create one that has access to all.

To save on cross-products of roles, I’d also advise looking into compartment 
security. That allows restricting access to combinations of roles, a bit like AND 
(compartments) versus OR (default behavior)..
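
As a small sketch of the separate read and update roles idea: each document in a group gets both permissions on insert, and users gain access by inheriting one or both roles. The role names and URI here are hypothetical, and both roles are assumed to exist already.

```xquery
xquery version "1.0-ml";

(: Hypothetical roles "group-a-read" and "group-a-update"; both must already exist. :)
xdmp:document-insert(
  "/group-a/example.xml",
  <example>content</example>,
  (
    xdmp:permission("group-a-read", "read"),
    xdmp:permission("group-a-update", "update")
  )
)
```

A "power user" role could then inherit the read and update roles of every group without being admin.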

Cheers,
Geert

From: on behalf of Shmennen
Reply-To: "shmen...@yahoo.com", MarkLogic Developer Discussion
Date: Tuesday, November 7, 2017 at 9:57 PM
To: Rob Szkutak, MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Document access based on field value

Thanks, it looks good!

Btw, another question, maybe not related: is there any way to assign some 
capabilities (e.g. insert, update, execute) to a user who can access all docs, 
no matter what roles and privileges they have in the db?
E.g. some power user who has access (read/write) to all docs, independent of 
the users who inserted them, but who is not admin.

Regards
Richard W.


On Tue, Nov 7, 2017 at 19:20, Rob Szkutak wrote:

Hello,


One solution to implement this is to use amplified functions (amps).


The basic idea is this:


* Restrict the document so that the user cannot read or update it.

* Create a function which the user must use to read or update the document.

* Amplify the function so that the user can read or modify the document only 
within your function.

* Have your function perform the validation check and either perform the 
desired document operation or return the appropriate invalid document response 
to the user.



Another solution is that every time a document is inserted or updated, you 
could perform a check if the document is valid or not and assign the 
appropriate role to it when the document is placed into the database.


Something like :

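let $valid := fn:true() (: or fn:false(), depending on your validity check :)

return

 (: role names are placeholders :)
 xdmp:document-insert("uri", $document, if ($valid) then 
xdmp:permission("role-the-user-has", "read") else 
xdmp:permission("role-the-user-lacks", "read"))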



If required you may also combine these two techniques.


Hope this is helpful.


Best,

Rob


Rob Szkutak
Senior Consultant
MarkLogic Corporation
www.marklogic.com


From: general-boun...@developer.marklogic.com on behalf of Shmennen
Sent: Tuesday, November 7, 2017 10:54:40 AM
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] Document access based on field value

Hello All,

   Is there any possibility to get access to a document (suppose an XML or 
JSON) from the database only if the value of a tag has a specific value?

E.g. user1 can read/modify the document only if the check tag has value "VALID".

999
VALID


- Richard
___
General mailing list
General@developer.marklogic.com
Manage your subscription at: 
http://developer.marklogic.com/mailman/listinfo/general


Re: [MarkLogic Dev General] Set of elements to search for search:search()

2017-10-24 Thread Geert Josten
In that case, have a look at cts:parse. Much faster than search:parse 
(allegedly), and much more flexible. I have used it for various purposes, even 
ones that don’t result in a cts:query.. ;-)
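
A minimal sketch of what that looks like, using the search string from earlier in this thread (no bindings are passed here, so plain terms simply become word queries):

```xquery
xquery version "1.0-ml";

(: Parse a user-entered search string into a cts:query, then run it. :)
let $query := cts:parse('this AND that')
return cts:search(fn:doc(), $query)
```

The parsed query is an ordinary cts:query, so it can be inspected or combined with other queries before searching.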

Cheers

From: 
<general-boun...@developer.marklogic.com<mailto:general-boun...@developer.marklogic.com>>
 on behalf of Florent Georges <li...@fgeorges.org<mailto:li...@fgeorges.org>>
Reply-To: MarkLogic Developer Discussion 
<general@developer.marklogic.com<mailto:general@developer.marklogic.com>>
Date: Tuesday, October 24, 2017 at 11:10 AM
To: MarkLogic Developer Discussion 
<general@developer.marklogic.com<mailto:general@developer.marklogic.com>>
Subject: Re: [MarkLogic Dev General] Set of elements to search for 
search:search()

Well, the whole point is that I need to add a "search grammar" to the search 
text field the user sends (supporting AND, OR, double quotes, etc.)  So if I 
have a structured query, I would simply use CTS.

The point is precisely to parse such a string.  And also to get the snippets. 
Hence the try to switch to the Search API.

But if it is not possible to configure several elements for the un-constraint 
terms, I guess the solution is rather to implement and parse my own grammar, 
and generating the snippets using CTS... :-(

Regards,

--
Florent Georges
H2O Consulting
http://h2o.consulting/


On 24 October 2017 at 09:06, Geert Josten wrote:
Hi Florent,

Have you considered using rest api’s capability to take a structured query, 
rather than relying on search options? That way you can send in complex custom 
adhoc queries, including those you are after.

Yes, you can give different weights in a field, but as you might guess, that is 
fixed too..

Cheers,
Geert

From: 
<general-boun...@developer.marklogic.com<mailto:general-boun...@developer.marklogic.com>>
 on behalf of Florent Georges <li...@fgeorges.org<mailto:li...@fgeorges.org>>
Reply-To: MarkLogic Developer Discussion 
<general@developer.marklogic.com<mailto:general@developer.marklogic.com>>
Date: Tuesday, October 24, 2017 at 8:12 AM
To: MarkLogic Developer Discussion 
<general@developer.marklogic.com<mailto:general@developer.marklogic.com>>
Subject: Re: [MarkLogic Dev General] Set of elements to search for 
search:search()

Hi Geert,

Thank you.  Unfortunately, that would require to create fields for any possible 
combination of elements.  The list of elements is computed algorithmically, and 
that would not be possible with fields.

Furthermore, I think that would not allow to give different weights to 
different elements either, would it?  Basically, I need to be able to say 
"element foo weight 10, element bar weight 5, etc."

So I guess my question is, is not there any way to ask search:search() to 
generate cts:and-query((cts:element-word-query(), cts:element-word-query())) 
instead of a simple cts:element-word-query(), for a simple term search string?

Regards,

--
Florent Georges
http://fgeorges.org/
http://h2o.consulting/ - New website!


On 23 October 2017 at 17:44, Geert Josten wrote:
I think I would use a field for this..

Cheers,
Geert

From: 
<general-boun...@developer.marklogic.com<mailto:general-boun...@developer.marklogic.com>>
 on behalf of Florent Georges <li...@fgeorges.org<mailto:li...@fgeorges.org>>
Reply-To: MarkLogic Developer Discussion 
<general@developer.marklogic.com<mailto:general@developer.marklogic.com>>
Date: Monday, October 23, 2017 at 4:42 PM
To: MarkLogic Developer Discussion 
<general@developer.marklogic.com<mailto:general@developer.marklogic.com>>
Subject: [MarkLogic Dev General] Set of elements to search for search:search()

Hi,

I am using the Search API, AKA search:search().  I need to restrict the set of 
elements to use for a search string with no specific constraint (e.g. "this AND 
that" as opposed to "this:that").

As a simplification, let's say I need to restrict the search to two elements, 
namely "foo" and "bar", in no namespace, with different weights.  I would have 
used the following, but "default" seems to accept only one "word":

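search:search(
   'this AND that',
   <options xmlns="http://marklogic.com/appservices/search">
      <term>
         <default>
            <word>
               <element ns="" name="foo"/>
               <weight>10.0</weight>
            </word>
            <word>
               <element ns="" name="bar"/>
               <weight>5.0</weight>
            </word>
         </default>
      </term>
   </options>)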

MarkLogic does not complain on this one, but only takes the first one into 
account (well, at least the results returned are as if it was).

I feel I am missing something obvious here.  How is it possible to restrict a 
full text search to a set of element names using search:search()?

Regards,

--
Florent Georges
http://fgeorges.org/
http://h2o.consulting/ - New website!



___
General mailing list
General@developer.marklogic.com<mai

Re: [MarkLogic Dev General] Set of elements to search for search:search()

2017-10-24 Thread Geert Josten
Hi Florent,

Have you considered using rest api’s capability to take a structured query, 
rather than relying on search options? That way you can send in complex custom 
adhoc queries, including those you are after.
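
For comparison, the kind of ad hoc query meant here can be hand-built in cts directly. This is only a sketch: the elements "foo" and "bar" and their weights are taken from the question in this thread, the search text is an arbitrary example, and no search grammar is applied.

```xquery
xquery version "1.0-ml";

(: Elements "foo" and "bar" and their weights come from the question in this thread;
   the search text is just an example. :)
let $text := "this"
return
  cts:search(fn:doc(),
    cts:or-query((
      cts:element-word-query(xs:QName("foo"), $text, (), 10.0),
      cts:element-word-query(xs:QName("bar"), $text, (), 5.0)
    )))
```

Since the list of elements is just a sequence, it can be computed at run time, which is the part that fixed fields cannot offer.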

Yes, you can give different weights in a field, but as you might guess, that is 
fixed too..

Cheers,
Geert

From: on behalf of Florent Georges
Reply-To: MarkLogic Developer Discussion
Date: Tuesday, October 24, 2017 at 8:12 AM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Set of elements to search for 
search:search()

Hi Geert,

Thank you.  Unfortunately, that would require to create fields for any possible 
combination of elements.  The list of elements is computed algorithmically, and 
that would not be possible with fields.

Furthermore, I think that would not allow to give different weights to 
different elements either, would it?  Basically, I need to be able to say 
"element foo weight 10, element bar weight 5, etc."

So I guess my question is, is not there any way to ask search:search() to 
generate cts:and-query((cts:element-word-query(), cts:element-word-query())) 
instead of a simple cts:element-word-query(), for a simple term search string?

Regards,

--
Florent Georges
H2O Consulting
http://h2o.consulting/


On 23 October 2017 at 17:44, Geert Josten wrote:
I think I would use a field for this..

Cheers,
Geert

From: on behalf of Florent Georges
Reply-To: MarkLogic Developer Discussion
Date: Monday, October 23, 2017 at 4:42 PM
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] Set of elements to search for search:search()

Hi,

I am using the Search API, AKA search:search().  I need to restrict the set of 
elements to use for a search string with no specific constraint (e.g. "this AND 
that" as opposed to "this:that").

As a simplification, let's say I need to restrict the search to two elements, 
namely "foo" and "bar", in no namespace, with different weights.  I would have 
used the following, but "default" seems to accept only one "word":

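search:search(
   'this AND that',
   <options xmlns="http://marklogic.com/appservices/search">
      <term>
         <default>
            <word>
               <element ns="" name="foo"/>
               <weight>10.0</weight>
            </word>
            <word>
               <element ns="" name="bar"/>
               <weight>5.0</weight>
            </word>
         </default>
      </term>
   </options>)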

MarkLogic does not complain on this one, but only takes the first one into 
account (well, at least the results returned are as if it was).

I feel I am missing something obvious here.  How is it possible to restrict a 
full text search to a set of element names using search:search()?

Regards,

--
Florent Georges
http://fgeorges.org/
http://h2o.consulting/ - New website!



___
General mailing list
General@developer.marklogic.com
Manage your subscription at:
http://developer.marklogic.com/mailman/listinfo/general







___
General mailing list
General@developer.marklogic.com
Manage your subscription at: 
http://developer.marklogic.com/mailman/listinfo/general


Re: [MarkLogic Dev General] Regarding Issue in setting element range index

2017-10-10 Thread Geert Josten
Hi Siva,

Make sure the reindexer has completed reindexing before adding the new index..

Cheers,
Geert

From: on behalf of "Mani, Sivasubramani (ELS)"
Reply-To: MarkLogic Developer Discussion
Date: Tuesday, October 10, 2017 at 9:20 AM
To: "general@developer.marklogic.com"
Cc: ConSyn-Infosys-Support
Subject: [MarkLogic Dev General] Regarding Issue in setting element range index

Hi Team,

I have deleted an elementRangeIndex called "OriginalReleaseYear" and tried to 
add it back with a different setting. The reindexer is enabled in the database 
and it triggered reindexing after the deletion of the element range index.

However, when I try to add this index back, it throws the error "two or more 
range element indexes are identical" in MarkLogic. Actually there is no 
other index with the same name.

Please help out on how to resolve this issue.


Thanks & Regards,
Siva

___
General mailing list
General@developer.marklogic.com
Manage your subscription at: 
http://developer.marklogic.com/mailman/listinfo/general


Re: [MarkLogic Dev General] How To Detect Task Time Limit Exceeded Failures?

2017-10-07 Thread Geert Josten
Hi Eliot,

I heard the other day that it should be possible to capture such timeouts
with a try catch within the code itself. That gives an extra 10 seconds
delay which might be sufficient to send out an alert email, or raise some
other flag. After those few extra seconds, the timeout gets rethrown if
you don’t finish in time..

Might be worth investigating?
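
A sketch of what that could look like, under two assumptions that are not confirmed in this thread: that the timeout surfaces inside the task as an error with code XDMP-EXTIME, and that there is enough time left to do something useful in the catch. local:do-work is a hypothetical stand-in for the existing processing.

```xquery
xquery version "1.0-ml";

declare namespace error = "http://marklogic.com/xdmp/error";

(: Hypothetical stand-in for the long-running work done by the task. :)
declare function local:do-work() as item()*
{
  xdmp:sleep(1000)
};

try {
  local:do-work()
} catch ($e) {
  (: Assumption: a time-limit failure surfaces with the code XDMP-EXTIME. :)
  if ($e/error:code = "XDMP-EXTIME")
  then xdmp:log("task exceeded its time limit", "error")
  else (),
  (: Re-raise so the failure is not silently swallowed. :)
  xdmp:rethrow()
}
```

In place of the log call, the catch could report back to the calling client, as long as that finishes within the grace period.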

Cheers,
Geert

On 10/7/17, 12:10 AM, "general-boun...@developer.marklogic.com on behalf
of Eliot Kimber"  wrote:

>Using current ML 9:
>
>I’ve set up a little client-server application where the client spawns a
>large number of tasks on a remote cluster. Each remote task reports its
>status back to the client via HTTP.
>
>However, if one of the tasks times out in the Task Server there’s no way
>for it to report its own failure and there doesn’t seem to be anything
>else other than the task server that can detect the failure and report it.
>
>Is there any built-in mechanism by which a task time limit exceeded
>failure can be detected in a way that would allow me to report back
>to the calling client? For example, something that gets the task’s
>current call stack at the time of failure, which would give me the info I
>need to report back to the calling client.
>
>Unfortunately, the code I’m running in these tasks is pre-existing
>processing that I’m building this remote processing around so I can’t
>easily do something like provide a heartbeat signal for each running task
>that a separate process could poll in order to detect terminated
>processes, although I’m guessing that’s the most likely solution now that
>I think about it.
>
>I do report to the client when each task starts so I guess I could
>presume that if a task hasn’t finished some time after the configured max
>time limit that it is presumed to have failed.
>
>Thanks,
>
>Eliot  
>--
>Eliot Kimber
>http://contrext.com
> 
>
>
>
>___
>General mailing list
>General@developer.marklogic.com
>Manage your subscription at:
>http://developer.marklogic.com/mailman/listinfo/general

___
General mailing list
General@developer.marklogic.com
Manage your subscription at: 
http://developer.marklogic.com/mailman/listinfo/general


Re: [MarkLogic Dev General] How To Reflect Specific Timezone in Formatted Date Time?

2017-10-02 Thread Geert Josten
Hi Eliot,

That is not covered by the XPath standard, from the looks of it:
https://www.w3.org/TR/xslt20/#date-picture-string

I’m afraid you will have to glue the timezone name to the date yourself.
Consider doing a reverse lookup in this map:map:
https://github.com/grtjn/ml-datetime/blob/master/datetime.xqy#L536, though
I imagine that could result in ambiguity. There are a few different
timezones with same offset if I am not mistaken. That is probably also why
they chose to leave it out of the standard..
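
A minimal sketch of gluing the name on by hand. The two-entry lookup table is hypothetical and deliberately tiny; it is not the map from ml-datetime, and because several zones share an offset, any such table encodes a choice.

```xquery
xquery version "1.0-ml";

(: Hypothetical, deliberately tiny lookup from UTC offset to a zone abbreviation.
   Several zones share an offset, so any such table encodes a choice. :)
let $names := map:new((
  map:entry("-PT7H", "PDT"),
  map:entry("-PT8H", "PST")
))
let $time := xs:dateTime("2017-09-29T08:01:54.216992-07:00")
let $offset := fn:string(fn:timezone-from-dateTime($time))
(: Fall back to the raw offset when the table has no entry. :)
let $label := (map:get($names, $offset), $offset)[1]
return
  fn:concat(
    fn:format-dateTime($time, "[Y0001]-[M01]-[D01] at [H01]:[m01]:[s01]"),
    " ", $label)
```

The date picture omits the timezone component, and the label is appended separately.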

Cheers

On 9/29/17, 5:45 PM, "general-boun...@developer.marklogic.com on behalf of
Eliot Kimber"  wrote:

>I’m trying to produce a formatted date that reflects a specific time zone
>name, rather than e.g., “GMT-07:00”
>
>format-dateTime($time, "[Y0001]-[M01]-[D01] at [H01]:[m01]:[s01] [ZN]")
>
>where $time = 2017-09-29T08:01:54.216992-07:00
>
>Returns
>
>2017-09-29 at 08:01:54 GMT-07:00
>
>Running on server in pacific time zone.
>
>What I’d like is 
>
>2017-09-29 at 08:01:54  PDT
>
>I’ve tried setting the $place parameter to different values but nothing
>I’ve tried gives me a different result except to add a prefix before the
>date indicating the location. I also tried different values for the time
>zone pattern with no change (or simple failure due to a bad pattern). The
>W3C docs suggest that “[ZN]” should result in just the time zone name but
>those specs are very difficult to understand so I’m never sure I’m
>understanding them correctly.
>
>Thanks,
>
>Eliot
>--
>Eliot Kimber
>http://contrext.com
> 
>
>
>
>___
>General mailing list
>General@developer.marklogic.com
>Manage your subscription at:
>http://developer.marklogic.com/mailman/listinfo/general

___
General mailing list
General@developer.marklogic.com
Manage your subscription at: 
http://developer.marklogic.com/mailman/listinfo/general


Re: [MarkLogic Dev General] Querying DateTime values

2017-09-23 Thread Geert Josten
I think the issue is in how you write the reference to the element you want to 
query. You write 
cts:element-attribute-range-query(xs:QName("PUBLICATION-DATE"), but as Chris 
is suggesting, you had better write xs:QName("PUBLICATION-DATE") as 
fn:QName("http://www.incisivemedia.com/summary", "PUBLICATION-DATE"), or you 
should declare a namespace prefix in your code, and use that in xs:QName:

declare namespace sum = "http://www.incisivemedia.com/summary";

xs:QName("sum:PUBLICATION-DATE")
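
Put together, the query could look roughly like this. It is a sketch that assumes a dateTime element-attribute range index exists on the namespaced PUBLICATION-DATE element and its datetime attribute; the 10-day window is the one mentioned in this thread.

```xquery
xquery version "1.0-ml";

declare namespace sum = "http://www.incisivemedia.com/summary";

(: Assumes a dateTime element-attribute range index on
   sum:PUBLICATION-DATE/@datetime; the 10-day window is from the thread. :)
cts:search(fn:doc(),
  cts:element-attribute-range-query(
    xs:QName("sum:PUBLICATION-DATE"),
    xs:QName("datetime"),
    ">=",
    fn:current-dateTime() - xs:dayTimeDuration("P10D")))
```

The index must be defined with the same namespace, otherwise the XDMP-ELEMATTRRIDXNOTFOUND error shown later in this thread comes back.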

Cheers

From: 
<general-boun...@developer.marklogic.com<mailto:general-boun...@developer.marklogic.com>>
 on behalf of Kari Cowan <kco...@alm.com<mailto:kco...@alm.com>>
Reply-To: MarkLogic Developer Discussion 
<general@developer.marklogic.com<mailto:general@developer.marklogic.com>>
Date: Friday, September 22, 2017 at 11:02 PM
To: MarkLogic Developer Discussion 
<general@developer.marklogic.com<mailto:general@developer.marklogic.com>>
Subject: Re: [MarkLogic Dev General] Querying DateTime values

Let me get your opinion on this.  I confirmed that there IS a dateTime 
element-attribute range index.

Where the query works correctly, the data is written like this representation:

http://www.incisivemedia.com/summary;>
2017-08-09 15:14:00:000


Where it fails, the data is written this way - note the namespace precedes the 
declaration - which seems wrong to me.

If it's right or wrong, can you advise if that needs to be updated or if 
there's a way to run the query to check a date range, returning docs within say 
the last 10 days only?

http://www.incisivemedia.com/summary;>
2017-09-21T17:47:00Z


On Fri, Sep 22, 2017 at 1:05 PM, Kari Cowan 
<kco...@alm.com<mailto:kco...@alm.com>> wrote:
Ahh, once i fixed the misplaced paren, I can get back a proper error that tells 
me something more useful -- No dateTime element-attribute range index for 
fn:QName("", "PUBLICATION-DATE") fn:QName("", "datetime")

Thanks! - I think I know where to go from here :)

XDMP-ELEMATTRRIDXNOTFOUND: cts:search(fn:collection(), 
cts:and-query((cts:directory-query("/data-sources/sbm/", "infinity"), 
cts:or-query((cts:element-query(xs:QName("sum:PUBLICATION-NAME"), 
cts:word-query("BenefitsPro.com", ("lang=en"), 1), ()), 
cts:element-query(xs:QName("sum:PUBLICATION-NAME"), cts:word-query("CUTimes", 
("lang=en"), 1), ()), cts:element-query(xs:QName("sum:PUBLICATION-NAME"), 
cts:word-query("Treasury  Risk", ("lang=en"), 1), ()), ...)), 
cts:element-attribute-range-query(fn:QName("", "PUBLICATION-DATE"), 
fn:QName("", "datetime"), ">=", xs:dateTime("2017-08-23T23:59:00Z"), (), 1)), 
())) -- No dateTime element-attribute range index for fn:QName("", 
"PUBLICATION-DATE") fn:QName("", "datetime")

On Fri, Sep 22, 2017 at 8:15 AM, Christopher Hamlin <cbham...@gmail.com> wrote:
I'm not sure what the real problem is.

xs:dateTime ('2003-08-01T08:00:00Z') > xs:dateTime (fn:current-date()
- xs:dayTimeDuration("P30D"))
,
xs:dateTime ('2017-09-28T00:00:00-04:00') > xs:dateTime
(fn:current-date() - xs:dayTimeDuration("P30D"))

return false and true, no failure.  What type is the index, and what
is the failure?

On Fri, Sep 22, 2017 at 11:10 AM, Geert Josten
<geert.jos...@marklogic.com> wrote:
> Hi Kari,
>
> Looks like you misplaced one of the parentheses. Make sure to wrap the
> string "2017-09-22T08:00:00Z" in xs:dateTime(..) before you try to subtract
> the duration. In the provided query, the xs:dateTime cast wraps both
> current-date and the duration.
>
> Cheers,
> Geert
>
> From: <general-boun...@developer.marklogic.com>
>  on behalf of Kari Cowan <kco...@alm.com>
> Reply-To: MarkLogic Developer Discussion <general@developer.marklogic.com>
> Date: Friday, September 22, 2017 at 4:51 PM
> To: MarkLogic Developer Discussion <general@developer.marklogic.com>
> Subject: [MarkLogic Dev General] Querying DateTime values
>
> I need some expert tips on this bug.
>
> cts:element-attribute-range-query(xs:QName("PUBLICATION-DATE"),xs:QName("datetime"),
> ">=", xs:dateTime(fn:current-date() - xs:dayTimeDuration("P30D")))
>
> The above query works fine when the publication-date is in this format:
>  datetime="2017-09-28T00:00:00-04:00">2017-09-28
> 00:00:00:000
>
>

Re: [MarkLogic Dev General] Querying DateTime values

2017-09-22 Thread Geert Josten
Hi Kari,

Looks like you misplaced one of the parentheses. Make sure to wrap the string 
"2017-09-22T08:00:00Z" in xs:dateTime(..) before you try to subtract the 
duration. In the provided query, the xs:dateTime cast wraps both 
current-date and the duration.
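
To illustrate the ordering issue outside of XQuery, here is a minimal Python sketch of the same logic. It is only a model, not MarkLogic code: the helper names (parse_utc, cutoff, subtract_from_string) are made up for this example. It shows that the string has to be converted to a date-time value first, and that subtracting a duration from the raw string fails with a type error, much like the XPTY0004 reported below.

```python
from datetime import datetime, timedelta, timezone


def parse_utc(stamp):
    """Turn a string such as '2017-09-22T08:00:00Z' into an aware datetime."""
    parsed = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def cutoff(now, days=30):
    """Return the instant `days` days before `now`."""
    return now - timedelta(days=days)


def subtract_from_string(stamp, days=30):
    """Reproduce the mistake: subtract a duration from an uncast string."""
    try:
        return stamp - timedelta(days=days)
    except TypeError as err:
        # Strings do not support date arithmetic, so this branch is taken.
        return "failed: " + type(err).__name__


now = parse_utc("2017-09-22T08:00:00Z")
print(cutoff(now).isoformat())
print(subtract_from_string("2017-09-22T08:00:00Z"))
```

The cast happens in parse_utc before any arithmetic; the resulting value is what a ">=" range comparison would then be given.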

Cheers,
Geert

From: on behalf of Kari Cowan
Reply-To: MarkLogic Developer Discussion
Date: Friday, September 22, 2017 at 4:51 PM
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] Querying DateTime values

I need some expert tips on this bug.

cts:element-attribute-range-query(xs:QName("PUBLICATION-DATE"),xs:QName("datetime"),
 ">=", xs:dateTime(fn:current-date() - xs:dayTimeDuration("P30D")))

The above query works fine when the publication-date is in this format:
2017-09-28 00:00:00:000

But it fails when
2003-08-01T08:00:00Z

The datetime format is different, but if I manipulate the current-date to match 
that format, the query will fail with this message:

[1.0-ml] XDMP-EXPR: (err:XPTY0004) "2017-09-22T08:00:00Z" - 
xs:dayTimeDuration("P30D") -- Invalid expression

How would I write the query to properly compare the dates?

The goal above was to return content from the last 30 days.






Re: [MarkLogic Dev General] Construct a lexicon value?

2017-09-18 Thread Geert Josten
Hi Evan,

To my knowledge, no, except maybe via UDFs: 
http://docs.marklogic.com/guide/app-dev/aggregateUDFs

cts:values and cts:value-tuples both take queries and options though, and you 
can filter the returned values manually too.
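
As a rough sketch of what manual filtering could look like, here is the de-duplicate-and-sum step modeled in Python. This is not MarkLogic code: the (pair, frequency) rows stand in for whatever cts:value-tuples and cts:frequency would return, and the function name and sample values are invented for illustration.

```python
from collections import Counter


def sum_frequencies(rows, keep=0):
    """Collapse (pair, frequency) rows onto one member of each pair.

    `keep` selects which member of the pair to retain (0 or 1).
    Frequencies of rows that share the retained value are added up.
    """
    totals = Counter()
    for pair, frequency in rows:
        totals[pair[keep]] += frequency
    return dict(totals)


# Hypothetical tuples: ((value-a, value-b), frequency)
rows = [
    (("apple", "red"), 3),
    (("apple", "green"), 2),
    (("pear", "green"), 4),
]
totals = sum_frequencies(rows)
print(totals)
```

The result is a plain mapping from value to summed frequency, so the consuming code would read the count from the mapping instead of calling cts:frequency on the value.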

Cheers,
Geert

From: on behalf of Evan Lenz
Reply-To: MarkLogic Developer Discussion
Date: Monday, September 18, 2017 at 10:54 PM
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] Construct a lexicon value?

I'm wondering if it's possible to construct an artificial lexicon value, i.e. 
that has an associated frequency accessible by the cts:frequency() function.

The use case is a modification of existing code which returns value pairs from 
cts:value-tuples(). I'm only interested in one of the values of each pair (and 
I'm using the proximity option to constrain which tuples come back, which is 
why I can't just use cts:values()). I want to remove duplicate values (that 
appear in more than one pair) and sum their frequencies, so that what I return 
is of the same data type (string + cts:frequency) as before.

I'll probably just construct some XML and update the consuming code 
accordingly, but it would be nice to avoid that more extensive redesign. Maybe 
it's just not good practice to pass lexicon values around while relying on 
their cts:frequency to be accessible, especially since the data type can't be 
properly constrained.

Actually, I now read this in the docs for cts:frequency():

"If the value specified is not from a value lexicon lookup, cts:frequency 
returns a frequency of 0."

So maybe it's not really a different data type, but a hidden property of every 
simple value. I'll rephrase my question: is it possible to set the frequency 
for a value?

Thanks,
Evan


Evan Lenz
President, Lenz Consulting Group, Inc.
http://lenzconsulting.com


Re: [MarkLogic Dev General] Create temporary user

2017-09-18 Thread Geert Josten
Could SAML authorization be of use to you? 
http://docs.marklogic.com/guide/security/external-auth#id_81653

SAML support was added in MarkLogic 9.

Cheers,
Geert

From: on behalf of Andreas Hubmer
Reply-To: MarkLogic Developer Discussion
Date: Monday, September 18, 2017 at 9:07 AM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Create temporary user

Justin,

I'll answer for my colleague.
We'd like to use JSON Web Tokens (JWT) and extract the user roles from the 
token.
The users are managed in an external system and similar to the LDAP connection 
we want to avoid that every user has to be created/updated in MarkLogic too.

Amps do not give the same flexibility as a temporary user with an arbitrary 
combination of roles.

Thanks,
Andreas

2017-09-15 17:50 GMT+02:00 Justin Makeig:
Andreas,
Rather than describe your solution, can you explain the problem you’re trying 
to solve? Why do you think you need a temporary user? What permission/privilege 
challenge are you trying to address?

You might also take a look at amps. An amp allows a 
security administrator to elevate the privileges of a specific function. This 
is beneficial in that the security is defined in configuration, not code.

Justin


--
Justin Makeig
Senior Director, Product Management
MarkLogic
jmak...@marklogic.com



> On Sep 15, 2017, at 4:29 AM, Andreas Holzgethan wrote:
>
> Hi @all,
>
> I need the possibility to create a temporary user for a transaction.
> I just found in the documentation that such functionality is used when, for 
> example, LDAP is configured as external security.
>
> Could you please explain to me how this is done there?
>
> My first thought was to create a user with the function 
> "sec:create-user-with-role". At the end of the transaction I would just call 
> the function "sec:remove-user".
> Could you please give me feedback about this implementation?
> Would such an implementation have a big impact on performance?
>
> Thanks!
>
> Best regards
> Andreas Holzgethan
>
> Andreas Holzgethan BSc.
>
> IT Consultant

--
Andreas Hubmer
Senior IT Consultant

EBCONT enterprise technologies GmbH


Re: [MarkLogic Dev General] Regarding cts:element-value-query

2017-09-01 Thread Geert Josten
Hi Siva,

cts:not-query(cts:element-value-query(xs:QName("myelem"), "")) would exclude 
empty myelem elements.
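
For what it's worth, the intended matching rule ("element is present, and no instance of it is empty") can be modeled in a few lines of Python. This is only a sketch of the logic, not of how MarkLogic resolves the query from its indexes. The element names pii and cp and the sample values come from the question below; the wrapper element doc is an assumption, and treating whitespace-only text as empty is a choice made for this example.

```python
import xml.etree.ElementTree as ET


def has_element(doc, name):
    """True if at least one `name` element occurs anywhere in the document."""
    return next(doc.iter(name), None) is not None


def has_empty(doc, name):
    """True if any `name` element has no (non-whitespace) text."""
    return any(not (el.text or "").strip() for el in doc.iter(name))


def matches(doc, names=("pii", "cp")):
    """Presence test combined with the negated empty-value test."""
    return all(has_element(doc, n) and not has_empty(doc, n) for n in names)


docs = [
    ET.fromstring("<doc><pii>S56789</pii><cp>ES</cp></doc>"),
    ET.fromstring("<doc><pii/><cp>ES</cp></doc>"),
]
kept = [d for d in docs if matches(d)]
print(len(kept))
```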

Kind regards,
Geert

From: on behalf of "Mani, Sivasubramani (ELS)"
Reply-To: MarkLogic Developer Discussion
Date: Friday, September 1, 2017 at 1:32 PM
To: "general@developer.marklogic.com"
Cc: ConSyn-Infosys-Support
Subject: [MarkLogic Dev General] Regarding cts:element-value-query

Hi Team,

I use cts:element-query() and cts:element-value-query() to filter documents 
based on their elements and element values. I need to match only elements 
that have values, but the above queries also match empty elements.

S56789
ES






S56789
ES


This is my query: cts:and-query(( cts:element-query(xs:QName("pii"), "*"), 
cts:element-query(xs:QName("cp"), "*") )) or cts:and-query(( 
cts:element-value-query(xs:QName("pii"), "*"), 
cts:element-value-query(xs:QName("cp"), "*") )). Both queries include empty 
elements in the result. I need to filter the empty elements out of the result. 
Kindly do the needful.

Thanks & Regards,
Siva



Re: [MarkLogic Dev General] xray tests

2017-08-31 Thread Geert Josten
Hi,

Could you share some more detail on what is happening inside those tests? Would 
you be able to isolate which test is the culprit by commenting out each one by 
one?

Cheers,
Geert

From: on behalf of Oleksii Segeda
Reply-To: MarkLogic Developer Discussion
Date: Thursday, August 31, 2017 at 5:19 PM
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] xray tests

Hi everyone,

I have around 20-30 xray unit tests (https://github.com/robwhitby/xray). I want 
to run a full set of tests locally, before I deploy my code somewhere else.
Unfortunately, ML dies with an out-of-memory error. If I run each test 
individually it works perfectly fine, but it takes forever to go through all of 
them manually.
I’ve tried to increase swap, limit the number of debug threads, limit cache 
sizes, etc. – nothing helps.

What else can be done here?

Regards,

Oleksii Segeda

IT Analyst

Information and Technology Solutions






Re: [MarkLogic Dev General] Where is General Documentation for the Task Server App?

2017-08-23 Thread Geert Josten
Hi Eliot,

You could be hitting a bug in MarkLogic. It might be worth upgrading to
8.0-7, and seeing if it still happens with that version. A lot of patches
and performance improvements have been made since 8.0-3.2..

Cheers,
Geert

On 8/23/17, 5:47 PM, "general-boun...@developer.marklogic.com on behalf of
Eliot Kimber"  wrote:

>Yes, I checked the log and the messages are:
>
>2017-08-22 21:43:42.287 Info: TaskServer: profiling-task.xqy
>[4136613570697343302]: Starting, start: 32361, group size: 10,
>outdir="/profiling/trial-058/group-3237/
>2017-08-22 21:43:42.287 Info: TaskServer: "
>2017-08-22 21:43:42.405 Info: Saving /marklogic/Forests/Meters/0a02
>2017-08-22 21:44:34.572 Notice: Starting MarkLogic Server 8.0-3.2 x86_64
>in /opt/MarkLogic with data in /marklogic
>2017-08-22 21:44:34.617 Info: Host  running Linux
>3.10.0-327.18.2.el7.x86_64 (Red Hat Enterprise Linux Server release 6.8
>(Santiago))
>2017-08-22 21:44:34.690 Info: SSL FIPS mode has been enabled
>
>The first message is from my task indicating that the 3237th (out of
>50,000 in the queue) is starting.
>
>Then the MarkLogic start message for no obvious reason. It’s not a time
>at which a scheduled server restart would have likely happened and nobody
>was (or should have been) awake at that hour and I’m the only person who
>should be doing anything with this server anyway.
>
>What’s interesting is that I’m getting this restart consistently at about
>the 3200th task, so it feels like either a time out or a resource
>exhaustion that then triggers a restart, but there are no messages about
>any kind of failure, out of memory condition, etc.
>
>I’m pretty sure it’s an issue with the configuration of the underlying
>linux server but I wanted to know if there were any conditions under
>which the Task Server or ML server itself would spontaneously restart.
>
>Thanks,
>
>Eliot
>
>--
>Eliot Kimber
>http://contrext.com
> 
>
>
>On 8/23/17, 10:25 AM, "general-boun...@developer.marklogic.com on behalf
>of Dave Cassel" david.cas...@marklogic.com> wrote:
>
>I don't believe there's any reason why the Task Server would be triggering
>a restart (although some configuration changes affecting the Task Server
>would). I'd look elsewhere for an error. Specifically, I'd check
>ErrorLog.txt, find the time when a restart happened, and look to see if
>anything interesting was logged just before. (Perhaps you've already done
>that.)
>
>-- 
>Dave Cassel, @dmcassel 
>Technical Community Manager
>MarkLogic Corporation 
>
>http://developer.marklogic.com/
>
>
>
>
>On 8/23/17, 11:18 AM, "general-boun...@developer.marklogic.com on behalf
>of Eliot Kimber" ekim...@contrext.com> wrote:
>
>>I’m trying to understand the Task Server (and in my case, why it is
>>consistently restarting after satisfying a subset of its queue).
>>
>>Going through the ML 8 docs I’m not finding any general discussion of the
>>Task Server, only references to it from elsewhere (e.g., in the docs for
>>xdmp:spawn() and in discussion of scheduling tasks).
>>
>>But not finding anything that would appear to provide insight into why
>>the server would perform an uncommanded restart (or information
>>indicating that it would never do that and thus the problem must be
>>elsewhere).
>>
>>Have I missed it? Given that the Task Server is a built-in and prominent
>>part of MarkLogic it seems odd that there’s no general documentation for
>>it, which makes me think I must have missed it. But I both searched the
>>doc set and ToC and scanned the entire Guide ToC and didn’t find anything.
>>
>>Thanks,
>>
>>Eliot
>>--
>>Eliot Kimber
>>http://contrext.com
>> 
>>
>>
>>


Re: [MarkLogic Dev General] SPARQL 'SERVICE' ?

2017-08-23 Thread Geert Josten
Hi Norbert,

MarkLogic does not support:

- 14 Basic Federated Query
- SPARQL 1.1 Service Description
- SPARQL 1.1 Federated Query

That is with pure SPARQL. MarkLogic allows wrapping SPARQL statements in XQuery 
or SJS code that effectively allow mimicking federated search, and the same 
technique could also be applied outside of MarkLogic as suggested by Erik.

Whether federated search will be added to MarkLogic depends on customer needs, 
as always, but I’d expect it to be unlikely. As Erik tried to say, 
MarkLogic is a transactional database, and because of that we cannot rely on 
third-party SPARQL services. It would be more likely that external linked data 
sources are loaded into MarkLogic to be queried with MarkLogic.

Cheers,
Geert

From: on behalf of "Weissenberg, Norbert"
Reply-To: MarkLogic Developer Discussion
Date: Tuesday, August 22, 2017 at 10:12 AM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] SPARQL 'SERVICE' ?

Hi Erik,

as defined in https://www.w3.org/TR/sparql11-query/,
SPARQL 1.1 is defined by “eleven SPARQL 1.1 Recommendations produced by the 
SPARQL Working Group:
1. SPARQL 1.1 Overview
2. SPARQL 1.1 Query Language (this document)
3. SPARQL 1.1 Update
4. SPARQL 1.1 Service Description
5. SPARQL 1.1 Federated Query
6. SPARQL 1.1 Query Results JSON Format
7. SPARQL 1.1 Query Results CSV and TSV Formats
8. SPARQL Query Results XML Format (Second Edition)
9. SPARQL 1.1 Entailment Regimes
10. SPARQL 1.1 Protocol
11. SPARQL 1.1 Graph Store HTTP Protocol
”
Which of them are not implemented by MarkLogic 9? A first one is Federated 
Query.
But document https://www.w3.org/TR/sparql11-query/ also defines in Chapter 14:
“14 Basic Federated Query

This document incorporates the syntax for SPARQL federation extensions.
This feature is defined in the document SPARQL 1.1 Federated Query.
”

Will Federated Query be implemented by MarkLogic in future?
Although there are risks as you mentioned, I think it is important for semantic 
data integration (e.g. for linked open data).
Thanks,
Norbert

From: general-boun...@developer.marklogic.com
 [mailto:general-boun...@developer.marklogic.com] On Behalf Of Erik Hennum
Sent: Wednesday, July 26, 2017 15:27
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] SPARQL 'SERVICE' ?

Hi, Tony:

SPARQL SERVICE is an extension to SPARQL 1.1 for federated query:

"The SERVICE keyword instructs a federated query processor to invoke a portion 
of a SPARQL query against a remote SPARQL endpoint."
https://www.w3.org/TR/2013/REC-sparql11-federated-query-20130321/#service

As an extension, SERVICE is not part of core SPARQL 1.1

Federated queries are at risk for fundamental architecture issues with respect 
to performance, transactional consistency, resilience, and so on. For that 
reason, MarkLogic does not federate queries across other data sources.

It should be possible to use MarkLogic as one data source for a SPARQL/RDF tool 
that implements federation.


Erik Hennum


From:general-boun...@developer.marklogic.com[general-boun...@developer.marklogic.com]
 on behalf of Tony Greaves 
[tony.grea...@hill-informatics.co.nz]
Sent: Tuesday, July 25, 2017 4:29 PM
To: general@developer.marklogic.com
Subject: [MarkLogic Dev General] SPARQL 'SERVICE' ?
 The MarkLogic documentation states
"SPARQL queries are executed natively in MarkLogic to query either
in-memory triples or triples stored in a database. When querying triples
stored in a database, SPARQL queries execute 

Re: [MarkLogic Dev General] Bug in SPARQL date functions

2017-08-23 Thread Geert Josten
Hi Norbert,

I don’t think this is a bug. According to the recommendation, the year function 
expects an xs:dateTime argument, and matches functionality of 
fn:year-from-dateTime..

https://www.w3.org/TR/sparql11-query/#func-year

Cheers,
Geert

From: on behalf of "Weissenberg, Norbert"
Reply-To: MarkLogic Developer Discussion
Date: Tuesday, August 22, 2017 at 12:13 PM
To: "general@developer.marklogic.com"
Subject: [MarkLogic Dev General] Bug in SPARQL date functions

Hello,

using Query Console of MarkLogic-9.0-1.1-amd64, executing

PREFIX xs: <http://www.w3.org/2001/XMLSchema#>
SELECT (year("2017-08-17"^^xs:date) AS ?year) WHERE {}

returns null, while the following returns “2017”^^xs:integer correctly.
Same with other SPARQL date functions and with xs:datetime.

PREFIX xs: <http://www.w3.org/2001/XMLSchema#>
SELECT (year("2017-08-17T00:00:00"^^xs:date) AS ?year) WHERE {}

Best regards,
Norbert



Re: [MarkLogic Dev General] Getting Impossible Value from count()--why?

2017-08-23 Thread Geert Josten
Hi Eliot,

Keep in mind that you pass in item-frequency in cts:element-values, but
the default for range constraints is likely fragment-frequency. Did you
pass in an item-frequency facet-option in there too?
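
To make the difference between the two frequency modes concrete, here is a small Python model. It is an illustration of the counting rules only, not of MarkLogic internals: each inner list stands for the values found in one fragment, and the function names and sample durations are made up.

```python
from collections import Counter


def item_frequency(fragments):
    """Count every occurrence of a value, across all fragments."""
    return Counter(value for frag in fragments for value in frag)


def fragment_frequency(fragments):
    """Count the number of fragments that contain a value at least once."""
    return Counter(value for frag in fragments for value in set(frag))


# Two hypothetical fragments; the first holds the same duration twice.
fragments = [["PT1S", "PT1S", "PT2S"], ["PT1S"]]
print(item_frequency(fragments))
print(fragment_frequency(fragments))
```

When a fragment can hold a value more than once, the two modes give different totals, so a count computed with one mode will not line up with a facet computed with the other.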

Kind regards,
Geert

On 8/22/17, 10:47 PM, "general-boun...@developer.marklogic.com on behalf
of Eliot Kimber"  wrote:

>If I sum the counts of each bucket calculated using cts:frequency() it
>matches the total calculated using the initial result from the
>element-values() query, so I guess the 10,000 count is a side effect of
>some internal lexicon implementation magic.
>
>Cheers,
>
>E.
>
>--
>Eliot Kimber
>http://contrext.com
> 
>
>
>On 8/22/17, 3:25 PM, "general-boun...@developer.marklogic.com on behalf
>of Eliot Kimber" ekim...@contrext.com> wrote:
>
>I think this is again my weak understanding of lexicons and frequency
>counting. 
>
>If I change my code to sum the frequencies of the durations in each
>range then I get more sensible numbers, e.g.:
>
>let $count := sum(for $dur in $durations[. lt $upper-bound][. ge
>$lower-bound] return cts:frequency($dur))
>
>Having updated get-enrichment-durations() to:
>
>cts:element-values(xs:QName("prof:overall-elapsed"), (),
>("descending", "item-frequency"),
> cts:collection-query($collection))
>
>It still seems odd that the pure lexicon check returns exactly 10,000
>*values*--that still seems suspect, but then using those 10,000 values to
>calculate the total frequency does result in a more likely number. I
>guess I can do some brute-force querying to see if it’s accurate.
>
>Cheers,
>
>Eliot
>--
>Eliot Kimber
>http://contrext.com
> 
>
>
>On 8/22/17, 2:52 PM, "general-boun...@developer.marklogic.com on
>behalf of Eliot Kimber" behalf of ekim...@contrext.com> wrote:
>
>Using ML 8.0-3.2
>
>As part of my profiling application I run a large number of
>profiles, storing the profiler results back to the database. I’m then
>extracting the times from the profiling data to create histograms and do
>other analysis.
>
>My first attempt to do this with buckets ran into the problem
>that the index-based buckets were not returning accurate numbers, so I
>reimplemented it to construct the buckets manually from a list of the
>actual duration values.
>
>My code is:
>
>let $durations as xs:dayTimeDuration* :=
>epf:get-enrichment-durations($collection)
>let $search-range := epf:construct-search-range()
>let $facets :=
>for $bucket in $search-range/search:bucket
>let $upper-bound := if ($bucket/@lt) then
>xs:dayTimeDuration($bucket/@lt) else xs:dayTimeDuration("PT0S")
>let $lower-bound := xs:dayTimeDuration($bucket/@ge)
>let $count := count($durations[. lt $upper-bound][. ge
>$lower-bound]) 
>return if ($count gt 0)
>   then count="{$count}">{epf:format-day-time-duration($upper-bound)}t-value>
>   else ()
>
>The get-enrichment-durations() function does this:
>
>  cts:element-values(xs:QName("prof:overall-elapsed"), (),
>"descending",
> cts:collection-query($collection))
>
>This works nicely and seems to provide correct numbers except
>when the number of durations within a particular set of bounds exceeds
>10,000, at which point count() returns 10,000, which is an impossible
>number: the chance of there being exactly 10,000 instances within a given
>range is basically zero. But I’m getting 10,000 twice, which is
>absolutely impossible.
>
>Here are the results I get from running this in the query console:
>
>
>75778
>
>xmlns:search="http://marklogic.com/appservices/search;>0.01
>seconds
>xmlns:search="http://marklogic.com/appservices/search;>0.02
>seconds
>xmlns:search="http://marklogic.com/appservices/search;>0.03
>seconds
>xmlns:search="http://marklogic.com/appservices/search;>0.04
>seconds
>xmlns:search="http://marklogic.com/appservices/search;>0.05
>seconds
> ...
>
>
>
>There are 75,778 actual duration values and the count value for
>the 3rd and 4th ranges are exactly 10,000.
>
>If I change the let $count := expression to only test the upper
>or lower bound then I get numbers greater than 10,000. I also tried
>changing the order of the predicates and using a single predicate with
>“and”. The problem only seems to be related to using both predicates when
>the resulting sequence would have more than 10K items.
>
>Is there an explanation for why count() gives me exactly 10,000
>in 

Re: [MarkLogic Dev General] Large job processing question.

2017-08-23 Thread Geert Josten
There are ways to prevent that..

You could have a look at CPF as a very robust way of processing files, though 
it is usually initiated after doc-insert, not before. It also doesn’t 
necessarily prevent queue overload, but you could always increase the queue 
size if necessary.

Taskbot is a very good library that takes a very smart approach to spawn tasks 
without flooding the task server queue. (https://github.com/mblakele/taskbot)

And Sam mentioned using external applications to push the processing. DMSDK is 
the latest option, and would work well in combination with something like 
Apache Camel to monitor an external folder and push changes when they 
happen, rather than on a schedule.

Cheers,
Geert

From: <general-boun...@developer.marklogic.com>
 on behalf of "Ladner, Eric (Eric.Ladner)" <eric.lad...@chevron.com>
Reply-To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Date: Tuesday, August 22, 2017 at 10:33 PM
To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Subject: Re: [MarkLogic Dev General] Large job processing question.

Is it smart enough not to spawn 100,000 jobs at once and swamp the system?

Eric Ladner
Systems Analyst
eric.lad...@chevron.com


From: general-boun...@developer.marklogic.com
 [mailto:general-boun...@developer.marklogic.com] On Behalf Of Geert Josten
Sent: August 22, 2017 13:59
To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Subject: [**EXTERNAL**] Re: [MarkLogic Dev General] Large job processing question.

Hi Eric,

Personally, I would probably let go of the all-docs-at-once approach, and spawn 
processes for each input (sub)folder, and potentially for batches or individual 
files in any folder as well. Same for the existing documents, spawn a process 
for batches or individual docs that check if they still exist. If you make them 
append logs to the documents or their properties, you can gather reports about 
changes afterwards if needed.
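
A minimal sketch of the batching idea, written in Python rather than XQuery so it can be run anywhere. Everything here is an assumption made for illustration: the paths, the batch size, the worker count, and process_batch, which is a placeholder for whatever a spawned task would really do (check existence, ingest, log).

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def batches(items, size):
    """Yield consecutive lists of at most `size` items."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def process_batch(paths):
    """Placeholder for the real per-batch work; returns the batch size."""
    return len(paths)


def run(paths, batch_size=3, workers=2):
    """Process all batches, with at most `workers` batches running at once."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_batch, batches(paths, batch_size)))


paths = ["/in/doc%d.xml" % i for i in range(7)]  # hypothetical input files
counts = run(paths)
print(counts)
```

Each unit of work stays small enough to finish well within a timeout, and the worker limit keeps the number of concurrently running batches bounded.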

Cheers,
Geert

From: <general-boun...@developer.marklogic.com>
 on behalf of "Ladner, Eric (Eric.Ladner)" <eric.lad...@chevron.com>
Reply-To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Date: Tuesday, August 22, 2017 at 4:36 PM
To: "general@developer.marklogic.com" <general@developer.marklogic.com>
Subject: [MarkLogic Dev General] Large job processing question.

We have some large jobs (ingestion and validation of unstructured documents) 
that have timeout issues.
The jobs are structured so that the first job checks that 
all the existing documents are valid (still exist on the file system).  It 
does this in two steps:

 1) gather all documents to be validated from the DB
 2) check that list against the file system.

The second job is:
 1) the filesystem is traversed to find any new documents (or that have 
been modified in the last X days),
 2) those new/modified documents are ingested.

The problem in the second step is there could be tens of thousands of documents 
in a hundred thousand folders (don’t ask).  The job will typically time out 
after an hour during the “go find all the new documents” phase.  I’m trying to 
find out if there’s a way to re-structure the job so that it runs faster and 
doesn’t time out, or maybe breaks up the task into different parts that run in 
parallel or something.  Any thoughts welcome.

Eric Ladner
Systems Analyst
eric.lad...@chevron.com



Re: [MarkLogic Dev General] MarkLogic as public sparql endpoint?

2017-08-23 Thread Geert Josten
I think I would just create an app-server with a custom rewriter that exposes a 
custom sparql endpoint only, one that does exactly what you describe. That way 
you have full control over what is allowed, what data can be read, how results 
are returned etc.

Kind regards,
Geert

From: on behalf of Tony Greaves
Reply-To: MarkLogic Developer Discussion
Date: Tuesday, August 22, 2017 at 11:55 PM
To: "general@developer.marklogic.com"
Subject: [MarkLogic Dev General] MarkLogic as public sparql endpoint?

Wondering if someone can show me how to set up MarkLogic (v9) as a public 
SPARQL endpoint, in particular:

1) limit SPARQL queries to read-only (not CRUD!);
2) authentication turned off for the app server for SPARQL queries;
3) results returned in W3C SPARQL Results format by default.

thanks
-Tony




Re: [MarkLogic Dev General] Count of cts:element-values() not equal to number of element instances--what's going on?

2017-08-15 Thread Geert Josten
Wild guess.. Empty prof:overall-elapsed elements, that are
ignored/rejected by the range index?
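
To show how that guess would produce a gap between the two counts, here is a tiny Python model. It is a guess modeled in code, not a description of how the range index really behaves: the assumption is simply that an element with empty text contributes nothing to the index, and the sample values are invented.

```python
def indexed_values(element_texts):
    """Keep only texts that a typed index could store; empty text is skipped."""
    return [text for text in element_texts if text.strip()]


# Hypothetical element contents, two of them empty.
texts = ["PT0.01S", "", "PT0.02S", "PT0.01S", ""]

total_elements = len(texts)        # what counting the elements directly sees
indexed = indexed_values(texts)    # what the index would hold
missing = total_elements - len(indexed)
distinct = len(set(indexed))       # what a lexicon of unique values lists

print(total_elements, len(indexed), missing, distinct)
```

If this is what is happening, counting the empty elements directly should account for the difference.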

Cheers

On 8/14/17, 9:58 PM, "general-boun...@developer.marklogic.com on behalf of
Eliot Kimber"  wrote:

>Using both cts:frequency and cts:count-aggregate I get numbers that are
>closer to the correct count but are short by about 200. What would
>account for the difference?
>
>Queries:
>
>let $profiles :=
>  collection($collection)/enrprof:profiling-instance/enrprof:enrichment/enrprof:evalResult/prof:*
>let $histograms := $profiles/prof:histogram
>let $overall-elapsed := $profiles/prof:metadata/prof:overall-elapsed
>let $durations := cts:element-values(xs:QName("prof:overall-elapsed"),
>  (), "descending",
>  cts:collection-query($collection))
>let $count-frequency := sum(for $dur in $durations return
>  cts:frequency($dur))
>let $overall-elapsed-ref :=
>  cts:element-reference(fn:QName("http://marklogic.com/xdmp/profile", "overall-elapsed"), ("type=dayTimeDuration"))
>let $count-aggregate := cts:count-aggregate($overall-elapsed-ref, (),
>  cts:collection-query($collection))
>
>Results:
>
>47539
>47539
>47539
>47371
>47371
>21219
>
>Cheers,
>
>E.
>--
>Eliot Kimber
>http://contrext.com
> 
>
>
>
>On 8/14/17, 1:53 PM, "general-boun...@developer.marklogic.com on behalf
>of Mary Holstege" mary.holst...@marklogic.com> wrote:
>
>
>That is overkill.  The results you get out of cts:element-values have a
>frequency (accessible via cts:frequency). The cts: aggregates (e.g.
>cts:count, cts:sum) take the frequency into account.
>
>//Mary
>
>On Mon, 14 Aug 2017 11:42:07 -0700, Oleksii Segeda
> wrote:
>
>> Eliot,
>>
>> You can do something like this:
>>
>> cts:element-value-co-occurrences(xs:QName("prof:overall-elapsed"), xs:QName("xdmp:document"))
>>
>> if you have only one element per document.
>>
>> Best,
>>
>> Oleksii Segeda
>> IT Analyst
>> Information and Technology Solutions
>> www.worldbank.org
>>
>>
>> -Original Message-
>> From: general-boun...@developer.marklogic.com
>> [mailto:general-boun...@developer.marklogic.com] On Behalf Of Eliot Kimber
>> Sent: Monday, August 14, 2017 2:31 PM
>> To: MarkLogic Developer Discussion
>> Subject: [MarkLogic Dev General] Count of cts:element-values() not equal
>> to number of element instances--what's going on?
>>
>> I have this query:
>>
>> let $durations := cts:element-values(xs:QName("prof:overall-elapsed"),
>>   (), "descending",
>>   cts:collection-query($collection))
>>
>> And this query:
>>
>> let $overall-elapsed := $profiles/prof:metadata/prof:overall-elapsed
>>
>> Where there is an element range index for prof:overall-elapsed.
>>
>> Comparing the two results I get very different numbers when I
>expected  
>> them to be equal:
>>
>> 47539
>> 21219
>>
>> Doing this:
>>
>> count(distinct-values($overall-elapsed ! xs:dayTimeDuration(.)))
>>
>> Returns 21219, making it clear that the range index is returning
>> distinct values, not all values. It makes sense in terms of how I would
>> expect a range index to be structured (a one-to-many mapping for values
>> to elements) but doesn't make sense as the return for a function named
>> "element-values" (and not element-distinct-values).
>>
>> I didn't see this behavior mentioned in the docs (although the
>> introduction to the Lexicon reference section does describe lexicons as
>> sets of unique values).
>>
>> My requirement is to *quickly* get a list of the durations for all
>> prof:expression elements (which I use for both counting and for
>> bucketing, so I need all values, not just all distinct values).
>>
>> Is there a way to do what I want using only indexes?
>>
>> Thanks,
>>
>> E.
>> --
>> Eliot Kimber
>> http://contrext.com
>>
>>
>>

Re: [MarkLogic Dev General] Date Time value not correct while using DLS API

2017-07-27 Thread Geert Josten
Hi Amit,

It is a so-called epoch timestamp, which counts the seconds or milliseconds 
since 1970-01-01. The documentation shows how to convert dateTime to lock timestamps:

http://docs.marklogic.com/xdmp:document-locks

And this function shows how you could go back:

https://github.com/grtjn/ml-datetime/blob/master/datetime.xqy#L515

Though the latter assumes epoch timestamps in milliseconds, whereas lock 
timestamps seem to be in seconds.
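
As a rough sketch of going back from a lock timestamp in seconds, plain XQuery date arithmetic should be enough. This assumes the value counts whole seconds since 1970-01-01T00:00:00Z and reuses the timestamp from the lock fragment in the original question; the variable names are made up for illustration.

```xquery
xquery version "1.0-ml";

(: Assumption: lock timestamps count whole seconds since the Unix epoch. :)
let $epoch := xs:dateTime("1970-01-01T00:00:00Z")
(: Value taken from the lock fragment in the original question. :)
let $lock-seconds := 1501144206
(: Multiply a one-second duration by the count, then add it to the epoch. :)
return $epoch + xs:dayTimeDuration("PT1S") * $lock-seconds
```

For the value above this should land on 27 July 2017 (UTC), which matches the day of the checkout.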

Cheers,
Geert

From:  on behalf of amit gope
Reply-To: MarkLogic Developer Discussion
Date: Thursday, July 27, 2017 at 10:49 AM
To: "general@developer.marklogic.com"
Subject: [MarkLogic Dev General] Date Time value not correct while using DLS API

Hi Team.

We are using the DLS API in one of our projects to keep different versions of a 
given document. When dls:checkout runs, the entry that is made for the 
timestamp value is:
http://marklogic.com/xdmp/dls;>
http://www.google.com/metadata/core/00-2754195023.xml
auser
0
1501144206
http://marklogic.com/xdmp/security;>7071164303237443533


When we apply format-dateTime or 
xdmp:timestamp-to-wallclock(xs:unsignedLong(1501144206)) to the dls:timestamp, 
the date turns out to be 1970-01-01, whereas the checkout was done today. Can 
you please suggest where the timestamp comes from and why it dates back to 
1970?
--
Best Regards
Amit



Re: [MarkLogic Dev General] Binary Document Ingestion in MP4 and MP3 format

2017-07-20 Thread Geert Josten
Erik Hennum pointed me to this section of the Admin Guide, but that doesn’t 
provide much more technical detail:

http://docs.marklogic.com/guide/admin/fields#id_47294

Cheers

From: 
<general-boun...@developer.marklogic.com<mailto:general-boun...@developer.marklogic.com>>
 on behalf of Geert Josten 
<geert.jos...@marklogic.com<mailto:geert.jos...@marklogic.com>>
Reply-To: MarkLogic Developer Discussion 
<general@developer.marklogic.com<mailto:general@developer.marklogic.com>>
Date: Thursday, July 20, 2017 at 5:00 PM
To: MarkLogic Developer Discussion 
<general@developer.marklogic.com<mailto:general@developer.marklogic.com>>
Subject: Re: [MarkLogic Dev General] Binary Document Ingestion in MP4 and MP3 
format

Sounds like that to me, but don’t know details. It is indexed, so changing it 
must involve updating indexes for sure though. But there might be subtleties 
about what is actually reindexed and what not..

I’ll forward your question though..

Cheers


Re: [MarkLogic Dev General] Binary Document Ingestion in MP4 and MP3 format

2017-07-20 Thread Geert Josten
Sounds like that to me, but I don’t know the details. It is indexed, so changing 
it must involve updating indexes for sure. But there might be subtleties 
about what is actually reindexed and what is not.

I’ll forward your question though.

Cheers

From: 
<general-boun...@developer.marklogic.com<mailto:general-boun...@developer.marklogic.com>>
 on behalf of Andreas Hubmer 
<andreas.hub...@ebcont.com<mailto:andreas.hub...@ebcont.com>>
Reply-To: MarkLogic Developer Discussion 
<general@developer.marklogic.com<mailto:general@developer.marklogic.com>>
Date: Thursday, July 20, 2017 at 4:42 PM
To: MarkLogic Developer Discussion 
<general@developer.marklogic.com<mailto:general@developer.marklogic.com>>
Subject: Re: [MarkLogic Dev General] Binary Document Ingestion in MP4 and MP3 
format

Thanks, Geert.

In the release notes I've found the following statement 
(https://docs.marklogic.com/guide/relnotes/chap3#id_45632):
Storing the axes times in metadata enables MarkLogic to update the axes 
timestamps without changing the documents and invoking reindexing.
To me it seems that the metadata is connected to the fragment but stored 
somehow differently. Do you know any more details?

Cheers,
Andreas




Re: [MarkLogic Dev General] Binary Document Ingestion in MP4 and MP3 format

2017-07-20 Thread Geert Josten
Hi Andreas,

I tried to look for a nice Guide section, but couldn’t find one. But there 
isn’t too much to say about it actually.

It starts with adding metadata to a doc using 
http://docs.marklogic.com/xdmp:document-set-metadata. It takes a map:map, and 
non-string values will be converted to quoted strings. It effectively lives 
inside the same document fragment as the document's contents, but it is neither 
included nor embedded when you pull up the contents with for instance fn:doc.

You can also search on it using so-called metadata fields. That is a new, third 
type of field. You can create them with the Admin UI, or for instance with admin 
functions. The Temporal guide spends a few words on it: 
http://docs.marklogic.com/guide/temporal/temporal-quick-start#id_50302. Very 
useful for storing temporal properties, but you can use it for other purposes 
too.

In search constraints you just refer to the field by name, like any other 
field. You can range index metadata fields too, like other fields, and even 
index as dateTime and such, but you cannot store a fragment of XML inside it 
and index on a sub-element of that. It will simply get stored as quoted XML, 
and full-text search will run against that quoted string instead.
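
A minimal sketch of the calls involved, assuming MarkLogic 9 and a document that already exists in the database. The URI and the key names are hypothetical and only there for illustration.

```xquery
xquery version "1.0-ml";

(: Hypothetical URI and keys, for illustration only. :)
xdmp:document-set-metadata(
  "/example/doc.xml",
  map:new((
    map:entry("reviewed-by", "andreas"),
    map:entry("reviewed-on", "2017-07-20T16:35:00Z")
  ))
)
;

xquery version "1.0-ml";

(: Read the metadata back in a separate statement, after the update commits. :)
xdmp:document-get-metadata("/example/doc.xml")
```

Calling fn:doc("/example/doc.xml") would still return only the document contents. Searching on a key such as reviewed-by would, as far as I understand it, additionally need a metadata field of that name configured on the database.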

Cheers,
Geert

From: 
<general-boun...@developer.marklogic.com<mailto:general-boun...@developer.marklogic.com>>
 on behalf of Andreas Hubmer 
<andreas.hub...@ebcont.com<mailto:andreas.hub...@ebcont.com>>
Reply-To: MarkLogic Developer Discussion 
<general@developer.marklogic.com<mailto:general@developer.marklogic.com>>
Date: Thursday, July 20, 2017 at 11:53 AM
To: MarkLogic Developer Discussion 
<general@developer.marklogic.com<mailto:general@developer.marklogic.com>>
Subject: Re: [MarkLogic Dev General] Binary Document Ingestion in MP4 and MP3 
format

Hi Geert,

MarkLogic 9 also allows storing simple key/value pairs in hidden document 
metadata, which is more efficient than document properties
I am interested in that new feature. Is there somewhere an explanation how it 
works (regarding reindexing, ...)?

Thanks,
Andreas




Re: [MarkLogic Dev General] Binary Document Ingestion in MP4 and MP3 format

2017-07-20 Thread Geert Josten
Hi Pavan,

If you need to store both the binary itself, and the meta info + textual 
contents, there are two general approaches:

- put meta info and textual contents in document properties
- store them separately as normal documents with a reference to the database 
URI of the actual binary

MarkLogic 9 also allows storing simple key/value pairs in hidden document 
metadata, which is more efficient than document properties or separate docs, 
but it is probably too limited for this use case.

You can store transcripts of videos including timestamps as XML, which would 
work for both the two-doc and the doc-prop approach.

Document properties allow storing complete XML fragments, and are associated 
with the same database URI as the actual document (in this case the binary 
data). They are included in indexing automatically. You just need to indicate 
that you want to include properties fragments in searching and faceting.

There are out-of-the-box CPF pipelines for Document Filtering. There is one 
that saves the result in doc properties, and one that saves the result in a 
separate doc. It should be possible to enable those via the Admin UI.
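
To make the doc-prop approach a bit more concrete, here is a small sketch that stores a timed transcript in the properties of a binary and then searches it. The namespace, URI, element names and transcript text are all hypothetical, and the transcript itself would have to come from some external speech-to-text step.

```xquery
xquery version "1.0-ml";

(: Hypothetical namespace, URI and element names, for illustration only. :)
declare namespace t = "http://example.com/transcript";

xdmp:document-set-properties(
  "/videos/intro.mp4",
  <t:transcript>
    <t:cue start="PT5S">Welcome to the course.</t:cue>
    <t:cue start="PT12S">Today we look at indexing.</t:cue>
  </t:transcript>
)
;

xquery version "1.0-ml";

(: Find documents whose properties fragment mentions a spoken word. :)
cts:search(
  fn:doc(),
  cts:properties-fragment-query(cts:word-query("indexing"))
)
```

Note that xdmp:document-set-properties overwrites existing user properties; xdmp:document-add-properties can be used to append instead.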

Kind regards,
Geert

From: GUPTA Pavan 
<pavan.gu...@soprasteria.com<mailto:pavan.gu...@soprasteria.com>>
Date: Thursday, July 20, 2017 at 11:07 AM
To: MarkLogic Developer Discussion 
<general@developer.marklogic.com<mailto:general@developer.marklogic.com>>, 
Geert Josten <geert.jos...@marklogic.com<mailto:geert.jos...@marklogic.com>>
Subject: RE: [MarkLogic Dev General] Binary Document Ingestion in MP4 and MP3 
format

Hello Geert,

Thanks for the information. I would also like to know how I can store the 
content (meaning the spoken words) of a video and find the time when it was 
spoken, just as we load the content of any document file in metadata.
Is there any CPF pipeline I need to apply, or can you suggest some library?

Thanks In Advance!


Regards,
Pavan



Re: [MarkLogic Dev General] Binary Document Ingestion in MP4 and MP3 format

2017-07-20 Thread Geert Josten
Hi Pavan,

You can apply xdmp:document-filter on many binary formats, including mp3 and 
mp4. It will extract meta information like file size and content mime type, and 
for instance document properties from office documents, and exif tags from 
images. It will also attempt to extract actual text, but that will only work if 
such text is inside the file in a machine-readable form. E.g. text contained 
inside images or video streams will not be captured. This includes images 
embedded in office docs, image PDFs, and also captions and subtitles on images 
and videos. You would need an OCR kind of solution for that.
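
As a quick illustration, the call could look like the sketch below. The URI is hypothetical, the binary is assumed to be loaded already, and depending on the MarkLogic version the converters may need to be installed separately.

```xquery
xquery version "1.0-ml";

(: Hypothetical URI; assumes the binary is already in the database. :)
xdmp:document-filter(fn:doc("/audio/episode-01.mp3"))
```

The result is an XHTML document in which the extracted properties show up as meta elements, with any extracted text in the body.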

Kind regards,
Geert

From:  on behalf of GUPTA Pavan
Reply-To: MarkLogic Developer Discussion
Date: Thursday, July 20, 2017 at 9:19 AM
To: "general@developer.marklogic.com"
Subject: [MarkLogic Dev General] Binary Document Ingestion in MP4 and MP3 format

Hi Team,

I am trying to ingest the .mp4 and .mp3 file and make them searchable. I have 
studied that these files are considered as binary files.

I have also seen how to make the binary files searchable but I have done for 
.doc, .ppt, .pdf etc file but could not do for .mp4 or .mp3.

Actually I want to make the files searchable.

Can you please direct me how to achieve this and tell me if I need to enable or 
set up any content processing framework for the same.

Thanks In Advance!


Regards,
Pavan


Re: [MarkLogic Dev General] Regarding versioning of documents without using DLS.

2017-07-10 Thread Geert Josten
Hi Shabana,

I’d recommend looking into the bitemporal functionality, or more specifically the 
uni-temporal variant that was added in MarkLogic 9. The temporal functionality 
is embedded much deeper into MarkLogic, and takes away some of the heavy burden 
of guarding that temporal documents are not tampered with. It works with a simple 
‘latest’ collection, which allows you to easily find the latest version of all 
documents.

I recommend looking through the Temporal guide to get a better understanding of 
it:

http://docs.marklogic.com/guide/temporal

Kind regards,
Geert

From:  on behalf of shabana khan
Reply-To: MarkLogic Developer Discussion
Date: Monday, July 10, 2017 at 9:19 AM
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] Regarding versioning of documents without 
using DLS.

Hi All,

Has anyone come across a scenario where you need to support editing of 
documents while keeping the original document intact, but without using DLS for 
versioning?

We need to track the different states a document can be in at any point, which 
requires some kind of versioning attached to the most recent copy of the 
document.

But we don't intend to use DLS for that, and plan to combine collections and 
permissions so that documents are only visible from the final collection.

 We do have a rough draft covering different scenarios but not a concrete plan.

Any suggestions will be highly appreciated to give us a good start.

Thanks and Regards,
Shabana Khan


Re: [MarkLogic Dev General] RSASHA256 Javascript code for security token to get data from API

2017-07-07 Thread Geert Josten
Hi Nalini,

There is nothing MarkLogic specific about this question, so I think this
isn't the best place to ask this question. I'd recommend looking and/or
posting a question on StackOverflow, and tagging it with JavaScript. That
way you also reach a potentially much bigger community.

Kind regards,
Geert


From:   on behalf of Nalini Shrma
Reply-To:  MarkLogic Developer Discussion
Date:  Friday, July 7, 2017 at 4:40 AM
To:  MarkLogic Developer Discussion
Subject:  [MarkLogic Dev General] RSASHA256 Javascript code for
security token to get data from API


Hi,
I have written JavaScript code as below to create an HmacSHA256 signature, but I
need to generate the signature with RSASHA256. Can you please help?

public key=
-BEGIN PUBLIC KEY-
MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQDdlatRjRjogo3WojgGHFHYLugdUWAY9iR3fy4
arWNA1KoS8kVw33cJibXr8bvwUAUparCwlvdbH6dvEOfou0/gCFQsHUfQrSDv+MuSUMAe8jzKE4
qW+jK+xQU9a03GUnKHkkle+Q0pX/g6jXZ7r1/xAK5Do2kQ+X5xK9cipRgEKwIDAQAB
-END PUBLIC KEY-

Private Key=
-BEGIN RSA PRIVATE KEY-
MIICWwIBAAKBgQDdlatRjRjogo3WojgGHFHYLugdUWAY9iR3fy4arWNA1KoS8kVw33cJibXr8bv
wUAUparCwlvdbH6dvEOfou0/gCFQsHUfQrSDv+MuSUMAe8jzKE4qW+jK+xQU9a03GUnKHkkle+Q
0pX/g6jXZ7r1/xAK5Do2kQ+X5xK9cipRgEKwIDAQABAoGAD+onAtVye4ic7VR7V50DF9bOnwRwN
XrARcDhq9LWNRrRGElESYYTQ6EbatXS3MCyjjX2eMhu/aF5YhXBwkppwxg+EOmXeh+MzL7Zh284
OuPbkglAaGhV9bb6/5CpuGb1esyPbYW+Ty2PC0GSZfIXkXs76jXAu9TOBvD0ybc2YlkCQQDywg2
R/7t3Q2OE2+yo382CLJdrlSLVROWKwb4tb2PjhY4XAwV8d1vy0RenxTB+K5Mu57uVSTHtrMK0GA
tFr833AkEA6avx20OHo61Yela/4k5kQDtjEf1N0LfI+BcWZtxsS3jDM3i1Hp0KSu5rsCPb8acJo
5RO26gGVrfAsDcIXKC+bQJAZZ2XIpsitLyPpuiMOvBbzPavd4gY6Z8KWrfYzJoI/Q9FuBo6rKwl
4BFoToD7WIUS+hpkagwWiz+6zLoX1dbOZwJACmH5fSSjAkLRi54PKJ8TFUeOP15h9sQzydI8zJU
+upvDEKZsZc/UhT/SySDOxQ4G/523Y0sz/OZtSWcol/UMgQJALesy++GdvoIDLfJX5GBQpuFgFe
nRiRDabxrE9MNUZ2aPFaFp+DyAe+b4nDwuJaW2LURbr8AEZga7oQj0uYxcYw==
  -END RSA PRIVATE KEY-



JS code:




Re: [MarkLogic Dev General] MLCP backward compatibility

2017-06-29 Thread Geert Josten
Hi Rajesh,

The MLCP guide says you need at least MarkLogic 7.0-1: 
http://docs.marklogic.com/guide/mlcp/install#id_44231

MLCP relies on a few xqy libraries that should be present server-side, and they 
were not included in MarkLogic 6 and older. For MarkLogic 6 and before, your 
best option would probably be XQSync and/or RecordLoader, which should be listed 
here: http://developer.marklogic.com/code

Cheers,
Geert

From:  on behalf of Rajesh Kumar
Reply-To: MarkLogic Developer Discussion
Date: Thursday, June 29, 2017 at 11:24 AM
To: "general@developer.marklogic.com"
Subject: [MarkLogic Dev General] MLCP backward compatibility

Hi Team,

Is mlcp backward compatible? Can I use it with MarkLogic version 4.2?

I'm facing errors when trying to use mlcp 8.0.6.3. Kindly let me know if there 
is any other version of mlcp which is backward compatible.

Thanks & Regards,
Rajesh


Re: [MarkLogic Dev General] Jenkins integration with Marklogic roxy deployer framework

2017-06-29 Thread Geert Josten
Hi Santhosh,

I think it is easiest to just run shell commands as build steps, and issue 
commands like `./ml local bootstrap` etc. If you don’t like adding admin 
credentials to deploy/local.properties, you can pass in the admin password with 
something like `--ml.password=`.

Cheers,
Geert

From:  on behalf of "santhosh.rajasekar...@cognizant.com"
Reply-To: MarkLogic Developer Discussion
Date: Thursday, June 29, 2017 at 6:29 AM
To: "general@developer.marklogic.com"
Subject: [MarkLogic Dev General] Jenkins integration with Marklogic roxy 
deployer framework

Hi All,

We are using roxy deployer to deploy code / modules to MarkLogic server.
We are planning to integrate Jenkins for the deployment process.
Can you please let us know the steps or process of integrating roxy project and 
deployment via Jenkins.

Regards,
Santhosh


Re: [MarkLogic Dev General] MarkLogic 9 XSLT bug(?) with attribute matches

2017-06-23 Thread Geert Josten
Hi Inigo,

You are using curly braces inside your XSLT, but your XSLT is in fact literal 
XML embedded in XQuery, so {local-name()} is evaluated by XQuery before the 
xdmp:xslt-eval call, where there is no context item. You need to escape those 
curly braces by doubling them, e.g. {{local-name()}}.
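
A small sketch of the escaping, using a made-up input document and stylesheet (not the ones from the original post, which did not survive in the archive):

```xquery
xquery version "1.0-ml";

(: Hypothetical input and stylesheet, for illustration only.
   The doubled braces in the name attribute reach the XSLT processor
   as single braces, i.e. as an ordinary attribute value template. :)
let $xml := <doc id="a1"/>
let $xslt :=
  <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
    <xsl:template match="doc">
      <out>
        <xsl:apply-templates select="@*"/>
      </out>
    </xsl:template>
    <xsl:template match="@*">
      <xsl:attribute name="{{local-name()}}">
        <xsl:value-of select="."/>
      </xsl:attribute>
    </xsl:template>
  </xsl:stylesheet>
return xdmp:xslt-eval($xslt, $xml)
```

With the doubled braces the id attribute should be copied onto the out element. With single braces, XQuery would try to evaluate local-name() itself while constructing the stylesheet, which is where the missing-context error comes from.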

Cheers,
Geert

From:  on behalf of Inigo Surguy
Reply-To: MarkLogic Developer Discussion
Date: Friday, June 23, 2017 at 10:19 AM
To: "general@developer.marklogic.com"
Subject: [MarkLogic Dev General] MarkLogic 9 XSLT bug(?) with attribute matches

Hi,

The following looks like a bug in MarkLogic's XSLT engine to me. It works fine 
via Saxon:

You can run the following in a QConsole:

-

let $xml := 

let $xslt := http://www.w3.org/1999/XSL/Transform; 
version="2.0">
  

  
  
  


return xdmp:xslt-eval($xslt, $xml)

-

Expected behaviour - output of 
Actual behaviour - XDMP-MISSINGCONTEXT

I've tested this on ML 9.0-1.1, and on ML 8.0-6.3.

Are there any problems with the XSLT above? It looks correct to me, and 
matching the context node inside an attribute match is pretty important for 
being able to do any sort of general transform of the document.

Inigo



Re: [MarkLogic Dev General] Regarding Marklogic space

2017-06-22 Thread Geert Josten
Hi Siva,

It may be wise to reach out to MarkLogic Support as well for more detailed 
guidance, but I can at least try to explain what the graph is showing you.

Your databases have a combined disk footprint of 450 GB. If deleted fragments 
have been merged out fully, that is how much the actual data really needs for 
storage. However, MarkLogic needs additional space for normal maintenance.

It needs extra space for merging. This is because of the MVCC approach where 
any update gets written as a new copy of the document, creating new so-called 
stand files as it goes. Old copies of documents get marked deleted, and are 
cleaned up in the background with merging. Multiple stands get merged into new 
ones while trimming off deleted fragments, after which the old stands are 
purged. That way MarkLogic can ensure it stays performant even after many 
updates.

It also needs extra space for reindexing (in case index settings are changed 
after data was loaded). This causes new index files to be written, which 
replace existing ones as soon as reindexing is done.

Background maintenance like this can occur any time, and for multiple databases 
and forests at the same time. It could very well be that a forest is both 
reindexing and merging at the same time. If all databases are reindexing and 
merging at the same time, you need a lot of extra disk space until that 
finishes.

The rule of thumb is that you need free space of roughly 1.5 times the size of 
your forest data (see also 
http://docs.marklogic.com/guide/relnotes/other#id_43648). Your forest data is 
450 GB, so it would be best to have 675 GB of free disk space. 549 GB is less 
than that, which is why you are seeing an exclamation mark next to the free 
space.
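
As a rough illustration of that arithmetic (plain JavaScript, not MarkLogic 
code; the helper names are made up for this sketch and the 1.5 factor is just 
the rule of thumb quoted above):

```javascript
// Rule of thumb from the message above: free space should be roughly
// 1.5 times the size of the forest data.
const RESERVE_FACTOR = 1.5;

// Hypothetical helper, not a MarkLogic API: recommended free space in GB.
function recommendedFreeGb(forestDataGb) {
  return forestDataGb * RESERVE_FACTOR;
}

// Hypothetical helper: does the current free space meet the rule of thumb?
function hasEnoughFreeSpace(forestDataGb, freeGb) {
  return freeGb >= recommendedFreeGb(forestDataGb);
}

console.log(recommendedFreeGb(450));       // 675
console.log(hasEnoughFreeSpace(450, 549)); // false
```

With 450 GB of forest data and 549 GB free, the check fails, which matches the 
warning on the dashboard.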

Forest Reserve is the amount of free space you need to reserve to be able to 
merge all databases. See also 
http://docs.marklogic.com/guide/monitoring/dashboard#id_60621

Note though that these numbers don’t take MarkLogic backups into account. If 
you are running backup schedules, consider writing them to a separate mount, or 
offload them once finished.

Cheers,
Geert


From: on behalf of "Mani, Sivasubramani (ELS)"
Reply-To: MarkLogic Developer Discussion
Date: Thursday, June 22, 2017 at 11:26 AM
To: MarkLogic Developer Discussion
Cc: ConSyn-Infosys-Support
Subject: [MarkLogic Dev General] Regarding Marklogic space

Hi Team,

I am new to MarkLogic. In the MarkLogic Monitoring Dashboard, one of my 
MarkLogic nodes shows critical disk space. My free disk space is 549 GB, but it 
shows 9.8% capacity. What is meant by capacity here, and how is it calculated? 
If capacity is critical, does it cause any problem for the node? If yes, how 
can we resolve this? Kindly help on this.



[attached image: image001.png]

Thanks & Regards,
Siva



Re: [MarkLogic Dev General] Marklogic - xdmp:filesystem-file

2017-06-22 Thread Geert Josten
Hi Thichxai,

You are reading the entire file as a single value. I’d suggest putting bare 
IDs in your file (no quotes, no commas), one on each line. Then, after reading 
the file, use fn:tokenize to split on line ends before you pass the list into 
your element-value-query.
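
To make the splitting step concrete, here is a minimal sketch in plain 
JavaScript (not MarkLogic code). It shows the same idea as fn:tokenize on line 
ends; the trimming and blank-line filtering are extras added in this sketch, 
and parseIdList is a made-up name:

```javascript
// Turn the raw text of the file into a list with one ID per entry.
function parseIdList(text) {
  return text
    .split(/\r?\n/)                    // split on line ends
    .map(line => line.trim())          // drop stray whitespace
    .filter(line => line.length > 0);  // skip blank lines
}

const fileText = "00013\n00014\n00015\n";
console.log(parseIdList(fileText)); // [ '00013', '00014', '00015' ]
```

Each entry can then be matched as a separate value, instead of the whole file 
being treated as one long string.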

Kind regards,
Geert

From: on behalf of Ly CafeSua
Reply-To: MarkLogic Developer Discussion
Date: Thursday, June 22, 2017 at 6:41 AM
To: "general@developer.marklogic.com"
Subject: [MarkLogic Dev General] Marklogic - xdmp:filesystem-file

I have a text file that contains a list of over 2 thousand customer IDs. When I 
use xdmp:filesystem-file("opt/rec/listid.txt") to make sure it can read all 
customer IDs in the file, it returns all IDs as expected.

"00013",
"00014"
"00015"

However, when I assign the result of xdmp:filesystem-file to a variable as 
below, the result is always zero. I could not figure out what is wrong. Please 
give me some hints or sample code.

let $listID := xdmp:filesystem-file("opt/rec/listid.txt")

let $uris := cts:uris((), (),
  cts:and-query((
    cts:collection-query("/collection/customers"),
    cts:element-value-query(xs:QName("meta:custid"), $listID)
  ))
)

return count($uris)

result: 0

Thanks
Thichxai


Re: [MarkLogic Dev General] Question about bitemporal DB features

2017-06-20 Thread Geert Josten
Hi,

MarkLogic will save complete copies of documents, but whether a JSON file of 
500 KB on disk will really take a footprint of 500 KB of forest data is rather 
hard to predict. Values and property names are mapped to a string data table 
that is stored separately from the structure. If there is a lot of repetition 
in the data, it could be much less than 500 KB per copy. It is best to just try.

By the way, a 500 KB JSON document sounds large. It might be worth looking into 
splitting it into pieces. MarkLogic works best with record-like documents. E.g. 
instead of saving an entire bookstore in one document, save books separately.

Kind regards,
Geert

From: on behalf of Pinku Surana
Reply-To: MarkLogic Developer Discussion
Date: Tuesday, June 20, 2017 at 4:53 PM
To: "general@developer.marklogic.com"
Subject: [MarkLogic Dev General] Question about bitemporal DB features


I'm considering MarkLogic and have a question about the implementation of the 
bitemporal DB feature.

Say I have a 500KB JSON document stored in the DB. I want to update a single 
field in the document 2000 times. Will MarkLogic store a duplicate of the 
entire object (resulting in 1GB of total storage for that object)? Or will it 
only store the difference between the object, hopefully resulting in 
significantly less space consumption?

I want to use this feature to look at the object in the past. I'm hoping 
MarkLogic can store changes efficiently while also reconstructing old versions 
of the object quickly.

Thanks.



Re: [MarkLogic Dev General] SJS: use async callback value in the response body

2017-06-20 Thread Geert Josten
Hi Florent,

As far as I know, event-driven or async processing simply doesn’t work in SJS.
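
If the library happens to invoke its callback synchronously, before the call 
returns (as the simplified sample in the question does), the value can be 
captured in an outer variable. A minimal sketch under that assumption; 
readData here is a stand-in for the library function and buildResponse is a 
made-up name. It does not help if the callback is genuinely deferred:

```javascript
// Stand-in for the library function from the question. Assumption: the
// callback is invoked synchronously, before readData returns.
function readData(callback) {
  callback({ some: 'data' });
}

// Capture the callback argument in an outer variable so it can be used
// in the value the script returns.
function buildResponse() {
  let captured; // stays undefined if the callback has not run yet
  readData(data => { captured = data; });
  if (captured === undefined) {
    throw new Error('callback was not invoked synchronously');
  }
  return 'Response body: ' + captured.some;
}

console.log(buildResponse()); // Response body: data
```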

Cheers,
Geert

From: on behalf of Florent Georges
Reply-To: MarkLogic Developer Discussion
Date: Tuesday, June 20, 2017 at 1:13 PM
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] SJS: use async callback value in the response 
body

Hi,

I am using a JavaScript library that uses callbacks intensively in its API to 
cope with async processing, as is usually the case in the Node universe.

One of the functions is essentially a source for data, which might require an 
async processing to get read.  So it does not return the data, it provides it 
instead as parameter to a callback function.

Because of its async nature, it does not itself return anything returned by 
the callback function, as the latter will typically be called after the 
readData call itself has completed.

This represents more or less that situation:

// imported from a lib, cannot be changed
function readData(callback) {
   // actually not called directly, but it is lib implem details
   callback({ some: 'data' });
}

readData(data => {
   // how to use `data` in the response sent to the client?
   return 'Response body: ' + data.some;
});

The data must be used in the response returned to the client, but there is no 
way to use it, per se, in the return value of the script.  Because this is 
quite a common pattern in server-side JavaScript, I thought maybe there is a 
pattern addressing the issue in MarkLogic?

Any thought on this?

Regards,

--
Florent Georges
H2O Consulting
http://h2o.consulting/




Re: [MarkLogic Dev General] MLCP Error Return

2017-06-20 Thread Geert Josten
Ah thanks, that is being monitored by Engineering too. I added a link to this 
mail thread.

Cheers

From: <general-boun...@developer.marklogic.com> on behalf of Hans Hübner <hans.hueb...@lambdawerk.com>
Reply-To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Date: Tuesday, June 20, 2017 at 9:33 AM
To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Subject: Re: [MarkLogic Dev General] MLCP Error Return

On Tue, Jun 20, 2017 at 9:25 AM, Geert Josten <geert.jos...@marklogic.com> wrote:
> I am not entirely sure about the reasoning behind the logic. It may be to 
> continue processing as many files as possible, without stopping. It may also be 
> that MLCP wasn’t really designed to be used in an embedded way. If you are 
> really looking to automate processing, I think DMSDK (which was released 
> together with MarkLogic 9) would be a better fit. It doesn’t come with a 
> command-line tool though.

It is just that MLCP is the recommended tool.  I'll have a look at DMSDK 
nevertheless, thanks!

> Let me file a bug report for it. Could you tell me MLCP and MarkLogic version?

We're running MarkLogic 9.0-1.1 and MLCP 9.0.1.  I've also filed an issue in 
GitHub:   https://github.com/marklogic/marklogic-contentpump/issues/62

-Hans


Kind regards,
Geert

From: <general-boun...@developer.marklogic.com> on behalf of Hans Hübner <hans.hueb...@lambdawerk.com>
Reply-To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Date: Tuesday, June 20, 2017 at 9:05 AM
To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Subject: Re: [MarkLogic Dev General] MLCP Error Return

I have reviewed the source code of MLCP and found that the problem with the 
missing exit codes is caused by its error handling strategy, which basically is 
"log the error and continue".  Was it a conscious decision to have MLCP deal 
with errors like this?  If not, would it maybe be a quick and clean way out to 
make MLCP exit with a non-zero error status whenever an error is encountered?  
This would be relatively easy to implement by changing the calls to LOG.error 
to a function that logs the error and exits, and the behavior could even be 
made optional so that if someone relies on MLCP to continue when an error is 
encountered, they can have that.

I find the overall error handling strategy rather puzzling, though.  In my 
experience, continuing after errors without corrective action calls for trouble.

-Hans

On Sat, Jun 17, 2017 at 9:09 PM, Hans Hübner <hans.hueb...@lambdawerk.com> wrote:
Hello Geert,

thank you for getting back.  I have tried invoking the jar directly, but I 
still get no meaningful exit status:

imogas 1245_% java -cp 
/opt/mlcp-9.0.1/bin/..//conf:/opt/mlcp-9.0.1/bin/..//lib/avro-1.7.4.jar[ELIDED] 
-DCONTENTPUMP_HOME=/opt/mlcp-9.0.1/bin/..//lib/ -DBUNDLE_ARTIFACT=apache 
-Dfile.encoding=UTF-8 -Djava.library.path=/opt/mlcp-9.0.1/bin/..//lib/native 
com.marklogic.contentpump.ContentPump import -input_file_path foo -host 
localhost -username foo -password foo
17/06/17 15:04:53 INFO contentpump.LocalJobRunner: Content type is set to 
MIXED.  The format of the  inserted documents will be determined by the MIME  
type specification configured on MarkLogic Server.
17/06/17 15:04:54 INFO contentpump.ContentPump: Job name: local_675746565_1
17/06/17 15:04:54 ERROR contentpump.LocalJobRunner: Error checking output 
specification:
17/06/17 15:04:54 ERROR contentpump.LocalJobRunner: No input files found with 
the specified input path file:/home/hans/Development/bpm-processes/foo and 
input file pattern .*
imogas 1246_% echo $?
0

Is there anything I'm doing wrong?

-Hans



--
LambdaWerk GmbH
Oranienburger Straße 87/89
10178 Berlin
Phone: +49 30 555 7335 0
Fax: +49 30 555 7335 99

HRB 169991 B Amtsgericht Charlottenburg
USt-ID: DE301399951
Geschäftsführer:  Hans Hübner

http://lambdawerk.com/





Re: [MarkLogic Dev General] MLCP Error Return

2017-06-20 Thread Geert Josten
Hi Hans,

I am not entirely sure about the reasoning behind the logic. It may be to 
continue processing as many files as possible, without stopping. It may also be 
that MLCP wasn’t really designed to be used in an embedded way. If you are 
really looking to automate processing, I think DMSDK (which was released 
together with MarkLogic 9) would be a better fit. It doesn’t come with a 
command-line tool though.

Let me file a bug report for it. Could you tell me MLCP and MarkLogic version?

Kind regards,
Geert



Re: [MarkLogic Dev General] cts:element-value-match for integers

2017-06-19 Thread Geert Josten
The only workaround using the int index that I could think of would be to use 
ranges with range-queries. Something like:

let $pattern := 200
return cts:or-query((
  for $i in 0 to 10
  let $power := xs:int(math:pow(10, $i))
  let $start := $pattern * $power
  let $end := ($pattern + 1) * $power
  return cts:and-query((
    cts:element-range-query(xs:QName("element"), ">=", $start),
    cts:element-range-query(xs:QName("element"), "<", $end)
  ))
))
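
To show which ranges that query would cover, here is a plain JavaScript sketch 
of the same arithmetic (not MarkLogic code; prefixRanges and matchesPrefix are 
made-up names). It assumes a positive prefix and that the int index holds 
32-bit signed values, so it stops once a range would start beyond 2147483647:

```javascript
// Upper bound of a 32-bit signed integer (assumed index value range).
const INT_MAX = 2147483647;

// For a numeric prefix such as 200, list the half-open ranges [start, end)
// that contain every integer whose decimal digits start with that prefix:
// [200, 201), [2000, 2010), [20000, 20100), ...
function prefixRanges(prefix, maxValue = INT_MAX) {
  const ranges = [];
  for (let power = 1; prefix * power <= maxValue; power *= 10) {
    ranges.push({ start: prefix * power, end: (prefix + 1) * power });
  }
  return ranges;
}

// True when value falls inside any of the ranges for the prefix.
function matchesPrefix(value, prefix) {
  return prefixRanges(prefix).some(r => value >= r.start && value < r.end);
}

console.log(prefixRanges(200).length);    // 8
console.log(matchesPrefix(2002123, 200)); // true
console.log(matchesPrefix(2100, 200));    // false
```

Each range corresponds to one and-query of a ">=" and a "<" range query in the 
XQuery above.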

Cheers

From: on behalf of Evan Lenz
Reply-To: MarkLogic Developer Discussion
Date: Monday, June 19, 2017 at 11:29 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] cts:element-value-match for integers

Hi Oleksii,

You'll have to create a string range index too. I've never used 
cts:element-value-match() with anything but a string index, but it looks like 
it's possible: you just get back the value you asked for, if it exists. And now 
that I think about it, that can be rather useful! So thanks for effectively 
cluing me in to this use case! :-)

The docs for the $pattern parameter say: "The parameter type must match the 
lexicon type. String parameters may include wildcard characters."[1]

So that seems to confirm that wildcards can only be used with string-typed 
range indexes.

Evan

[1] http://docs.marklogic.com/cts:element-value-match#pattern



Evan Lenz
President, Lenz Consulting Group, Inc.
http://lenzconsulting.com
+1 (206) 898-1654

On Mon, Jun 19, 2017 at 12:23 PM, Oleksii Segeda wrote:
Christopher,

It gives false positives if I use it with cts:element-values.

Shan,

The rule is to find all values which start with the given value. For example, 
200 should match 200, 2001, 2002, ... 20010, 20020, 2002123, etc.
Are you suggesting to guess all possible combinations? If so, it's not possible.

As I said, I need something like this (pseudo code):

cts:element-value-match(xs:QName("element"), "200*")

except that I don't have a string range index on that field, but I do have an 
int range index instead.

Best,
Oleksii.

-Original Message-
From: general-boun...@developer.marklogic.com [mailto:general-boun...@developer.marklogic.com] On Behalf Of Shan Jiang
Sent: Monday, June 19, 2017 2:06 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] cts:element-value-match for integers

What is your exact search rule? From your example, it looks like you are trying
to look for another number by appending a "0". If that is the case, can you run
a cts:or-query, one for 200, and one for 2000?

Shan Jiang
Principal Consultant
MarkLogic Corporation
shan.ji...@marklogic.com
Phone: +1 703 869 4672
www.marklogic.com 






On 6/19/17, 12:59 PM, "general-boun...@developer.marklogic.com on behalf of
Oleksii Segeda" <oseg...@worldbankgroup.org> wrote:

>Hi everyone,
>
>Any thoughts on this?
>
>Oleksii.
>
>
>-Original Message-
>From: Oleksii Segeda
>Sent: Friday, June 16, 2017 6:16 PM
>To: general@developer.marklogic.com
>Subject: cts:element-value-match for integers
>
>Hi everyone,
>
>Can someone explain how does cts:element-value-match work with integer
>indexes? I cannot pass a string as a second argument, so it's unclear how
>to do a wildcarded search.
>Ultimate goal is to find 2000 and 200, if user typed 200. I understand
>that I can create an additional string index, but I want to know if a
>better solution exists.
>
>Thanks.
>
>Oleksii Segeda
>IT Analyst
>Information and Technology Solutions
>www.worldbank.org
>
>
>
>

Re: [MarkLogic Dev General] MLCP Error Return

2017-06-17 Thread Geert Josten
Hi Hans, Tim,

To my knowledge the Java code does return with exit statuses depending on 
outcome. It looks as though they are not properly propagated through 
mlcp.sh/bat. I’ll see if I can file a bug report for this. In the meantime you 
could try invoking the jar directly, following the rules in the mlcp scripts.
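
As a stopgap, the captured output can be scanned for ERROR lines. This is the 
brittle log-parsing approach mentioned further down the thread, sketched here 
in plain JavaScript. The sample lines are copied from the MLCP output quoted 
below; the "date time LEVEL logger: message" layout is inferred from those 
lines, and both function names are made up for this sketch:

```javascript
// Lines taken from the MLCP output quoted in this thread.
const sampleLog = [
  '17/05/11 15:10:43 INFO contentpump.LocalJobRunner:  completed 100%',
  '17/05/11 15:10:43 ERROR mapreduce.ContentWriter: XDMP-LOCKED: Document or Directory is locked',
  '17/05/11 15:10:43 WARN mapreduce.ContentWriter: Failed document /utils/a.xqy in file:/modules/utils/a.xqy',
  '17/05/11 15:10:46 INFO contentpump.LocalJobRunner: OUTPUT_RECORDS_FAILED: 0',
].join('\n');

// Return the log lines whose level field is ERROR.
function findErrorLines(logText) {
  return logText
    .split(/\r?\n/)
    .filter(line => /^\S+ \S+ ERROR /.test(line));
}

// Hypothetical helper: exit status 1 if any ERROR line was logged, else 0.
function exitStatusFor(logText) {
  return findErrorLines(logText).length > 0 ? 1 : 0;
}

console.log(exitStatusFor(sampleLog)); // 1
```

Note that in the quoted log OUTPUT_RECORDS_FAILED is 0 even though documents 
failed, so the sketch looks at the ERROR lines rather than at the counters.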

Kind regards,
Geert

From: on behalf of Hans Hübner
Reply-To: MarkLogic Developer Discussion
Date: Saturday, June 17, 2017 at 8:44 AM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] MLCP Error Return

Hi,

I would like to express our interest in seeing this be fixed as well.  Having 
to parse mlcp's output to determine whether it failed is rather brittle.  What 
is MarkLogic's approach to robust automation of processes that involve mlcp?

Thanks,
Hans

On Fri, May 12, 2017 at 2:33 PM, Timothy Pearce wrote:
Hello,

I’m currently working on improving our continuous integration system, which 
uses Jenkins. It deploys our modules into the database via MLCP and the XDBC 
server. I am seeing one issue with the mlcp script: it does not return any 
error via its exit code when it fails to update a document because it is 
locked. It returns exit 0, with “exit $?” tacked on to the end of the run after 
the mlcp.sh script is called. Is there any way to get it to throw a non-zero 
exit code on error? I’m not seeing any reference in the mlcp documentation to 
checking whether the script ran successfully, or to parameters that add that. 
Here is a snippet of mlcp’s logging which shows the error:

17/05/11 15:10:41 INFO contentpump.FileAndDirectoryInputFormat: Total input 
paths to process : 14
17/05/11 15:10:43 INFO contentpump.LocalJobRunner:  completed 100%
17/05/11 15:10:43 ERROR mapreduce.ContentWriter: XDMP-LOCKED: Document or 
Directory is locked
17/05/11 15:10:43 WARN mapreduce.ContentWriter: Failed document /utils/a.xqy in 
file:/modules/utils/a.xqy
17/05/11 15:10:46 ERROR mapreduce.ContentWriter: XDMP-LOCKED: Document or 
Directory is locked
17/05/11 15:10:46 WARN mapreduce.ContentWriter: Failed document /utils/b.xqy in 
file:/modules/utils/b.xqy
17/05/11 15:10:46 INFO contentpump.LocalJobRunner: 
com.marklogic.mapreduce.MarkLogicCounter:
17/05/11 15:10:46 INFO contentpump.LocalJobRunner: INPUT_RECORDS: 143
17/05/11 15:10:46 INFO contentpump.LocalJobRunner: OUTPUT_RECORDS: 143
17/05/11 15:10:46 INFO contentpump.LocalJobRunner: OUTPUT_RECORDS_COMMITTED: 143
17/05/11 15:10:46 INFO contentpump.LocalJobRunner: OUTPUT_RECORDS_FAILED: 0
17/05/11 15:10:46 INFO contentpump.LocalJobRunner: Total execution time: 4 sec
+ exit 0

I’ve verified that the file was not overwritten by adding data to a file 
stored in MarkLogic, using mlcp to replace the file, and then querying with 
fn:doc to check whether the data was removed. It did not replace the file. Any 
guidance on verifying that the files were actually written into MarkLogic 
would be appreciated.

Thanks,
Tim









Re: [MarkLogic Dev General] Accessing properties of in-memory JS object in XQuery

2017-06-16 Thread Geert Josten
It is a json:object (a specialization of map:map).

Try:

xdmp.xqueryEval(
  'declare variable $obj external; map:get($obj, "name")',
  { obj: {name: 'name', title: 'title' }});


Cheers,
Geert


On 6/16/17, 9:27 PM, "general-boun...@developer.marklogic.com on behalf of
Florent Georges"  wrote:

>Hi,
>
>I have an SJS script that calls a function from an XQuery library.  It
>passes a JS object to the function.  The function needs to access the
>value of one property of the object (in this case, a string).
>
>I can't find in the documentation how XQuery code can navigate through
>the properties of an in-memory JS object.  Any idea?
>
>A self-contained example (my code require() an XQuery library and
>calls a function instead of using code evaluation, but the issue is
>the same):
>
>xdmp.eval(
>  `declare variable $obj as external;
>   $obj ! xs:string(name)`,
>  { name: 'name', title: 'title' });
>
>I am using ML 9.
>
>Regards,
>
>-- 
>Florent Georges
>H2O Consulting
>http://h2o.consulting/


Re: [MarkLogic Dev General] Host restart issue after joining cluster

2017-06-14 Thread Geert Josten
Hi Rajesh,

It is important that both hosts can see each other. They reach out using the 
host names defined within the MarkLogic configuration. Make sure both hosts can 
see the other using that. It is also important that the relevant ports are not 
blocked. If I am not mistaken, that includes 7998 up to 8002.

Did you also check the service status and error logs on the second host, and if 
that is okay, what does the Admin UI say on the second host?

It could also just be a temporary issue. I believe hosts reach out regularly to 
check connectivity, so status may have changed after waiting a few minutes.

Cheers,
Geert

From: on behalf of Rajesh Kumar
Reply-To: MarkLogic Developer Discussion
Date: Wednesday, June 14, 2017 at 1:11 PM
To: "general@developer.marklogic.com"
Subject: [MarkLogic Dev General] Host restart issue after joining cluster

Hi Team,

I'm testing clustering in MarkLogic v8. I tried joining a host to an existing 
one. On the Join Cluster page I chose the group Default, as that is the only 
group available on the first host. I received a message that the host had 
joined the cluster, and host 2 restarted but never became active thereafter. 
On host 1, its status showed as disconnected.

We used an enterprise license for v8. Both hosts are running on Windows 7 with 
the same configuration, and both are new installations.

Thanks & Regards,
Rajesh


Re: [MarkLogic Dev General] Help :- Ingested document enrichment

2017-06-01 Thread Geert Josten
Hi Pavan,

To my knowledge, most of these do pretty straight-forward HTTP calls, which 
should work for other public enrichment sites as well. Doing an HTTP call using 
xdmp:http-get or xdmp:http-post is for sure the easiest way to integrate, and 
should work well from inside a pipeline action.

Kind regards,
Geert

From: on behalf of GUPTA Pavan
Reply-To: MarkLogic Developer Discussion
Date: Thursday, June 1, 2017 at 7:52 AM
To: "general@developer.marklogic.com"
Cc: SHARMA Archana
Subject: [MarkLogic Dev General] Help :- Ingested document enrichment

Hi Team,

I am able to search binary documents after the CPF setup, but I want to 
enhance the search capability, so I thought of using the enrichment pipelines 
that are available in MarkLogic. I found the five pipelines below:

  I. TEMIS Luxid(R) Entity Enrichment Pipeline
 II. Calais Entity Enrichment Pipeline
III. SRA NetOwl Entity Enrichment Pipeline
 IV. Janya Entity Enrichment Pipeline
  V. Data Harmony Enrichment Pipeline

I have explored these and found that none is open source. I am interested in 
enriching ingested documents in the same way the above pipelines do.

Can you please suggest a way to do this, or how I can integrate open source 
NLP or machine learning libraries?

Thanks in Advance!

Regards,
Pavan


Re: [MarkLogic Dev General] Restore reindex

2017-05-31 Thread Geert Josten
Hi Andreas,

I think this is something for support. Can you mail them, or reach out to your 
local MarkLogic contact?

Kind regards,
Geert

From: on behalf of Andreas Holzgethan
Reply-To: MarkLogic Developer Discussion
Date: Wednesday, May 31, 2017 at 7:26 AM
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] Restore reindex

Hi!

We currently have a problem with indexing during the restore of a backup.
We create a backup from one MarkLogic database and restore it to another one.

The settings are the same on both. But when the restore has finished, the 
documents are not indexed correctly, so searches on the restored database 
return wrong/different results.

The "reindexer enable" setting is set to true.
Is there something else we have to check?

Manually starting a reindex solves the problem, but it would be great to skip 
this step.

Best regards,
Andreas Holzgethan

Andreas Holzgethan BSc.
IT Consultant

EBCONT enterprise technologies GmbH
Millennium Tower
Handelskai 94-96
1200 Wien

Mobil: +43 664 606 517 05
Email:andreas.holzget...@ebcont.com
Web:http://www.ebcont-et.com/

OUR TEAM IS YOUR SUCCESS

HG St. Pölten - FN 293731 h
UID: ATU63444589
___
General mailing list
General@developer.marklogic.com
Manage your subscription at: 
http://developer.marklogic.com/mailman/listinfo/general


Re: [MarkLogic Dev General] Mlcp transform to break big aggregate xml file to different invidual files

2017-05-30 Thread Geert Josten
Hi Manoj,

Keep in mind MLCP transforms receive one $content map:map, but are allowed to 
return multiple, each representing a document that needs to be persisted. Just 
return several map:map items, each with a unique `uri` and `value` entry.

I’d recommend combining that with the -aggregate_record_element option set to 
employee for best scalability.
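
The shape of that idea can be modelled in plain JavaScript (this is not an 
actual MLCP transform, just a sketch of one content item becoming several). 
The info and data parts come from the question below; their contents, the URI 
scheme and the name splitContent are made up for illustration:

```javascript
// One incoming content object is turned into several, each with its own
// uri and value. Plain objects stand in for the map:map items here.
function splitContent(content) {
  const base = content.uri.replace(/\.xml$/, '');
  const parts = [...content.value.info, ...content.value.data];
  return parts.map((part, index) => ({
    uri: `${base}/${index + 1}.xml`, // unique URI per output document
    value: part,
  }));
}

// Hypothetical input: one employee record with two info and two data parts.
const employee = {
  uri: '/employees/1.xml',
  value: { info: ['info-1', 'info-2'], data: ['data-1', 'data-2'] },
};

console.log(splitContent(employee).map(doc => doc.uri));
```

For this input the sketch yields four output items, one per info or data part, 
each under its own URI.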

Kind regards,
Geert

From: on behalf of manoj viswanadha
Reply-To: MarkLogic Developer Discussion
Date: Tuesday, May 30, 2017 at 8:19 AM
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] Mlcp transform to break big aggregate xml file 
to different invidual files

Hi Team,

I have a requirement to take an aggregate XML file as input, break that file 
into different individual XML files, and commit them to the database.

At the individual document level I have my content and a single URI; when I 
break it into multiple documents, I have more than one document to be created, 
with multiple URIs.

Sample data:
















I want to create 4 different documents 2 from info and 2 from data and load 4 
documents from mlcp transform at first employee level. Similarly for other 
employee levels.


Is there any way to solve this using mlcp with transform?

Thanks,
Manoj Viswanadha


Re: [MarkLogic Dev General] Invoke SJS from XQuery

2017-05-29 Thread Geert Josten
http://docs.marklogic.com/xdmp:invoke

$path The path of the module to be executed as a string. The path is resolved 
against the root of the App Server evaluating the query, the Modules directory, 
or relative to the calling module. The module is considered to be JavaScript if 
the module path ends with a file extension matching the ones configured for 
application/vnd.marklogic-javascript in MarkLogic's Mimetypes configuration. 
For details on resolving paths, see "Importing XQuery Modules and Resolving 
Paths" in the Application Developer's Guide.
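
In other words, pointing xdmp:invoke at a path ending in .sjs should be enough, assuming the default mimetype configuration. A minimal sketch, with a hypothetical module path and variable name; the value is passed as an external variable rather than concatenated into an eval string:

```xquery
xquery version "1.0-ml";

(: "/lib/report.sjs" and the "href" variable are hypothetical. :)
xdmp:invoke(
  "/lib/report.sjs",
  (xs:QName("href"), "/docs/example.xml")
)
```

The invoked JavaScript module would then read the value from a global variable named `href`, typically declared with `var href;`.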

Cheers,
Geert

From: on behalf of Florent Georges
Reply-To: MarkLogic Developer Discussion
Date: Monday, May 29, 2017 at 9:36 PM
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] Invoke SJS from XQuery

Hi,

Is there any way to invoke a JavaScript script from XQuery, the same way it is
possible to invoke an XQuery module using xdmp:invoke()?

The only way I can think of is using xdmp:javascript-eval(), constructing an
expression using 'require(' || $href || ')'.  But it would be nice not to have
to rely on string concatenation and eval.  You know, little bobby tables...

Regards,

--
Florent Georges
H2O Consulting
http://h2o.consulting/




Re: [MarkLogic Dev General] Clarification :- Binary Document Search

2017-05-29 Thread Geert Josten
Hi Ankur,

The built-in pipeline `Document Filtering (Properties)` should be able to 
handle those. Just add them to the domain you’d like to use. Here is the 
section of the CPF guide on how to do that using the Admin UI: 
http://docs.marklogic.com/guide/cpf/domains#id_40535

For your reference, these are the formats supported by xdmp:document-filter: 
http://docs.marklogic.com/guide/search-dev/binary-document-metadata#id_68368

Kind regards,
Geert

From: MEHROTRA Ankur <ankur.mehro...@soprasteria.com>
Date: Monday, May 29, 2017 at 12:57 PM
To: MarkLogic Developer Discussion <general@developer.marklogic.com>, 
Geert Josten <geert.jos...@marklogic.com>
Cc: GUPTA Pavan <pavan.gu...@soprasteria.com>, SHARMA Archana 
<archana.sha...@soprasteria.com>, MarkLogic Developer Discussion 
<general@developer.marklogic.com>
Subject: RE: [MarkLogic Dev General] Clarification :- Binary Document Search

Hi Geert,

Can we have an option for configuring built-in CPF pipelines for MP3/Video 
files?

Thanks in advance,
Ankur Mehrotra

From: MEHROTRA Ankur
Sent: Monday, May 29, 2017 1:50 PM
To: MarkLogic Developer Discussion
Cc: GUPTA Pavan; SHARMA Archana
Subject: Re: [MarkLogic Dev General] Clarification :- Binary Document Search


Thanks a ton for such a useful response.

Thanks,
Ankur Mehrotra



Re: [MarkLogic Dev General] Clarification :- Binary Document Search

2017-05-29 Thread Geert Josten
Hi Ankur,

That is kind of by design. MarkLogic does not search binaries directly. Instead 
you can apply xdmp:document-filter (which uses a built-in 3rd party library) to 
scrape about 200 different formats for text and metadata. The result is XHTML, 
and can be saved in document properties or as separate documents. This is 
represented in the built-in CPF Conversion pipelines as the `Document Filtering 
(Properties)` and `Document Filtering (XHTML)`. These are not enabled by 
default.
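
As a rough illustration of what the filtering step does outside of CPF, the sketch below filters one binary and stores the XHTML as a separate document. The URIs are hypothetical, and the `.xhtml` suffix is just a choice made for this example, not necessarily what the built-in pipelines use:

```xquery
xquery version "1.0-ml";

(: Hypothetical URI of an already ingested binary document. :)
let $uri := "/binaries/report.pdf"
(: xdmp:document-filter returns XHTML with the extracted text and metadata. :)
let $filtered := xdmp:document-filter(fn:doc($uri))
(: Store the result as a separate, searchable document next to the binary. :)
return xdmp:document-insert(fn:concat($uri, ".xhtml"), $filtered)
```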

MarkLogic also comes with functions like xdmp:pdf-convert and 
xdmp:word-convert. These usually yield better results, but work for very 
specific formats only. The built-in CPF Conversion pipelines that are enabled 
by default (Conversion Processing, DocBook Conversion, HTML Conversion, MS 
Office Conversion, PDF Conversion) make use of these, and attempt to further 
enhance the results, and convert into DocBook XML Format. These always store 
results as separate documents.

Simplest solution might be to use the `Document Filtering (Properties)` 
instead, and toggle searching to search over properties instead of over 
documents, but searching over properties can have performance impact (extra 
join between document and properties fragments), and makes combined search over 
binaries and non-binaries more difficult (potential need for 
fragment-scope-queries and such).

You could also just take the URIs returned from your current search, and use string 
manipulation on each URI to get the link to the original binary. If memory serves me 
right, it is always the original URI with something like ‘.xml’ or ‘.xhtml’ 
appended to it.

Kind regards,
Geert

From: on behalf of MEHROTRA Ankur
Reply-To: MarkLogic Developer Discussion
Date: Monday, May 29, 2017 at 8:41 AM
To: "general@developer.marklogic.com"
Cc: GUPTA Pavan, SHARMA Archana
Subject: Re: [MarkLogic Dev General] Clarification :- Binary Document Search

Any update on this?

From: MEHROTRA Ankur
Sent: Thursday, May 25, 2017 5:36 PM
To: 'general@developer.marklogic.com'
Cc: GUPTA Pavan; SHARMA Archana
Subject: Clarification :- Binary Document Search

Hi Team,

I have gone through the 'https://docs.marklogic.com/... to set up the pipeline 
to make binary documents searchable. I can see that .xml and .xhtml documents are 
generated from each ingested file (for instance .doc/.docx/.pdf). When I 
search using a Java Client API search query, the results come from the 
generated XML file rather than from the ingested file, so the response 
contains the URI of the generated XML file. However, I need to point 
to the URI of the main document file, because I need to show it on screen. How can I 
achieve this?

We have used the code below to get the converted document URI (for example the .xml 
file), but I need the ingested document’s URI.


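import com.marklogic.client.DatabaseClient;
import com.marklogic.client.DatabaseClientFactory;
import com.marklogic.client.io.SearchHandle;
import com.marklogic.client.query.MatchDocumentSummary;
import com.marklogic.client.query.QueryManager;
import com.marklogic.client.query.StringQueryDefinition;

DatabaseClient client = DatabaseClientFactory.newClient(Config.host, 
    Config.port, Config.user, Config.password, Config.authType);

// create a manager for searching
QueryManager queryMgr = client.newQueryManager();

// build a string query for the term "text"
StringQueryDefinition query = queryMgr.newStringDefinition();
query.setCriteria("text");

// run the search and collect the matches
SearchHandle resultsHandle = new SearchHandle();
queryMgr.search(query, resultsHandle);

// print the URI of every matching document
MatchDocumentSummary[] results = resultsHandle.getMatchResults();
for (MatchDocumentSummary result : results) {
    System.out.println(result.getUri());
}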


Thanks and regards,
Ankur Mehrotra


Re: [MarkLogic Dev General] concurrent invocation of xquery ending up with duplicate writes

2017-05-23 Thread Geert Josten
Hi Raghu,

The best way to ensure that concurrent threads do not create a document at the same 
URI *is* to use locks. Here is code and some explanation on how best to do that: 
http://registry.demo.marklogic.com/package/ml-unique
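
To give an idea of the pattern (this is not the ml-unique code, just a sketch under assumptions), a writer module could take a write lock on a stable URI derived from the logical key before it checks for an existing document. The `$key` variable, the `/locks/random/` prefix, the `random` collection and the `key` element are all hypothetical:

```xquery
xquery version "1.0-ml";

(: writer.xqy: hypothetical sketch; $key identifies the logical record. :)
declare variable $key as xs:string external;

(: Serialize concurrent writers on a stable lock URI derived from the key.
   Other transactions asking for the same lock wait until this one commits. :)
let $_ := xdmp:lock-for-update(fn:concat("/locks/random/", $key))
let $existing :=
  cts:search(
    fn:collection("random"),
    cts:element-value-query(xs:QName("key"), $key)
  )
return
  if (fn:exists($existing)) then
    ()
  else
    xdmp:document-insert(
      fn:concat("/random/", xdmp:random(), ".xml"),
      <random><key>{$key}</key></random>,
      xdmp:default-permissions(),
      "random"
    )
```

Because the existence check and the insert run in the same transaction behind the lock, a second invocation with the same key should only get past the lock after the first has committed, and then find the document already there.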

Cheers,
Geert

From: on behalf of Raghu
Reply-To: MarkLogic Developer Discussion
Date: Tuesday, May 23, 2017 at 8:54 PM
To: General MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] concurrent invocation of xquery ending up with 
duplicate writes

All,

I have a reader.xqy, which only performs read operations and does not write to the 
forest, except for one document insert. I don't want that reader query to obtain 
locks on all referenced documents, so I moved the document insert logic to a 
separate writer.xqy and invoke it from reader.xqy.

My current logic is

if random-xml already exists

DO NOTHING

else INSERT RANDOM-XML

The problem I am facing is,

when I invoke reader.xqy using multiple threads concurrently, I end 
up with duplicate random-xml documents even though I have validations in place. How do I 
make sure that, even if reader.xqy is invoked concurrently by several 
threads, only one of the invocations inserts an XML document?


Note: I need that random-xml inserted, before the reader.xqy completes 
execution and the URI of the random-xml involves dynamically generated ID and 
NOT a constant URI.

Thanks in advance
Raghu



Re: [MarkLogic Dev General] Processing Large Number of Docs to Get Statistics

2017-05-23 Thread Geert Josten
Hi Eliot,

I’d consider using taskbot
(http://registry.demo.marklogic.com/package/taskbot), and using that in
combination with either $tb:OPTIONS-SYNC or $tb:OPTIONS-SYNC-UPDATE. It
will make optimal use of the TaskServer of the host on which you initiate
the call. It doesn’t scale endlessly, but it batches up the work
automatically for you, and will get you a lot further fairly easily.

Cheers,
Geert

On 5/23/17, 5:43 AM, "general-boun...@developer.marklogic.com on behalf of
Eliot Kimber"  wrote:

>I haven’t yet seen anything in the docs that directly addresses what I’m
>trying to do and suspect I’m simply missing some ML basics or just going
>about things the wrong way.
>
>I have a corpus of several hundred thousand docs (but could be millions,
>of course), where each doc is an average of 200K and several thousand
>elements.
>
>I want to analyze the corpus to get details about the number of specific
>subelements within each document, e.g.:
>
>
>for $article in cts:search(/Article, cts:directory-query("/Default/",
>"infinity"))[$start to $end]
> return paras="{count($article//p)}"/>
>
>I’m running this as a query from Oxygen (so I can capture the results
>locally so I can do other stuff with them).
>
>On the server I’m using I blow the expanded tree cache if I try to
>request more than about 20,000 docs.
>
>Is there a way to do this kind of processing over an arbitrarily large
>set *and* get the results back from a single query request?
>
>I think the only solution is to write the results back to the database
>and then fetch that as the last thing but I was hoping there was
>something simpler.
>
>Have I missed an obvious solution?
>
>Thanks,
>
>Eliot
>
>--
>Eliot Kimber
>http://contrext.com


Re: [MarkLogic Dev General] Priorities for queries

2017-05-23 Thread Geert Josten
Hi Oleksii,

If you use xdmp:spawn or xdmp:spawn-function, you would be able to use the 
`priority` option. It takes ‘normal’ and ‘higher’ as values. These priorities 
have separate queues and worker threads, so they should interfere less with 
each other.
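
A minimal sketch of a spawn call that sets the priority explicitly; the module path and the external variable are hypothetical:

```xquery
xquery version "1.0-ml";

(: Module path and variable are hypothetical. :)
xdmp:spawn(
  "/tasks/script-query.xqy",
  (xs:QName("request-id"), "r-001"),
  <options xmlns="xdmp:eval">
    <priority>normal</priority>
  </options>
)
```

Work on behalf of real users would use `<priority>higher</priority>` instead, so that script-initiated tasks queue separately.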

It might also be worth looking into a way to push low-priority work out to a 
dedicated host for longer-running tasks. You could do that by writing such 
queries to the database and having a scheduled task on that particular host 
monitor for them, pick them up one by one, and write back the results 
once done. On the script side, it might be easiest to switch to an 
asynchronous process that polls regularly to see if results have been written. 
Does that make sense?

Cheers,
Geert

From: on behalf of Oleksii Segeda
Reply-To: MarkLogic Developer Discussion
Date: Monday, May 22, 2017 at 8:59 PM
To: "general@developer.marklogic.com"
Subject: [MarkLogic Dev General] Priorities for queries

Hi,

Is there a way to give a lower priority to certain queries? We have two 
different types of API consumers – real users and various scripts.
No matter how often scripts hit the endpoints or how “heavy” their 
queries, they should not affect API performance for real users.
In other words, scripts are tolerant of high latency, but users are not.

Regards,

Oleksii Segeda

IT Analyst

Information and Technology Solutions

www.worldbank.org


Re: [MarkLogic Dev General] MLCP vs backup/restore

2017-05-16 Thread Geert Josten
Hi Rajesh,

I’d expect backup/restore to perform much faster. It essentially makes copies 
of forest stands at the filesystem level, which is very different from what MLCP 
does. It also includes journals and, if selected, Security data too.

Getting backup data off the system might be a different question though, but 
I’d expect an scp or mirror of backup files to outperform MLCP too.

Cheers,
Geert

From: on behalf of Rajesh Kumar
Reply-To: MarkLogic Developer Discussion
Date: Tuesday, May 16, 2017 at 8:59 AM
To: "general@developer.marklogic.com"
Subject: [MarkLogic Dev General] MLCP vs backup/restore

Hi Team,

Which is the best approach for data backup in MarkLogic in terms of 
performance and time: using MLCP, or backup/restore at the database level?

Regards,
Rajesh


Re: [MarkLogic Dev General] MarkLogic 9 Redaction

2017-05-10 Thread Geert Josten
Hi Tulasi,

With FlexRep, you can configure a push or a pull approach, both driven from CPF if 
I recall correctly. In both cases you can add your own pipelines to do whatever 
is needed to get the right content replicated in the right way. I think in your 
case you might need to use the push approach, since you need to run redaction on 
the master side. There are no specific docs on doing Redaction with FlexRep, but it 
is a useful use case.

@DaveC, would this be worth adding as a new chapter to the Redaction or FlexRep guide?

Kind regards,
Geert

From: on behalf of "Guraja, Tulasi"
Reply-To: MarkLogic Developer Discussion
Date: Wednesday, May 10, 2017 at 4:16 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] MarkLogic 9 Redaction

Hi Dave,

Thank you for your response. I have referred to the FlexRep documentation and 
couldn’t find any mention of Redaction. Am I missing something?

Can you please provide me more insights on this?

Thanks,
Tulasi Guraja
IWM IT
+1 212 325 5954 (*105 5954)

From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Dave Cassel
Sent: Tuesday, May 09, 2017 4:57 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] MarkLogic 9 Redaction

Hi Tulasi,

I asked about this and found that the new Redaction feature isn't for use with 
replication, but FlexRep does allow for redaction and was built to cover such cases.

Dave.

--
Dave Cassel, @dmcassel
Technical Community Manager
MarkLogic Corporation
http://developer.marklogic.com/

From: on behalf of "Guraja, Tulasi"
Reply-To: MarkLogic Developer Discussion
Date: Tuesday, May 9, 2017 at 2:54 PM
To: "general@developer.marklogic.com"
Subject: [MarkLogic Dev General] MarkLogic 9 Redaction

Hi,

I am looking at the Redaction feature in MarkLogic 9. I understand that it can be 
used for Read and Export. I would like to confirm whether the Redaction 
feature can also be used in Replication (replicating data across the globe while 
masking data points).

Thanks,
Tulasi Guraja




==
Please access the attached hyperlink for an important electronic communications 
disclaimer:
http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html
==




Re: [MarkLogic Dev General] Values endpoint issue in ROXY

2017-04-21 Thread Geert Josten
I think you have hit this issue:

https://github.com/marklogic/roxy/issues/758

That ticket contains a fix.

Cheers,
Geert

PS: note that using the /roxy/rewriter.xqy means you are running in Roxy hybrid 
mode. If you intend to use MarkLogic REST api only (not Roxy MVC), consider 
using the real app-type `rest`, which uses amongst others:


url-rewriter=/MarkLogic/rest-api/rewriter.xml

error-handler=/MarkLogic/rest-api/error-handler.xqy

rewrite-resolves-globally=true

From: on behalf of Rajesh Kumar
Reply-To: MarkLogic Developer Discussion
Date: Friday, April 21, 2017 at 2:49 PM
To: "general@developer.marklogic.com"
Subject: [MarkLogic Dev General] Values endpoint issue in ROXY

Hi Team,

When we tried the Roxy framework with the REST app-type, pointing url-rewriter to 
/roxy/rewriter.xqy, we ran into issues with the values endpoint for GET requests.

API:
/v1/values/empID?options=ems=descending=1=json

Error:
REST-UNSUPPORTEDPARAM: (err:FOER) Endpoint does not support query 
parameter: invalid parameters: amp;name for request

Kindly help us in resolving the issue. We are using 8.0-6.4.

Thanks & Regards,
Rajesh Kumar P


Re: [MarkLogic Dev General] json:config for XML schema

2017-04-13 Thread Geert Josten
Yes, I am unaware of such work. Just warning you to be careful if you attempt 
something like that. You can easily lose data if you don’t recognize mixed 
content properly.

Cheers

From: <general-boun...@developer.marklogic.com> on behalf of 
"Steiner, David J. (LNG-DAY)" <david.j.stei...@elsevier.com>
Reply-To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Date: Thursday, April 13, 2017 at 4:31 PM
To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Subject: Re: [MarkLogic Dev General] json:config for XML schema

So, the answer to my question is “No” – you’re unaware of whether anyone 
developed a method (utility?) of converting a XML Schema into custom strategy.

Thanks anyway.



Re: [MarkLogic Dev General] json:config for XML schema

2017-04-13 Thread Geert Josten
Yes exactly. The only safe way to convert mixed content using that json library 
is using the full config or custom config that specifies mixed elements for 
full conversion:

xquery version "1.0-ml";

import module namespace json = "http://marklogic.com/xdmp/json"
  at "/MarkLogic/json/json.xqy";

(: the inline markup inside <p> is illustrative; any mixed content will do :)
let $xml := <p>hello <b>world</b> for the win!</p>
let $config := json:config("full")
let $_ := map:put($config, "full-element-names", "p")
return json:transform-from-json(json:transform-to-json($xml, $config), $config)

Cheers

PS: nothing stops you from writing your own XML-to-JSON conversion lib though. 
XML to JSON in particular should be fairly trivial with a bit of XSLT.

From: <general-boun...@developer.marklogic.com> on behalf of 
Florent Georges <li...@fgeorges.org>
Reply-To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Date: Thursday, April 13, 2017 at 4:08 PM
To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Subject: Re: [MarkLogic Dev General] json:config for XML schema

Hi,

> JSON is ill-suited for hierarchical content.

For the record, JSON is particularly bad at what we call "mixed content" in XML 
(what I guess Geert meant by "inline elements").

The idiomatic example of mixed content is the P element in HTML: it can contain 
text nodes as direct children, intermingled with other elements like B or EM or 
SPAN containing text nodes or other such elements themselves.

Regards,

--
Florent Georges
H2O Consulting
http://h2o.consulting/


On 13 April 2017 at 15:55, Steiner, David J. (LNG-DAY) wrote:
Hi Geert,

Yes, I’ve looked at “full” – way too verbose, and I’m very well aware that JSON is 
ill-suited for hierarchical content.

Thanks,
David



Re: [MarkLogic Dev General] json:config for XML schema

2017-04-13 Thread Geert Josten
Hi David,

That sounds like a very large xsd. Keep in mind JSON is not very well suited 
for inline elements. I reckon you looked at the full strategy option of 
json:config? Rather verbose, but simple, and it gives a reliable roundtrip.

Cheers,
Geert

From: on behalf of "Steiner, David J. (LNG-DAY)"
Reply-To: MarkLogic Developer Discussion
Date: Thursday, April 13, 2017 at 3:18 PM
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] json:config for XML schema

Has anyone developed a method (utility?) of converting a XML Schema into custom 
strategy?

I’ve briefly looked at xml4js but from what I gleaned, it seems like you have 
to go through and enter every import/include and I’m starting with 44 imports 
in the first xsd and each of those probably has imports as well.

Thanks,
David




Re: [MarkLogic Dev General] Sending an HTML Email With Attachments

2017-04-13 Thread Geert Josten
Here is the function that takes multiple content parts, and sends a multipart 
email:

declare function util:send-email(
  $from-name as xs:string,
  $from-email as xs:string,
  $to-names as xs:string+,
  $to-emails as xs:string+,
  $cc-names as xs:string*,
  $cc-emails as xs:string*,
  $subject as xs:string,
  $content-types as xs:string*,
  $content-filenames as xs:string*,
  $content as item()*
) as empty-sequence() {
  let $newline := ""
  let $boundary := concat("boundary-", xdmp:random())
  let $encoded-content := xdmp:multipart-encode(
    $boundary,
    <manifest>{
      for $item at $i in $content
      let $content-type := ($content-types[$i], "text/html")[1]
      let $filename := ($content-filenames[$i], "untitled.html")[1]
      return
        <part>
          <headers>
            <Content-Type>{$content-type}</Content-Type>
            <Content-Disposition>{
              if ($item instance of binary() or $filename != "") then
                concat("attachment; filename=", $filename)
              else
                "inline"
            }</Content-Disposition>
            <Content-Transfer-Encoding>{
              if ($item instance of binary() or $filename != "") then
                "base64"
              else
                "quoted-printable"
            }</Content-Transfer-Encoding>
          </headers>
        </part>
    }</manifest>,
    for $item at $i in $content
    let $content-type := ($content-types[$i], "text/html")[1]
    return
      if ($item instance of binary() or ($item instance of document-node() and $item/binary())) then
        document{ xs:base64Binary($item) }
      else if (contains($content-type, "html") and (not($content instance of element()) or empty($content/xhtml:html))) then
        <html xmlns="http://www.w3.org/1999/xhtml">
          <head>
            <title>{$subject}</title>
          </head>
          <body>{$item}</body>
        </html>
      else if ($item instance of node()) then
        $item
      else
        (: multipart encode requires nodes as input :)
        document{ $item }
  )

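  let $to :=
    for $email at $i in $to-emails
    let $name := $to-names[$i]
    return
      <em:Address>
        <em:name>{$name}</em:name>
        <em:adrs>{$email}</em:adrs>
      </em:Address>
  let $cc :=
    for $email at $i in $cc-emails
    let $name := $cc-names[$i]
    return
      <em:Address>
        <em:name>{$name}</em:name>
        <em:adrs>{$email}</em:adrs>
      </em:Address>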
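  return
  try {
    (: the em and rf prefixes are assumed to be declared in the module prolog :)
    xdmp:email(
      <em:Message>
        <rf:subject>{$subject}</rf:subject>
        <rf:from>
          <em:Address>
            <em:name>{$from-name}</em:name>
            <em:adrs>{$from-email}</em:adrs>
          </em:Address>
        </rf:from>
        <rf:to>
          <em:Group>
            <em:name>recipients</em:name>
            {$to}
          </em:Group>
        </rf:to>
        <rf:cc>
          <em:Group>
            <em:name>cc</em:name>
            {$cc}
          </em:Group>
        </rf:cc>
        <rf:content-type>multipart/mixed; boundary={$boundary}</rf:content-type>
        <em:content>{xdmp:binary-decode($encoded-content, "utf-8")}</em:content>
      </em:Message>
    )
  } catch * {
    ()
  }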
};

It might take a little effort to piece the two together, but it should contain 
all detail you need.
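
For readers who want to see the MIME layout that the function above assembles, here is a minimal sketch outside MarkLogic, assuming Python's standard `email` package. The addresses, subject, and PDF bytes are made-up placeholders; the point is only that an HTML body plus a binary attachment ends up as a multipart/mixed message whose attachment part is base64 encoded.

```python
from email.message import EmailMessage


def build_html_mail(sender, recipient, subject, html_body, pdf_bytes, pdf_name):
    """Build an HTML message with one PDF attachment (illustration only)."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    # The HTML body becomes the first part of the message.
    msg.set_content(html_body, subtype="html")
    # Adding an attachment turns the message into multipart/mixed.
    msg.add_attachment(
        pdf_bytes, maintype="application", subtype="pdf", filename=pdf_name
    )
    return msg


# Placeholder values, not taken from the thread.
msg = build_html_mail(
    "me@example.com",
    "you@example.com",
    "Report",
    "<p>hello world</p>",
    b"%PDF-1.4 demo",
    "report.pdf",
)
parts = list(msg.iter_parts())
print(msg.get_content_type())
print([part.get_content_type() for part in parts])
```

The two parts correspond to the manifest entries in the XQuery function: one header block per content item, with the content disposition and transfer encoding chosen per part.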

Kind regards,
Geert

From: <general-boun...@developer.marklogic.com> on behalf of Geert Josten 
<geert.jos...@marklogic.com>
Reply-To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Date: Thursday, April 13, 2017 at 10:57 AM
To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Subject: Re: [MarkLogic Dev General] Sending an HTML Email With Attachments

Hi William,

We use this method in demo-cat, which might make it a little easier for you to 
handle sending emails:

https://github.com/marklogic/demo-cat/blob/master/src/lib/utilities.xqy#L16

Note that the $message param expects XHTML elements.

It is lacking the attachments bit, but I’ll search my archive a bit. I might be 
able to recover how I did that.

Cheers,
Geert

From: <general-boun...@developer.marklogic.com> on behalf of 
"William Holmes (WLT GB)" <william.hol...@wlt.com>
Reply-To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Date: Thursday, April 13, 2017 at 10:00 AM
To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Subject: Re: [MarkLogic Dev General] Sending an HTML Email With Attachments

Thanks for your question Geert,

I’m trying to send the second one, an HTML formatted message with PDF 
attachments.

From: general-boun...@developer.marklogic.com On Behalf Of Geert Josten
Sent: 12 April 2017 20:05
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Sending an HTML Email With Attachments

Hi William,

Are you trying to send html with embedded images or such, or just a pretty 
formatted message (in html) with some pdf or other doc as collateral?

We use html formatted messages in demo-cat, but I am sure I have

Re: [MarkLogic Dev General] Sending an HTML Email With Attachments

2017-04-13 Thread Geert Josten
Hi William,

We use this method in demo-cat, which might make it a little easier for you to 
handle sending emails:

https://github.com/marklogic/demo-cat/blob/master/src/lib/utilities.xqy#L16

Note that the $message param expects XHTML elements.

It is lacking the attachments bit, but I’ll search my archive a bit. I might be 
able to recover how I did that.

Cheers,
Geert

From: <general-boun...@developer.marklogic.com> on behalf of 
"William Holmes (WLT GB)" <william.hol...@wlt.com>
Reply-To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Date: Thursday, April 13, 2017 at 10:00 AM
To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Subject: Re: [MarkLogic Dev General] Sending an HTML Email With Attachments

Thanks for your question Geert,

I’m trying to send the second one, an HTML formatted message with PDF 
attachments.

From: general-boun...@developer.marklogic.com On Behalf Of Geert Josten
Sent: 12 April 2017 20:05
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Sending an HTML Email With Attachments

Hi William,

Are you trying to send html with embedded images or such, or just a pretty 
formatted message (in html) with some pdf or other doc as collateral?

We use html formatted messages in demo-cat, but I am sure I have also sent an 
attachment with success. I had trouble figuring out how to show attachments as 
part of the html message, but maybe you don’t need to go that far?

Cheers,
Geert

From: <general-boun...@developer.marklogic.com> on behalf of 
"William Holmes (WLT GB)" <william.hol...@wlt.com>
Reply-To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Date: Wednesday, April 12, 2017 at 11:27 AM
To: "general@developer.marklogic.com" <general@developer.marklogic.com>
Subject: [MarkLogic Dev General] Sending an HTML Email With Attachments

Hi,

I am trying to send an email using xdmp:email with an HTML body and attachments.

The only example on http://docs.marklogic.com/xdmp:email is with plain text in 
the body.

Does anyone have any examples of an email body containing HTML, e.g., hello 
world with attachments?

Thanks,

William Holmes



Re: [MarkLogic Dev General] Sending an HTML Email With Attachments

2017-04-12 Thread Geert Josten
Hi William,

Are you trying to send html with embedded images or such, or just a pretty 
formatted message (in html) with some pdf or other doc as collateral?

We use html formatted messages in demo-cat, but I am sure I have also sent an 
attachment with success. I had trouble figuring out how to show attachments as 
part of the html message, but maybe you don’t need to go that far?

Cheers,
Geert

From: on behalf of "William Holmes (WLT GB)"
Reply-To: MarkLogic Developer Discussion
Date: Wednesday, April 12, 2017 at 11:27 AM
To: "general@developer.marklogic.com"
Subject: [MarkLogic Dev General] Sending an HTML Email With Attachments

Hi,

I am trying to send an email using xdmp:email with an HTML body and attachments.

The only example on http://docs.marklogic.com/xdmp:email is with plain text in 
the body.

Does anyone have any examples of an email body containing HTML, e.g., hello 
world with attachments?

Thanks,

William Holmes




Re: [MarkLogic Dev General] xdmp:parse-dateTime

2017-04-11 Thread Geert Josten
The parse-dateTime function will parse dates before the start of the Gregorian 
calendar, but the result won’t really be a valid Gregorian date. For instance:

xdmp:parse-dateTime('[D1] [MN] [Y001]', '15 OCTOBER 1582') - 
xs:dayTimeDuration("P1D")

returns 1582-10-14, but officially there was a jump from October 4 on the Julian 
calendar to October 15 on the Gregorian calendar.
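
The same effect can be reproduced outside MarkLogic. As a minimal sketch, assuming Python's standard `datetime` module (which uses a proleptic Gregorian calendar, meaning Gregorian rules are extended backwards past the reform), subtracting one day from 15 October 1582 also yields a date that never existed on the historical calendar:

```python
from datetime import date, timedelta

# Python's date type applies Gregorian rules to every year, including the
# days that were skipped when the calendar was introduced in 1582.
first_gregorian_day = date(1582, 10, 15)
day_before = first_gregorian_day - timedelta(days=1)

print(day_before.isoformat())
```

Historically the day before 15 October 1582 was 4 October (Julian), so any date arithmetic across that boundary needs calendar-aware handling of its own.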

Cheers,
Geert

From: on behalf of John Snelson
Reply-To: MarkLogic Developer Discussion
Date: Wednesday, April 12, 2017 at 2:26 AM
To: "general@developer.marklogic.com"
Subject: Re: [MarkLogic Dev General] xdmp:parse-dateTime

That is the start of the Gregorian calendar:

https://en.wikipedia.org/wiki/1582

You can't use Gregorian calendar based functionality to handle dates before 
that calendar began. If this is really a requirement, you'll probably know 
enough about older calendars to write your own date handling routines.

John

On 11/04/17 18:31, Oleksii Segeda wrote:
Hi everyone,

The docs say that xdmp:parse-dateTime will not return the correct dateTime 
value for dates before October 15, 1582. What should I use for dates before 
October 15, 1582?

Regards,
Oleksii Segeda

IT Analyst

Information and Technology Solutions




--
John Snelson, Principal Engineer  http://twitter.com/jpcs
MarkLogic Corporation http://www.marklogic.com


Re: [MarkLogic Dev General] Regarding Text File

2017-04-11 Thread Geert Josten
Hi Siva,

The simplest approach would be to store them as binary nodes. That excludes 
them from the universal index, but with files that big that might be just what 
you need.

Kind regards,
Geert

From: on behalf of "Mani, Sivasubramani (ELS)"
Reply-To: MarkLogic Developer Discussion
Date: Tuesday, April 11, 2017 at 2:42 PM
To: "general@developer.marklogic.com"
Cc: ConSyn-Infosys-Support
Subject: [MarkLogic Dev General] Regarding Text File

Hi Team,

Text files have a limit of 64 MB in MarkLogic. How can I handle text files 
larger than 64 MB? Please advise.

Thanks & Regards,
Siva



Re: [MarkLogic Dev General] Scramble production data for testing in ML 8.x

2017-04-04 Thread Geert Josten
Hi Shan,

Rather than doing it on ingest, you should do the scrambling on export (note: 
redaction is an export option as well).

Unfortunately, MLCP does not allow transformation at export, but it does allow 
that on copy. You could write your own transform that obfuscates sensitive data.
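
To give an idea of what such an obfuscating transform has to do, here is a minimal sketch in Python rather than as an actual MLCP transform module. The element names in `SENSITIVE` and the salt are hypothetical; the idea is simply to replace sensitive values with a deterministic token so that equal inputs still map to equal outputs in the test environment.

```python
import hashlib
import xml.etree.ElementTree as ET

# Hypothetical element names that are considered sensitive.
SENSITIVE = {"name", "email"}


def scramble(xml_text, salt="test-env"):
    """Replace the text of sensitive elements with a short salted hash."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        if elem.tag in SENSITIVE and elem.text:
            digest = hashlib.sha256((salt + elem.text).encode("utf-8")).hexdigest()
            elem.text = digest[:12]
    return ET.tostring(root, encoding="unicode")


sample = "<person><name>Alice</name><age>30</age></person>"
scrambled = scramble(sample)
print(scrambled)
```

A real transform would apply the same kind of rewrite to each document as it is copied, and would need a proper review of which fields count as sensitive.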

Another alternative might be using CORB2..

Kind regards,
Geert

From: on behalf of Shiv Shankar
Reply-To: MarkLogic Developer Discussion
Date: Monday, April 3, 2017 at 9:52 PM
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] Scramble production data for testing in ML 8.x

Hi,
I am aware that ML 9.x has a redact option to scramble data during ingestion 
with MLCP. Are there any similar options available in ML 8.x? Are there any 
other approaches for scrambling sensitive production data for a testing 
environment?

Regards
Shan.


Re: [MarkLogic Dev General] Roxy rest deployment error to MarkLogic 7.0-2.3 on RedHat 5.1

2017-03-28 Thread Geert Josten
Hi Loren,

Open rest-api/config/properties.xml in some text editor, and remove or comment 
out the line:

  <update-policy>merge-metadata</update-policy>

Must have been added in ML8+. It is hard to track such subtle changes, and 
compensate or warn about them all from within Roxy. Would be worth a ticket 
though at https://github.com/marklogic/roxy/issues

Cheers,
Geert

From: on behalf of Loren Cahlander
Reply-To: MarkLogic Developer Discussion
Date: Tuesday, March 28, 2017 at 4:33 PM
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] Roxy rest deployment error to MarkLogic 
7.0-2.3 on RedHat 5.1


I am in the process of introducing roxy to my client. They are currently 
running MarkLogic 7.0-2.3 on RedHat 5.1. When I try to deploy modules to the 
server, I get the following error:

client-search lcahlander$ ./ml dev deploy modules

Loaded 0 documents from /Users/lcahlander/IdeaProjects/client-search/src to 
192.168.56.101:9020/client-search-modules at 03/28/2017 09:59:52 am

Loading REST properties in 
/Users/lcahlander/IdeaProjects/client-search/rest-api/config/properties.xml
ERROR: 400 "Bad Request"
ERROR: http://marklogic.com/rest-api;>400Bad
 
RequestRESTAPI-INVALIDCONTENTRESTAPI-INVALIDCONTENT:
 (err:FOER) Invalid content: Property specification invalid: update-policy 
invalid property name 


I created a VirtualBox copy with CentOS 5.1 and MarkLogic 7.0-2.3 and will make 
it available to MarkLogic employees.

I used the latest roxy on my local machine to create client-search.

./ml new client-search --server-version=7 --branch=master --app-type=rest


The bootstrap went fine, but the deploy modules gave the very same error.

The dev.properties is:

user=admin
password=mladmin
dev-server=192.168.56.101


app-port=9022
xcc-port=9020

server-version=7

# path to your local mlcp shell
mlcp-home=/Users/lcahlander/Library/mlcp-8.0-5/bin/mlcp.sh


The client is going to be upgrading to RedHat 7 and MarkLogic 8, so this error 
will most likely be moot, but I thought that I should report it.

- Loren



Re: [MarkLogic Dev General] URI_ID whitespace problems with mlcp

2017-03-23 Thread Geert Josten
Sorry, ignore my reply, it only applies to delimited_text. Thanks to Martijn 
for pointing that out to me..

@Lucas, you did not mention XML parsing errors, so maybe your XML is just fine, 
and all you try to do is take an attribute value and use that as uri. 
Unfortunately, you can’t do that with -uri_id, it only takes xml element and 
json property names. To be able to do that would require using MLCP transforms..
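
As a rough illustration of the logic such a transform would need (written in Python here, not as an actual MLCP transform module), the sketch below derives one URI per record from an attribute value. The sample document is made up; the record element, prefix, and suffix mirror the `row`, `/traffic/MD`, and `.xml` values from the original command, and treating `_id` as an attribute is the assumption discussed above.

```python
import xml.etree.ElementTree as ET


def uris_from_attribute(aggregate_xml, record="row", attr="_id",
                        prefix="/traffic/MD", suffix=".xml"):
    """Return one URI per record element, built from an attribute value."""
    root = ET.fromstring(aggregate_xml)
    return [
        prefix + rec.get(attr) + suffix
        for rec in root.iter(record)
        if rec.get(attr) is not None  # skip records without the attribute
    ]


# Hypothetical aggregate document with the identifier stored as an attribute.
sample = '<response><row _id="1"><x>a</x></row><row _id="2"><x>b</x></row></response>'
uris = uris_from_attribute(sample)
print(uris)
```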

Kind regards,
Geert

From: <general-boun...@developer.marklogic.com> on behalf of Geert Josten 
<geert.jos...@marklogic.com>
Reply-To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Date: Wednesday, March 22, 2017 at 8:18 PM
To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Subject: Re: [MarkLogic Dev General] URI_ID whitespace problems with mlcp

Valid points all, but MLCP warns about spaces in header names, and proceeds by 
converting them to underscores before generating XML out of them.

On the other hand, though neither likely nor practical, spaces in property 
names are allowed in JSON. ;-)

Cheers,
Geert

From: <general-boun...@developer.marklogic.com> on behalf of Florent Georges 
<li...@fgeorges.org>
Reply-To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Date: Wednesday, March 22, 2017 at 3:01 PM
To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Subject: Re: [MarkLogic Dev General] URI_ID whitespace problems with mlcp

Hi,

That is indeed the most likely explanation.  Just to make it clear to the OP, 
in such a situation an XML parser MUST stop normal processing (see e.g. 
http://w3.org/TR/xml/#sec-terminology, and the fact that having "" where a 
start tag is possible is ultimately breaking the document production rule).

When it comes to XML (in general, not only with MarkLogic), sometimes working 
around validity might be the right solution, depending on the technical and 
non-technical context.  But having ill-formed documents never is.  Fixing 
ill-formedness is always less painful than any other solution.

Just my 2 cents.  Regards,

--
Florent Georges
H2O Consulting
http://h2o.consulting/


On 22 March 2017 at 14:14, Martijn Sintemaartensdijk wrote:
Dear Lucas,

judging from your command, I think your input file contains an XML-starttag 
"" and corresponding endtag "". Unfortunately, XML tag names 
may not contain empty spaces (See also: 
https://www.w3.org/TR/2008/REC-xml-20081126/#NT-Name).

MLCP tries to interpret the xml-file and it reports an unexpected character, 
">". MLCP assumes "_id" to be an attribute name to the tag name "uri", like 
. The next character following "_id" is therefore expected to 
be an equal sign.

I would advise you to request that the output file be offered in accordance 
with the XML specification, rather than trying to fix the document. Otherwise, 
I fear, you will be forced to use sed, or something similar, to replace the 
malformed XML tags throughout the entire document each and every time you 
receive a new version.


Met vriendelijke groet / Kind regards,



Martijn Sintemaartensdijk






A: Einsteinbaan 12, 3439 NJ Nieuwegein

T: (+31) 06 40 59 09 36

E: martijn.sintemaartensd...@dikw.com

W: www.dikw.nl




On 21 March 2017 at 19:02, Lucas Davenport <nonameacco...@gmail.com> wrote:
I am a newb, so forgive me if I missed this answer while searching.

I am testing ML 8 for a project at work and we have a requirement to load large 
amounts of historical data. I've read the mlcp documentation and can 
successfully import some test data, but the problem I am facing is the archive 
data has a space in the record identifier.

My command is:
 mlcp.sh import -host localhost -port 8006 -username dataload -password 
dataload -mode local -input_file_path ../xml/MD2014aggregate.xml 
-input_file_type aggregates -aggregate_record_element row -uri_id "row _id" 
-output_uri_prefix /traffic/MD -output_uri_suffix .xml -output_collections 
published

This produces the following error:
17/03/21 13:49:20 ERROR contentpump.ContentPump: Unrecognized argument: \_id

I've escaped both the space a

Re: [MarkLogic Dev General] Unfiltered, exact searches

2017-03-23 Thread Geert Josten
Hi Andreas,

Sounds like a bug indeed. It is as if it appends a case-insensitive flag 
despite the ‘exact’, because of the all-lowercase ’new’. Can you tell which 
version of MarkLogic you are running, and on which architecture?

Cheers,
Geert

From: on behalf of Andreas Hubmer
Reply-To: MarkLogic Developer Discussion
Date: Thursday, March 23, 2017 at 8:40 AM
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] Unfiltered, exact searches

Hi,

There seems to be a bug related to unfiltered and exact value searches.

We are using value queries in the Java API, but I've boiled it down to cts 
searches.
The following snippet exhibits the wrong behavior:

xquery version "1.0-ml";
xdmp:document-insert("/bug/doc.xml", <status>NEW</status>)
;

"Document is found: OK",
cts:search(/,
  cts:and-query((
    cts:directory-query("/bug/", "infinity"),
    cts:element-value-query(xs:QName("status"), "NEW", ("exact"))
  )),
  "unfiltered"
)

,"---",
"Document is not found: OK",
cts:search(/,
  cts:and-query((
    cts:directory-query("/bug/", "infinity"),
    cts:element-value-query(xs:QName("status"), "NEw", ("exact"))
  )),
  "unfiltered"
)

,"---",
"Document is found: WRONG",
cts:search(/,
  cts:and-query((
    cts:directory-query("/bug/", "infinity"),
    cts:element-value-query(xs:QName("status"), "new", ("exact"))
  )),
  "unfiltered"
)

We are using a database with fast-case-sensitive-searches and 
fast-diacritic-sensitive-searches turned on, while all other indexes are turned 
off.
As far as I know only the two indexes are needed for unfiltered exact value 
searches.

Regards,
Andreas


Re: [MarkLogic Dev General] potential non-conformance with RFC 3986?

2017-03-23 Thread Geert Josten
This seems to sum up all relevant parts nicely:

http://stackoverflow.com/questions/15641694/are-uris-case-insensitive/26196170#26196170

And it seems to confirm your statements. I raised RFE #3921 on your behalf..
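
For comparison, here is how a case-sensitive reading of the query string behaves. This is only a sketch using Python's standard `urllib.parse`, not a statement about MarkLogic internals; it parses the two keys from the example URL and keeps them apart.

```python
from urllib.parse import parse_qs

# The query part of the example URL, with keys differing only in case.
fields = parse_qs("param=yes&Param=no")

print(fields["param"], fields["Param"])
```

Each key maps to its own value list here, which is the behaviour the original post expected from xdmp:get-request-field.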

Cheers,
Geert

From: on behalf of Jakob Fix
Reply-To: MarkLogic Developer Discussion
Date: Thursday, March 23, 2017 at 1:02 AM
To: General Mark Logic Developer Discussion
Subject: [MarkLogic Dev General] potential non-conformance with RFC 3986?

Hello,

we recently observed an unexpected behaviour in how MarkLogic treats the keys 
in the query part of a submitted URL (note the case of the two query param 
keys):

http://localhost:/app/test.xqy?param=yes&Param=no

let $q1 := xdmp:get-request-field('param')
let $q2 := xdmp:get-request-field('Param')

return "q1: " || $q1 || " -- q2: " || $q2

one should reasonably expect to see the following result:

q1: yes -- q2: no

However, the actual result is an error because "arg1 is not of type 
xs:anyAtomicType?"

XDMP-ARGTYPE: (err:XPTY0004) "q1: " || $q1 || " -- q2: " || $q2 -- arg1 is not 
of type xs:anyAtomicType?in /app/test.xqy, at 7:11 [1.0-ml]
$q1 = ("yes", "no")
$q2 = ("yes", "no")

xdmp:get-request-field-names() correctly returns both 'param' and 'Param'.

For some reason, MarkLogic normalises (presumably lowercases) the keys of the 
query string when looking up a query parameter value which seems to be counter 
to what is described in section 6.2.2.1 Case normalisation of RFC 3986 [1]:

When a URI uses components of the generic syntax, the component syntax 
equivalence rules always apply; namely, that the scheme and host are 
case-insensitive and therefore should be normalized to lowercase. For example, 
the URI <HTTP://www.EXAMPLE.com/> is equivalent to <http://www.example.com/>. 
The other generic syntax components are assumed to be case-sensitive unless 
specifically defined otherwise by the scheme (see Section 6.2.3).

Are we interpreting the RFC wrongly?

Yes, I've tested this on 8.0-6.3.

cheers,
Jakob.

PS: Thanks to my colleague Romuald for mentioning this over beer! ;-)


[1] https://tools.ietf.org/html/rfc3986#section-6.2.2.1



Re: [MarkLogic Dev General] URI_ID whitespace problems with mlcp

2017-03-22 Thread Geert Josten
Valid points all, but MLCP warns about spaces in header names, and proceeds by 
converting them to underscores before generating XML out of them.

On the other hand, though neither likely nor practical, spaces in property 
names are allowed in JSON. ;-)

Cheers,
Geert

From: on behalf of Florent Georges
Reply-To: MarkLogic Developer Discussion
Date: Wednesday, March 22, 2017 at 3:01 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] URI_ID whitespace problems with mlcp

Hi,

That is indeed the most likely explanation.  Just to make it clear to the OP, 
in such a situation an XML parser MUST stop normal processing (see e.g. 
http://w3.org/TR/xml/#sec-terminology, and the fact that having "" where a 
start tag is possible is ultimately breaking the document production rule).

When it comes to XML (in general, not only with MarkLogic), sometimes working 
around validity might be the right solution, depending on the technical and 
non-technical context.  But having ill-formed documents never is.  Fixing 
ill-formedness is always less painful than any other solution.
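
A quick way to check a file for well-formedness before handing it to a loader is to run it through any conforming parser. The sketch below uses Python's standard `xml.etree.ElementTree`; the two sample strings are made up, with the second one imitating a tag name that contains a space.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# A space after the element name starts an attribute, which then needs "=value".
print(is_well_formed('<row _id="1">x</row>'))
print(is_well_formed('<row _id>x</row _id>'))
```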

Just my 2 cents.  Regards,

--
Florent Georges
H2O Consulting
http://h2o.consulting/


On 22 March 2017 at 14:14, Martijn Sintemaartensdijk wrote:
Dear Lucas,

judging from your command, I think your input file contains an XML-starttag 
"" and corresponding endtag "". Unfortunately, XML tag names 
may not contain empty spaces (See also: 
https://www.w3.org/TR/2008/REC-xml-20081126/#NT-Name).

MLCP tries to interpret the xml-file and it reports an unexpected character, 
">". MLCP assumes "_id" to be an attribute name to the tag name "uri", like 
. The next character following "_id" is therefore expected to 
be an equal sign.

I would advise you to request that the output file be offered in accordance 
with the XML specification, rather than trying to fix the document. Otherwise, 
I fear, you will be forced to use sed, or something similar, to replace the 
malformed XML tags throughout the entire document each and every time you 
receive a new version.


Met vriendelijke groet / Kind regards,



Martijn Sintemaartensdijk






A: Einsteinbaan 12, 3439 NJ Nieuwegein

T: (+31) 06 40 59 09 36

E: martijn.sintemaartensd...@dikw.com

W: www.dikw.nl




On 21 March 2017 at 19:02, Lucas Davenport wrote:
I am a newb, so forgive me if I missed this answer while searching.

I am testing ML 8 for a project at work and we have a requirement to load large 
amounts of historical data. I've read the mlcp documentation and can 
successfully import some test data, but the problem I am facing is the archive 
data has a space in the record identifier.

My command is:
 mlcp.sh import -host localhost -port 8006 -username dataload -password 
dataload -mode local -input_file_path ../xml/MD2014aggregate.xml 
-input_file_type aggregates -aggregate_record_element row -uri_id "row _id" 
-output_uri_prefix /traffic/MD -output_uri_suffix .xml -output_collections 
published

This produces the following error:
17/03/21 13:49:20 ERROR contentpump.ContentPump: Unrecognized argument: \_id

I've escaped both the space and the underscore (row\ _id and row\ \_id) and 
still get the same error. I've also wrapped it in single quotes and double 
quotes.

I'm trying to keep from having to use sed to remove the space between row and 
_id in the entire file.

Is there a way to make mlcp see the URI_ID literally as "row _id"?

Thanks in advance.



Re: [MarkLogic Dev General] problem with importing Reduency tuples.

2017-03-22 Thread Geert Josten
If you talk about semantics, you probably mean triples instead of tuples (which 
is a more generic term). If you use SPARQL to query your RDF data / triples, 
you don’t need to worry about duplicate triples. The triple/sparql engine will 
deduplicate for you automatically.
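
Conceptually, an RDF graph is a set of triples, so repeated statements collapse to one. The following is only an illustration of that set semantics in Python with made-up example.org identifiers; it says nothing about how MarkLogic stores or deduplicates triples internally.

```python
# Three statements, two of which are identical.
triples = [
    ("http://example.org/alice", "http://example.org/knows", "http://example.org/bob"),
    ("http://example.org/alice", "http://example.org/knows", "http://example.org/bob"),
    ("http://example.org/alice", "http://example.org/name", "Alice"),
]

# A set keeps each distinct (subject, predicate, object) tuple once.
unique_triples = set(triples)

print(len(triples), len(unique_triples))
```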

Kind regards,
Geert

From: on behalf of NAVEEN KUMAR MOTIPALLI Computer Science & Engineering
Reply-To: MarkLogic Developer Discussion
Date: Wednesday, March 22, 2017 at 5:29 AM
To: "general@developer.marklogic.com"
Subject: [MarkLogic Dev General] problem with importing Reduency tuples.

I am working with semantics using ML 8 as the backend database, and I am 
storing tuples in the database. The problem is that ML does not detect 
redundant (duplicate) tuples when they are inserted. Is there a setting that 
prevents redundant tuples from entering the database, or do we need to 
pre-process every tuple before inserting it?


Re: [MarkLogic Dev General] URI_ID whitespace problems with mlcp

2017-03-21 Thread Geert Josten
Hi Lucas,

I’d recommend using options files. Put each arg on a separate line in a plain 
text file. The file extension is free to pick, and extra empty lines are 
allowed for readability. The benefit is that you avoid the double escaping that 
happens when args are first passed to mlcp.sh, which in turn makes a sys-call 
to java with unescaped args.
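
As a small sketch of that layout, the Python snippet below writes the arguments from the original command to an options file, one per line. The file name is hypothetical, and whether this alone resolves the `row _id` problem is not certain; it only shows that no shell escaping is needed once each value sits on its own line.

```python
import os
import tempfile

# Arguments taken from the command in the original question.
MLCP_ARGS = [
    "import",
    "-host", "localhost",
    "-port", "8006",
    "-username", "dataload",
    "-password", "dataload",
    "-mode", "local",
    "-input_file_path", "../xml/MD2014aggregate.xml",
    "-input_file_type", "aggregates",
    "-aggregate_record_element", "row",
    "-uri_id", "row _id",
    "-output_uri_prefix", "/traffic/MD",
    "-output_uri_suffix", ".xml",
    "-output_collections", "published",
]


def write_options_file(path, args):
    """Write one argument per line, with no quoting or escaping."""
    with open(path, "w", encoding="utf-8") as handle:
        for arg in args:
            handle.write(arg + "\n")


# Hypothetical file name in a temporary directory.
options_path = os.path.join(tempfile.mkdtemp(), "import-options.txt")
write_options_file(options_path, MLCP_ARGS)

with open(options_path, encoding="utf-8") as handle:
    lines = handle.read().splitlines()
print(lines[-10:])
```

The file would then be passed with something like `mlcp.sh -options_file import-options.txt`.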

Not sure it will be enough to solve the issue with spaces in your record 
identifier, but worth a shot.

If that is not enough, use -generate_uri to get sequential database uris, and 
optionally combine with an MLCP transform to rewrite the uri to the desired 
value yourself..

Cheers,
Geert

From: on behalf of Lucas Davenport
Reply-To: MarkLogic Developer Discussion
Date: Tuesday, March 21, 2017 at 7:02 PM
To: "general@developer.marklogic.com"
Subject: [MarkLogic Dev General] URI_ID whitespace problems with mlcp

I am a newb, so forgive me if I missed this answer while searching.

I am testing ML 8 for a project at work and we have a requirement to load large 
amounts of historical data. I've read the mlcp documentation and can 
successfully import some test data, but the problem I am facing is the archive 
data has a space in the record identifier.

My command is:
 mlcp.sh import -host localhost -port 8006 -username dataload -password 
dataload -mode local -input_file_path ../xml/MD2014aggregate.xml 
-input_file_type aggregates -aggregate_record_element row -uri_id "row _id" 
-output_uri_prefix /traffic/MD -output_uri_suffix .xml -output_collections 
published

This produces the following error:
17/03/21 13:49:20 ERROR contentpump.ContentPump: Unrecognized argument: \_id

I've escaped both the space and the underscore (row\ _id and row\ \_id) and 
still get the same error. I've also wrapped it in single quotes and double 
quotes.

I'm trying to keep from having to use sed to remove the space between row and 
_id in the entire file.

Is there a way to make mlcp see the URI_ID literally as "row _id"?

Thanks in advance.


Re: [MarkLogic Dev General] Regarding Error in Marklogic forests

2017-03-20 Thread Geert Josten
Hi Siva,

I think it would be wise to reach out to support for this.

Cheers,
Geert

From: on behalf of "Mani, Sivasubramani (ELS)"
Reply-To: MarkLogic Developer Discussion
Date: Tuesday, March 21, 2017 at 6:34 AM
To: "general@developer.marklogic.com"
Cc: "Sodihardjo, Aiwen (ELS-AMS)", Suresh Kaliyaperumal22
Subject: [MarkLogic Dev General] Regarding Error in Marklogic forests

Hi Team,

I get the following error in MarkLogic forests: “There is currently an 
XDMP-FORESTERR: Error in reindex of forest forestname. XDMP-NEWSTAMP: Timestamp 
too new for forest forestname (14900473821330230) exception. Information on 
this page may be missing.” Six out of 16 forests have this issue. Could you 
suggest how to resolve it?


Thanks & Regards,
Siva



Re: [MarkLogic Dev General] Using RegEx in xQuery

2017-03-20 Thread Geert Josten
You may want to unwrap entity:entity and suppress entity:entityattr instead, 
but otherwise this should work just fine all the way down to at least MarkLogic 
5. :)

Cheers

From: on behalf of Christopher Hamlin
Reply-To: MarkLogic Developer Discussion
Date: Monday, March 20, 2017 at 4:29 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Using RegEx in xQuery

I don't know offhand of any changes in XSLT between 7 and 8.

Something like this in 8 is what I was thinking; I don't know if it is really 
what you need:

let $doc := (: blah blah blah :)
let $xslt :=
http://www.w3.org/1999/XSL/Transform; 
xmlns:ir="incisive-repository">
  

  

  
  

return xdmp:xslt-eval ($xslt, $doc)


Re: [MarkLogic Dev General] Error 500 when queryinq with Curl

2017-03-19 Thread Geert Josten
Hi Ghislain,

In your original mail, you wrote that you used:
curl --anyauth -u user:password -H "Content-type: application/sparql-query" \
  -H "Accept: application/sparql-results+xml" --data-binary '@./q1.rq' \
  http://localhost:8000/v1/graphs/sparql
To target a specific database you could write:
curl --anyauth -u user:password -H "Content-type: application/sparql-query" \
  -H "Accept: application/sparql-results+xml" --data-binary '@./q1.rq' \
  "http://localhost:8000/v1/graphs/sparql?database=myExistingDB"
To target a different rest-api:
curl --anyauth -u user:password -H "Content-type: application/sparql-query" \
  -H "Accept: application/sparql-results+xml" --data-binary '@./q1.rq' \
  http://localhost:8020/v1/graphs/sparql
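The variants differ only in the port and the optional database parameter, so they can be wrapped in a small helper. This is a rough sketch and not part of the original exchange: the function names sparql_url and run_sparql are made up for illustration, and user:password, the ports, and myExistingDB are placeholders. It also assumes the database name needs no URL encoding.

```shell
#!/usr/bin/env bash
# Sketch only: credentials, ports, and database names are placeholders.

# Build the SPARQL endpoint URL for a port and an optional database name.
sparql_url() {
  local port="$1" database="${2:-}"
  local url="http://localhost:${port}/v1/graphs/sparql"
  if [ -n "$database" ]; then
    url="${url}?database=${database}"
  fi
  printf '%s\n' "$url"
}

# POST a SPARQL query file to the endpoint and print the HTTP status
# after the response body, which makes a 500 easy to spot.
run_sparql() {
  local query_file="$1" port="$2" database="${3:-}"
  curl --anyauth -u user:password \
    -H "Content-type: application/sparql-query" \
    -H "Accept: application/sparql-results+xml" \
    --data-binary "@${query_file}" \
    -w '\nHTTP status: %{http_code}\n' \
    "$(sparql_url "$port" "$database")"
}
```

With these defined, `run_sparql ./q1.rq 8000 myExistingDB` would query through App-Services with an explicit database, and `run_sparql ./q1.rq 8020` would go through a separate rest-api instance.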
Kind regards,
Geert

From: <general-boun...@developer.marklogic.com> on behalf of Ghislain Atemezing 
<ghislain.atemez...@mondeca.com>
Reply-To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Date: Sunday, March 19, 2017 at 8:04 PM
To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Subject: Re: [MarkLogic Dev General] Error 500 when queryinq with Curl

Hi Geert,
Many thanks for your input. I don’t fully understand the first approach. Could 
you tell me how I can add the database to query in my curl example?
Also, I’ve tried to follow the second recommendation. See the steps below.
[
$ cat config.xml
<rest-api xmlns="http://marklogic.com/rest-api">
  <name>MyREST</name>
  <database>myExistingDB</database>
  <port>8020</port>
</rest-api>

# curl to create

$ curl -X POST --anyauth --user user:pwd -d @"./config.xml" \
-H "Content-type: application/xml" \
http://localhost:8002/LATEST/rest-apis
]

After creating the rest-api on a different port, using the existing database, 
I still get the same error.

Could you please give me more details on how to solve this issue?
I really need it to analyze user queries.

TIA.
Best,
Ghislain

On 17 March 2017 at 19:43, Geert Josten <geert.jos...@marklogic.com> wrote:

Hi Ghislain,

You probably want to add a database parameter pointing to the content database 
you’d like to query, or use a rest-api instance linked directly to that content 
database and running on a different port. App-Services (which runs on port 
8000) is linked to the Documents database, which out of the box does not have 
the triple index enabled.
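If the database being queried should itself hold triples, the triple index can be switched on for it. The following is only a sketch under assumptions: it uses the Management API on port 8002 with its `triple-index` database property, the admin:password credentials are placeholders, and the function names are made up for illustration. Changing the setting can trigger reindexing, so pointing the request at a database that already has the index enabled may be the simpler route.

```shell
#!/usr/bin/env bash
# Sketch only: admin credentials and the database name are placeholders.

# JSON payload that switches the triple index on.
triple_index_payload() {
  printf '%s\n' '{"triple-index": true}'
}

# Management API URL for a database's properties (Manage app server, port 8002).
db_properties_url() {
  local database="$1"
  printf 'http://localhost:8002/manage/v2/databases/%s/properties\n' "$database"
}

# Apply the setting; needs a user with sufficient management privileges.
enable_triple_index() {
  local database="$1"
  curl --anyauth -u admin:password -X PUT \
    -H "Content-type: application/json" \
    -d "$(triple_index_payload)" \
    "$(db_properties_url "$database")"
}
```

For example, `enable_triple_index myExistingDB` would send the update for the content database named in this thread.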

Kind regards,
Geert

From: <general-boun...@developer.marklogic.com> on behalf of Ghislain 
Atemezing-Pro <ghislain.atemez...@mondeca.com>
Reply-To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Date: Friday, March 17, 2017 at 6:39 PM
To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Subject: [MarkLogic Dev General] Error 500 when queryinq with Curl

Hi all,

I am trying to run a SPARQL query using curl against my endpoint.
I am doing the following:

curl --anyauth -u user:password -H "Content-type: application/sparql-query" -H 
"Accept: application/sparql-results+xml" --data-binary '@./q1.rq' 
http://localhost:8000/v1/graphs/sparql

But I receive a 500 error, as shown below:

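<error-response xmlns="http://marklogic.com/xdmp/error">
  <status-code>500</status-code>
  <status>Internal Server Error</status>
  <message-code>INTERNAL ERROR</message-code>
  <message>XDMP-TRPLIDXNOTFOUND: 
xdmp:security-assert("http://marklogic.com/xdmp/privileges/rest-reader", 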
"execute"), let $rule := conf:get-sparql-protocol-rule() let $params := 
rest:process-request($rule) let $headers := eput:get-request-headers() let 
$method := eput:get-request-method($headers) let $env := map:map() let $params 
:= local:validate-params($rule, $env, $params) let $body := switch ($method) 
case "GET" return text { fn:head((map:get($params, "query"), map:get($params, 
"update"))) } case "POST" return 
xdmp:get-request-body(eput:get-content-format($headers, $params))/node() 
default return fn:error((), "REST-UNSUPPORTEDMETHOD", $method) let $result := 
semmod:sparql-query($headers, $params, $body) let $response := if ($result 
instance of xs:string and $result = ("EMPTY-CONSTRUCT", "EMPTY-DESCRIBE")) then 
semmod:empty-construct($headers, $params, local:sparql-callback#2) else if 
($result instance of xs:string and $result eq "EMPTY-SELECT") then 
semmod:empty-select($headers, $params, local:sparql-callback#2) else 
semmod:results-payload($headers, $params, $result, local:sparql-callback#2) 
return if ($response instance of node() and 
$response/sel
