[MarkLogic Dev General] full stack trace

2017-11-21 Thread Oleksii Segeda
Hello,

Is there a way to turn off stack trace trimming? By default it inserts an ellipsis 
after a certain number of characters, which makes debugging painful.

http://developer.marklogic.com/mailman/listinfo/general


Re: [MarkLogic Dev General] MLCP heap error

2017-11-16 Thread Oleksii Segeda
Gary,

Try setting -Xmx (the maximum heap size) in addition to -Xms, which only sets the initial heap size.


Best,
Oleksii

From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Gary Larsen
Sent: Thursday, November 16, 2017 4:55 PM
To: 'General MarkLogic Developer Discussion' 
Subject: [MarkLogic Dev General] MLCP heap error

Hi,

I'm trying to use MLCP on Windows to export to an archive in local mode, but I can't 
get past a heap error.  I'm probably missing something obvious and would 
appreciate the help.

Thanks,
Gary

This is the script I created:

SET JAVA_OPTS= -Xms5000m
mlcp.bat export -host localhost -port 8104 -username Admin -password envisn  
-mode local -output_file_path C:\a-backup\MLCP\c1108  -copy_collections true 
-output_type archive

Here's the output:

C:\a-work\mlcp\mlcp-8.0.6.3\bin>rem export to archive

C:\a-work\mlcp\mlcp-8.0.6.3\bin>SET JAVA_OPTS= -Xms5000m

C:\a-work\mlcp\mlcp-8.0.6.3\bin>mlcp.bat export -host localhost -port 8104 
-username Admin -password envisn  -mode local -output_file_path 
C:\a-backup\MLCP\c1108  -copy_collections true -output_type archive
17/11/16 16:43:17 INFO mapreduce.MarkLogicInputFormat: Fetched 1 forest splits.
17/11/16 16:43:17 INFO mapreduce.MarkLogicInputFormat: Made 30 splits.
17/11/16 16:43:18 INFO contentpump.LocalJobRunner:  completed 0%
17/11/16 16:43:27 INFO contentpump.LocalJobRunner:  completed 1%
17/11/16 16:43:45 ERROR contentpump.LocalJobRunner: Error running task:
java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Unknown Source)
at java.lang.AbstractStringBuilder.ensureCapacityInternal(Unknown Source)
at java.lang.AbstractStringBuilder.append(Unknown Source)
at java.lang.StringBuffer.append(Unknown Source)
at com.marklogic.io.IOHelper.literalStringFromReader(IOHelper.java:50)
at com.marklogic.io.IOHelper.literalStringFromStream(IOHelper.java:66)
at com.marklogic.xcc.types.impl.AbstractStreamableItem.asString(AbstractStreamableItem.java:120)
at com.marklogic.xcc.impl.ResultItemImpl.asString(ResultItemImpl.java:10)
at com.marklogic.mapreduce.DatabaseDocument.set(DatabaseDocument.java:16)
at com.marklogic.contentpump.DatabaseContentReader.nextKeyValue(DatabaseContentReader.java:442)
at com.marklogic.contentpump.LocalJobRunner$TrackingRecordReader.nextKeyValue(LocalJobRunner.java:444)
at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at com.marklogic.contentpump.LocalJobRunner$LocalMapTask.call(LocalJobRunner.java:378)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
17/11/16 16:44:16 INFO contentpump.LocalJobRunner: com.marklogic.mapreduce.MarkLogicCounter:
17/11/16 16:44:16 INFO contentpump.LocalJobRunner: INPUT_RECORDS: 7005
17/11/16 16:44:16 INFO contentpump.LocalJobRunner: OUTPUT_RECORDS: 7005
17/11/16 16:44:16 INFO contentpump.LocalJobRunner: Total execution time: 58 sec

C:\a-work\mlcp\mlcp-8.0.6.3\bin>
___
General mailing list
General@developer.marklogic.com
Manage your subscription at: 
http://developer.marklogic.com/mailman/listinfo/general


Re: [MarkLogic Dev General] xray tests

2017-09-01 Thread Oleksii Segeda
Geert,

There is nothing unusual about these tests, they call different submodules of 
our application.
I can run xray when I cut the number of these tests roughly in half. It doesn't 
matter which tests I include; what matters is how many there are.
My colleague, who has a slightly more powerful machine with more RAM, can execute 
all these tests without any issues.

Regards,
Oleksii


From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Geert Josten
Sent: Thursday, August 31, 2017 11:32 AM
To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Subject: Re: [MarkLogic Dev General] xray tests

Hi,

Could you share some more detail on what is happening inside those tests? Would 
you be able to isolate which test is the culprit by commenting out each one by 
one?

Cheers,
Geert

From: <general-boun...@developer.marklogic.com> on behalf of Oleksii Segeda 
<oseg...@worldbankgroup.org>
Reply-To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Date: Thursday, August 31, 2017 at 5:19 PM
To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Subject: [MarkLogic Dev General] xray tests

Hi everyone,

I have around 20-30 xray unit tests (https://github.com/robwhitby/xray). I want 
to run a full set of tests locally, before I deploy my code somewhere else.
Unfortunately, MarkLogic dies with an out-of-memory error. If I run each test 
individually, it works perfectly fine, but it takes forever to go through all of 
them manually.
I've tried to increase swap, limit the number of debug threads, limit cache 
sizes, etc. - nothing helps.

What else can be done here?

Regards,

Oleksii Segeda

IT Analyst

Information and Technology Solutions



[MarkLogic Dev General] xray tests

2017-08-31 Thread Oleksii Segeda
Hi everyone,

I have around 20-30 xray unit tests (https://github.com/robwhitby/xray). I want 
to run a full set of tests locally, before I deploy my code somewhere else.
Unfortunately, MarkLogic dies with an out-of-memory error. If I run each test 
individually, it works perfectly fine, but it takes forever to go through all of 
them manually.
I've tried to increase swap, limit the number of debug threads, limit cache 
sizes, etc. - nothing helps.

What else can be done here?

Regards,

Oleksii Segeda

IT Analyst

Information and Technology Solutions



Re: [MarkLogic Dev General] Count of cts:element-values() not equal to number of element instances--what's going on?

2017-08-14 Thread Oleksii Segeda
Eliot,

You can do something like this:

cts:element-value-co-occurrences(xs:QName("prof:overall-elapsed"),xs:QName("xdmp:document"))
if you have only one element per document.

Best,

Oleksii Segeda
IT Analyst
Information and Technology Solutions
www.worldbank.org


-Original Message-
From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Eliot Kimber
Sent: Monday, August 14, 2017 2:31 PM
To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Subject: [MarkLogic Dev General] Count of cts:element-values() not equal to 
number of element instances--what's going on?

I have this query:

let $durations := cts:element-values(xs:QName("prof:overall-elapsed"), (), 
"descending",
 cts:collection-query($collection))

And this query:

let $overall-elapsed := $profiles/prof:metadata/prof:overall-elapsed

Where there is an element range index for prof:overall-elapsed.

Comparing the two results I get very different numbers when I expected them to 
be equal:

47539
21219

Doing this: 

count(distinct-values($overall-elapsed ! xs:dayTimeDuration(.)))

Returns 21219, making it clear that the range index is returning distinct 
values, not all values. It makes sense in terms of how I would expect a range 
index to be structured (a one-to-many mapping for values to elements) but 
doesn’t make sense as the return for a function named “element-values” (and not 
element-distinct-values).
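
As a plain-JavaScript illustration of the behavior described above (this is not MarkLogic code, and the sample durations are made up): a lexicon behaves like a map from each distinct value to the number of times it occurs, so the number of lexicon entries differs from the number of element instances whenever values repeat. In MarkLogic itself, cts:frequency reports a frequency for each value returned by a lexicon call, which might be one way to recover totals from the index alone; whether that fits this case is an assumption, not something confirmed in the thread.

```javascript
// Plain-JavaScript illustration; no MarkLogic APIs are used here.
// `allValues` is made-up sample data standing in for element instances.
const allValues = ["PT1S", "PT2S", "PT1S", "PT3S", "PT1S"];

// Build a lexicon-style view: each distinct value once, with its count.
function toLexicon(values) {
  const counts = new Map();
  for (const v of values) {
    counts.set(v, (counts.get(v) || 0) + 1);
  }
  return counts;
}

const lexicon = toLexicon(allValues);

// What a "distinct values" listing sees.
const distinctCount = lexicon.size;

// Summing the per-value counts recovers the number of instances.
const totalCount = [...lexicon.values()].reduce((sum, n) => sum + n, 0);
```

Here distinctCount and totalCount differ for the same reason the two numbers in the message differ: the first counts keys, the second counts occurrences.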

I didn’t see this behavior mentioned in the docs (although the introduction to 
the Lexicon reference section does describe lexicons as sets of unique values).

My requirement is to *quickly* get a list of the durations for all 
prof:expression elements (which I use for both counting and for bucketing, so I 
need all values, not just all distinct values).

Is there a way to do what I want using only indexes? 

Thanks,

E.
--
Eliot Kimber
http://contrext.com
 





[MarkLogic Dev General] fitness score

2017-08-03 Thread Oleksii Segeda
Hi everyone,


I calculate relevance of a single document to some query by doing this:

cts:fitness(
  cts:search(
    fn:doc(),
    cts:and-query((
      cts:document-query(... document uri ...),
      ... some query ...
    )),
    ("score-logtf", "unfiltered")
  )
)

I want to do the same thing, but for an element, without ingesting this element 
into my database.

Any ideas?



Oleksii Segeda

IT Analyst

Information and Technology Solutions



Re: [MarkLogic Dev General] special character using fn:replace and regex

2017-08-03 Thread Oleksii Segeda
You can use regex:

let $input := "$10,000 value $6,000 , Not able to find other"

return fn:replace($input, "(\$[0-9]+),([0-9]+)", "$1$2")

=>
$10000 value $6000 , Not able to find other
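
For comparison, the same replacement can be sketched in JavaScript (for example, in code that runs under Server-Side JavaScript). String.prototype.replace uses the same $1 and $2 group references in the replacement string. The function name stripThousandsComma is made up for illustration.

```javascript
// Remove the comma between a "$"-prefixed digit run and the digits after it.
// "\$" matches a literal dollar sign; $1 and $2 in the replacement string
// refer to the two capture groups.
function stripThousandsComma(input) {
  return input.replace(/(\$[0-9]+),([0-9]+)/g, "$1$2");
}

const output = stripThousandsComma(
  "$10,000 value $6,000 , Not able to find other"
);
// output is "$10000 value $6000 , Not able to find other"
```

The standalone " , " survives because it has no "$" and digits in front of it. Each match removes only one comma, so an amount such as $1,000,000 would need the replacement applied repeatedly or a different pattern.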




Oleksii Segeda

IT Analyst

Information and Technology Solutions




From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of 
vikas.sin...@cognizant.com
Sent: Thursday, August 3, 2017 9:33 AM
To: general@developer.marklogic.com
Subject: [MarkLogic Dev General] special character using fn:replace and regex

Hi All,

How can I replace a special character using fn:replace? I want to remove the 
"," from both $ values, but the other "," should remain.

Example:
Input : " $10,000 value $6,000 , Not able to find other"

After using the functx library I am able to get "$10" and "$6". Now I want to 
replace the corresponding values "$10,000" and "$6,000".

I am using fn:replace($input,"$10,",$10)
but I am unable to replace "$10," with "$10".

output : $10000 value $6000 , Not able to find other




Re: [MarkLogic Dev General] indexes

2017-07-11 Thread Oleksii Segeda
Hi,


Does this mean that index metadata is stored along with the documents? Was there a 
good reason to do so? Wouldn't it be better to mark this data for deletion and 
delete it during database merges?

I'm asking because reindexing is a pain when you have terabytes of data.


Thanks,


From: general-boun...@developer.marklogic.com 
<general-boun...@developer.marklogic.com> on behalf of John Snelson 
<john.snel...@marklogic.com>
Sent: Tuesday, July 11, 2017 5:58:35 PM
To: general@developer.marklogic.com
Subject: Re: [MarkLogic Dev General] indexes

To reclaim the space used by the range index.

On 11/07/2017 14:11, Oleksii Segeda wrote:
Hi,

Question to ML engineers here. Can someone explain why deletion of a range 
index causes reindexing?

Thanks.

Oleksii Segeda

IT Analyst

Information and Technology Solutions

T +12024736798

E oseg...@worldbankgroup.org

W www.worldbank.org





--
John Snelson, Principal Engineer  http://twitter.com/jpcs
MarkLogic Corporation http://www.marklogic.com


[MarkLogic Dev General] indexes

2017-07-11 Thread Oleksii Segeda
Hi,

Question to ML engineers here. Can someone explain why deletion of a range 
index causes reindexing?

Thanks.

Oleksii Segeda

IT Analyst

Information and Technology Solutions

T +12024736798

E oseg...@worldbankgroup.org

W www.worldbank.org



Re: [MarkLogic Dev General] cts:element-value-match for integers

2017-06-19 Thread Oleksii Segeda
Christopher,

It gives false positives if I use it with cts:element-values. 

Shan,

The rule is to find all values that start with the given value. For example, 200 
should match 200, 2001, 2002, ... 20010, 20020, 2002123, etc.
Are you suggesting that I enumerate all possible combinations? If so, that's not feasible.

As I said, I need something like this (pseudo code):

cts:element-value-match(xs:QName("element"), "200*")

except that I don't have a string range index on that field, but I do have an 
int range index instead.
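
One way to look at the "starts with" rule on an integer index, sketched in plain JavaScript: every integer whose decimal form starts with a given prefix falls into a small number of contiguous ranges, one per number of extra digits, so there is no need to enumerate individual values. The helper name prefixRanges and the 32-bit upper bound are assumptions for illustration. Turning each range into a pair of ">=" and "<=" range queries combined in an or-query is a suggestion, not something confirmed in this thread.

```javascript
// Hypothetical helper: list the inclusive integer ranges whose decimal
// representation starts with `prefix`, up to `max` (default: 32-bit int max).
function prefixRanges(prefix, max = 2147483647) {
  if (!Number.isInteger(prefix) || prefix <= 0) {
    throw new RangeError("prefix must be a positive integer");
  }
  const ranges = [];
  // scale = 1, 10, 100, ... corresponds to 0, 1, 2, ... extra digits.
  for (let scale = 1; prefix * scale <= max; scale *= 10) {
    const low = prefix * scale;
    const high = Math.min(low + scale - 1, max);
    ranges.push([low, high]);
  }
  return ranges;
}

// For 200 this starts with [200, 200], [2000, 2009], [20000, 20099], ...
const rangesFor200 = prefixRanges(200);
```

For a 32-bit index the list is at most ten ranges long, so the resulting query stays small regardless of how many values the index holds.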

Best,
Oleksii.

-Original Message-
From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Shan Jiang
Sent: Monday, June 19, 2017 2:06 PM
To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Subject: Re: [MarkLogic Dev General] cts:element-value-match for integers

What is your exact search rule? From your example, looks like you try to
look for another number by adding a "0". If that is the case, can you run
a cts:or-query, one for 200, and one for 2000?

Shan Jiang
Principal Consultant
MarkLogic Corporation
shan.ji...@marklogic.com
Phone: +1 703 869 4672
www.marklogic.com <http://www.marklogic.com/>






On 6/19/17, 12:59 PM, "general-boun...@developer.marklogic.com on behalf
of Oleksii Segeda" <general-boun...@developer.marklogic.com on behalf of
oseg...@worldbankgroup.org> wrote:

>Hi everyone,
>
>Any thoughts on this?
>
>Oleksii.
>
>
>-Original Message-
>From: Oleksii Segeda
>Sent: Friday, June 16, 2017 6:16 PM
>To: general@developer.marklogic.com
>Subject: cts:element-value-match for integers
>
>Hi everyone,
>
>Can someone explain how does cts:element-value-match work with integer
>indexes? I cannot pass a string as a second argument, so it's unclear how
>to do a wildcarded search.
>Ultimate goal is to find 2000 and 200, if user typed 200. I understand
>that I can create an additional string index, but I want to know if a
>better solution exists.
>
>Thanks.
>
>Oleksii Segeda
>IT Analyst
>Information and Technology Solutions
>www.worldbank.org
>
>
>
>


Re: [MarkLogic Dev General] cts:element-value-match for integers

2017-06-19 Thread Oleksii Segeda
Hi everyone,

Any thoughts on this?

Oleksii.


-Original Message-
From: Oleksii Segeda 
Sent: Friday, June 16, 2017 6:16 PM
To: general@developer.marklogic.com
Subject: cts:element-value-match for integers

Hi everyone,

Can someone explain how cts:element-value-match works with integer indexes? 
I cannot pass a string as a second argument, so it's unclear how to do a 
wildcarded search.
Ultimate goal is to find 2000 and 200, if user typed 200. I understand that I 
can create an additional string index, but I want to know if a better solution 
exists.

Thanks.

Oleksii Segeda
IT Analyst
Information and Technology Solutions
www.worldbank.org






[MarkLogic Dev General] cts:element-value-match for integers

2017-06-16 Thread Oleksii Segeda
Hi everyone,

Can someone explain how cts:element-value-match works with integer indexes? 
I cannot pass a string as a second argument, so it's unclear how to do a 
wildcarded search.
Ultimate goal is to find 2000 and 200, if user typed 200. I understand that I 
can create an additional string index, but I want to know if a better solution 
exists.

Thanks.

Oleksii Segeda
IT Analyst
Information and Technology Solutions
www.worldbank.org






Re: [MarkLogic Dev General] Priorities for queries

2017-05-26 Thread Oleksii Segeda
Hi Gary,

I want to prioritize it because the application can use up to 90% of maximum 
storage IOPS and it’s normal (we have continuous ingestion with a lot of 
writes/full document lookups). Unfortunately, SSD is not an option at this time.

Regards,
Oleksii



From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Gary Vidal
Sent: Thursday, May 25, 2017 6:44 AM
To: general@developer.marklogic.com
Subject: [MarkLogic Dev General] Priorities for queries

Oleksii,

Why would you want to prioritize queries the way you described?  It would not 
make sense to deprioritize disk I/O unless you have some issues 
with disk performance. Consider disk I/O from stand merges to be a natural part 
of doing business in MarkLogic and in any system that does "log level compaction". 
If you are creating documents in bulk and running queries at the same time, there 
are a few techniques you could employ, such as using a "fast-data directory" 
attached to SSD, or figuring out why your disks are slow using the dd command.  Again, 
without knowing your write/read patterns and the cardinalities/shape of your data, 
it's a very hard question to answer correctly.  You may want to look at 
pausing stand merges using blackouts for periods of high query load, but this 
should be done with extreme caution and with your query patterns in mind.  Happy to discuss 
directly with you.  Feel free to email for that discussion.

Regards,

Gary Vidal




Re: [MarkLogic Dev General] Priorities for queries

2017-05-24 Thread Oleksii Segeda
Gary,

Please correct me if I’m wrong, but this will only parallelize queries without 
addressing priorities. This means if one of them creates a lot of disk IO, the 
second one hangs.

Best,
Oleksii



From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Gary Vidal
Sent: Wednesday, May 24, 2017 6:58 AM
To: general@developer.marklogic.com
Subject: [MarkLogic Dev General] Priorities for queries

Oleksii,

Why don't you just create two app servers: one for query traffic and one for admin?

Regards

Gary


Re: [MarkLogic Dev General] Priorities for queries

2017-05-23 Thread Oleksii Segeda
Hi Geert,

It makes sense. I guess on first query we can only return a ticket number, 
which can be used to access results.

Best,
Oleksii

From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Geert Josten
Sent: Tuesday, May 23, 2017 3:25 AM
To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Subject: Re: [MarkLogic Dev General] Priorities for queries

Hi Oleksii,

If you use xdmp:spawn or xdmp:spawn-function, you would be able to use the 
<priority> option. It takes 'normal' and 'higher' as values. These priorities 
have separate queues and worker threads, so they should interfere less with 
each other.

It might also be worth looking into a way to push out low priority work to a 
dedicated host for longer running tasks. You could do that by writing such 
queries to the database, have a schedule running on that particular host 
monitor for such tasks, which picks them up 1 by 1, and writes back results 
once done. It might be easiest to switch around script queries to an 
asynchronous process that polls regularly to see if results have been written. 
Makes sense?
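
The asynchronous pattern described above can be sketched in plain JavaScript. This is an in-memory stand-in only: the Map plays the role of the database, and the names submit, runNextPending and poll are made up for illustration. A real setup would persist tasks as documents and run the worker from a scheduled task on the dedicated host.

```javascript
// In-memory stand-in for the database that would hold queued tasks.
const tasks = new Map();
let nextTicket = 1;

// Called by a script client: store the query and return a ticket at once.
function submit(query) {
  const ticket = nextTicket++;
  tasks.set(ticket, { query, status: "pending", result: null });
  return ticket;
}

// Called on a schedule: pick up one pending task (oldest first),
// run it, and write the result back.
function runNextPending(execute) {
  for (const [ticket, task] of tasks) {
    if (task.status === "pending") {
      task.result = execute(task.query);
      task.status = "done";
      return ticket;
    }
  }
  return null; // nothing left to do
}

// Called by the client when it polls for its result.
function poll(ticket) {
  const task = tasks.get(ticket);
  if (!task) return { status: "unknown" };
  return task.status === "done"
    ? { status: "done", result: task.result }
    : { status: "pending" };
}
```

Because the client only ever receives a ticket on submission, a slow low-priority query never holds a request open on the app server that serves real users.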

Cheers,
Geert

From: <general-boun...@developer.marklogic.com> on behalf of Oleksii Segeda 
<oseg...@worldbankgroup.org>
Reply-To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Date: Monday, May 22, 2017 at 8:59 PM
To: "general@developer.marklogic.com" <general@developer.marklogic.com>
Subject: [MarkLogic Dev General] Priorities for queries

Hi,

Is there a way to give a lower priority to certain queries? We have two 
different types of API consumers - real users and various scripts.
No matter how often scripts are hitting endpoints or how "heavy" are their 
queries, they should not affect API performance for real users.
In other words, scripts are tolerant of high latency, but users are not.

Regards,

Oleksii Segeda

IT Analyst

Information and Technology Solutions

W www.worldbank.org



[MarkLogic Dev General] Priorities for queries

2017-05-22 Thread Oleksii Segeda
Hi,

Is there a way to give a lower priority to certain queries? We have two 
different types of API consumers - real users and various scripts.
No matter how often scripts are hitting endpoints or how "heavy" are their 
queries, they should not affect API performance for real users.
In other words, scripts are tolerant of high latency, but users are not.

Regards,

Oleksii Segeda

IT Analyst

Information and Technology Solutions

W www.worldbank.org



[MarkLogic Dev General] xdmp:parse-dateTime

2017-04-11 Thread Oleksii Segeda
Hi everyone,

The docs say that xdmp:parse-dateTime will not return the correct dateTime 
value for dates before October 15, 1582. What should I use for dates before 
October 15, 1582?

Regards,
Oleksii Segeda

IT Analyst

Information and Technology Solutions



Re: [MarkLogic Dev General] Regular Expressions

2017-03-22 Thread Oleksii Segeda
Hi Erik,

Unfortunately, my entire codebase is in XQuery, but I need to use word boundaries 
and non-capturing groups. So far the only idea that comes to mind is to use 
xdmp:javascript-eval.
I'm curious whether this approach is considered normal practice.

Best,


Oleksii Segeda

IT Analyst

Information and Technology Solutions




From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Erik Hennum
Sent: Wednesday, March 22, 2017 4:58 PM
To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Subject: Re: [MarkLogic Dev General] Regular Expressions

Hi, Oleksii:

Regarding question 2, aside from a few edge cases,
the MarkLogic libraries have the same core implementation
with JavaScript and XQuery interfaces.

The core behavior of functions in the MarkLogic libraries is
(in almost every case) consistent across environments.

If you are working in JavaScript and the regex implementation
from v8 is a good fit for your requirements, you should take
advantage of JavaScript regex objects and methods.
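
A minimal sketch of what that could look like with native JavaScript regular expressions, which support both \b and non-capturing groups. The helper name findWholeWords is made up, and the sketch assumes the words contain only letters and digits, so no regex escaping is done.

```javascript
// Collect every whole-word match of the given alternatives with its offset.
function findWholeWords(text, words) {
  if (words.length === 0) return [];
  // \b is a zero-width word boundary; (?:...) groups without capturing.
  // Assumption: `words` holds only letters/digits, so nothing is escaped.
  const pattern = new RegExp("\\b(?:" + words.join("|") + ")\\b", "g");
  const found = [];
  let match;
  while ((match = pattern.exec(text)) !== null) {
    found.push({ word: match[0], index: match.index });
  }
  return found;
}

// "bar" inside "foobar" is skipped because there is no boundary before it.
const hits = findWholeWords("foo bar bar foobar", ["bar", "baz"]);
```

This runs on the JavaScript regex engine rather than on fn.analyzeString, which follows the XPath regular expression rules discussed elsewhere in this thread.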


Hoping that clarifies,

Erik Hennum


From: general-boun...@developer.marklogic.com 
[general-boun...@developer.marklogic.com] on behalf of Sewell, David R. 
(drs2n) [dsew...@virginia.edu]
Sent: Wednesday, March 22, 2017 1:40 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Regular Expressions

I'm not sure what the answer is to question 2, but for question 1, the answer 
is that MarkLogic's implementation of XPath doesn't support the \b character 
escape because it is not included in the  XPath specification for regular 
expressions, which itself is based on "XML Schema Part 2: Datatypes Second 
Edition". The only single-character escapes are these:

https://www.w3.org/TR/xmlschema-2/#nt-charClassEsc

Some XSLT and XQuery processors support extended regular expressions as a 
proprietary feature (for example, Saxon has a semi-documented extension that 
allows full Java regex), but MarkLogic doesn't (unless there is undocumented 
support that I don't know about).

David

On Mar 22, 2017, at 3:55 PM, Oleksii Segeda <oseg...@worldbankgroup.org> wrote:

Hi everyone,

Quick questions regarding regex in ML:

1.   What's the ML alternative to word boundaries (\b)? It seems that 
fn:analyze-string doesn't support this escape.
2.   Does the JS version of this function (fn.analyzeString) use the JS regex 
engine? If so, why does it give me an error for fn.analyzeString("foo bar bar", 
"\\b(bar)\\b")?

Regards,

Oleksii Segeda

IT Analyst

Information and Technology Solutions









[MarkLogic Dev General] Regular Expressions

2017-03-22 Thread Oleksii Segeda
Hi everyone,

Quick questions regarding regex in ML:


1.   What's the ML alternative to word boundaries (\b)? It seems that 
fn:analyze-string doesn't support this escape.

2.   Does the JS version of this function (fn.analyzeString) use the JS regex 
engine? If so, why does it give me an error for fn.analyzeString("foo bar bar", 
"\\b(bar)\\b")?

Regards,

Oleksii Segeda

IT Analyst

Information and Technology Solutions



[MarkLogic Dev General] transform

2017-02-15 Thread Oleksii Segeda
Hi all,

When I'm trying to use transforms with DocumentManager (Java API), MarkLogic 
gives me this error:

XDMP-MULTIPART-DONE: xdmp:document-load("rest::", /original/4ea94612..) -- 
All parts are already processed

Please advise.

Regards,

Oleksii Segeda

IT Analyst

Information and Technology Solutions




Re: [MarkLogic Dev General] Custom search grammar

2017-02-01 Thread Oleksii Segeda
Hi Erik,

Did you figure out how to extend the grammar?

Regards,
Oleksii Segeda
IT Analyst
Information and Technology Solutions


-Original Message-
From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Oleksii Segeda
Sent: Monday, January 30, 2017 3:09 PM
To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Subject: Re: [MarkLogic Dev General] Custom search grammar

Hi Erik,

Yes, that is the desired behavior. 

Ideally, I would like to avoid custom constraints, simply because search 
grammar looks cleaner in the search box. In addition, some of our users are 
already familiar with simple search operators like AND, OR, so BOOST won't look 
like an alien to them.

I guess postprocessing can be used as you suggested; however, I'm interested 
in a custom search grammar because I may need to extend it further in the future.

Thank you,
Oleksii Segeda
IT Analyst
Information and Technology Solutions


-Original Message-
From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Erik Hennum
Sent: Monday, January 30, 2017 2:42 PM
To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Subject: Re: [MarkLogic Dev General] Custom search grammar

Hi, Oleksii:

Thanks for providing more detail.  

Just to confirm, is it clear that, in a boost query, the right-hand
term is optional?  Documents with only the left-hand term will still
appear in the results though with less relevance than documents
that have both terms.

By contrast, AND-related terms are both required and both 
contribute to relevance.

Anyway, to increase weight, one approach would be to define a tag
for a quoted phrase and pass the phrase to a Search API custom 
constraint or to cts:parse() with a binding to a query generator function:

http://docs.marklogic.com/guide/search-dev/cts_query#id_13456

The custom code could then tokenize the phrase and combine the
terms with a boost-query or and-query, adding appropriate weight.

Another approach would be to do postprocessing of the query tree
returned by cts:parse() or search:parse() to replace the default 
boost-query or and-query with a query that has more weight.

In either approach, you would then search on the query.

I mention cts:parse() because it parses query text more quickly
than search:parse().
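
For illustration, the postprocessing approach might look roughly like the 
following. This is a minimal sketch, not a tested implementation: the helper 
names local:reweight and local:raise-boost are hypothetical, the weight 10.0 
is an arbitrary example value, and it assumes a MarkLogic version that provides 
cts:parse() and that the boosting side is a single word query.

```xquery
xquery version "1.0-ml";

(: Sketch only: local:reweight and local:raise-boost are hypothetical helper
   names, and 10.0 is an arbitrary example weight. :)

(: Rebuild a word query with a new weight; leave any other query type as is. :)
declare function local:reweight($q as cts:query, $weight as xs:double) as cts:query {
  typeswitch ($q)
    case cts:word-query return
      cts:word-query(cts:word-query-text($q), cts:word-query-options($q), $weight)
    default return $q
};

(: Replace a top-level boost query with one whose boosting side is re-weighted. :)
declare function local:raise-boost($q as cts:query, $weight as xs:double) as cts:query {
  typeswitch ($q)
    case cts:boost-query return
      cts:boost-query(
        cts:boost-query-matching-query($q),
        local:reweight(cts:boost-query-boosting-query($q), $weight))
    default return $q
};

(: Parse the user's text, raise the boost weight, then list the top hits with scores. :)
let $query := local:raise-boost(cts:parse("trade BOOST water"), 10.0)
for $doc in cts:search(fn:doc(), $query)[1 to 10]
return fn:concat(xdmp:node-uri($doc), " ", cts:score($doc))
```

Only a boost query at the top of the tree is rewritten here; a boost nested 
inside an and-query or or-query would need a recursive walk over the sub-queries.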


Hoping that helps,

Erik Hennum


From: general-boun...@developer.marklogic.com 
[general-boun...@developer.marklogic.com] on behalf of Oleksii Segeda 
[oseg...@worldbankgroup.org]
Sent: Monday, January 30, 2017 10:55 AM
To: general@developer.marklogic.com
Subject: Re: [MarkLogic Dev General] Custom search grammar

Hi Erik,

I'm trying to boost some parts of the search query. For example, if a user types 
`trade BOOST water`, I want documents containing the word "water" to rank higher 
in the results.
cts:boost-query seems to be a perfect fit, but the default BOOST operator doesn't 
let you specify weights.

My ultimate goal is to convert `trade BOOST water` to something like this:

cts:boost-query(cts:word-query("trade"), cts:word-query("water", (), 10.0) )

Regards,
Oleksii Segeda
IT Analyst
Information and Technology Solutions


Re: [MarkLogic Dev General] Custom search grammar

2017-01-30 Thread Oleksii Segeda
Hi Erik,

Yes, that is the desired behavior.

Ideally, I would like to avoid custom constraints, simply because the search 
grammar looks cleaner in the search box. In addition, some of our users are 
already familiar with simple search operators like AND and OR, so BOOST won't 
look alien to them.

I guess postprocessing could be used as you suggested; however, I'm interested 
in a custom search grammar because I may need to extend it further in the future.

Thank you,
Oleksii Segeda
IT Analyst
Information and Technology Solutions


[MarkLogic Dev General] Custom search grammar

2017-01-30 Thread Oleksii Segeda
Hi there,

I'm trying to declare a custom search grammar. I declared a custom function via 
search options, which is supposed to parse the "BOOST" keyword:

<search:joiner ... ns="http://worldbankgroup.org/search/grammar" 
at="/lib/grammar-boost.xqy" tokenize="word">BOOST</search:joiner>

I declared this function and simply copied the existing implementation of the 
impl:joiner-boost function in /MarkLogic/appservices/search/search-impl.xqy:

declare function grammar:custom-boost($ps as map:map, $left as element()?, 
    $opts as element()?) as schema-element(cts:query) {
  let $symbol := impl:symbol-lookup($ps)
  let $_ := tdop:advance($ps)
  let $expr1 := tdop:expression($ps, $symbol/@strength)
  return
    if (empty($left))
    then ($left, impl:msg($ps, ))
    else
      element { xs:QName($symbol/@element) } {
        attribute qtextjoin {concat($symbol/string())},
        attribute strength {$symbol/@strength},
        attribute qtextgroup { 
          impl:opts($ps)/opt:grammar/opt:starter[@apply eq "grouping"]/(string(), 
          @delimiter/string()) },
        for $opt in $symbol/@options/tokenize(normalize-space(.), "\s") return {$opt},
        element cts:matching-query {
          attribute qtextref { "schema-element(cts:query)" },
          $left },
        element cts:boosting-query {
          attribute qtextref { "schema-element(cts:query)" },
          $expr1 }
      }
};

Unfortunately this doesn't work, because for some reason impl:symbol-lookup 
returns an empty sequence.
Any ideas what went wrong here?


Oleksii Segeda

IT Analyst

Information and Technology Solutions
