It looks like MLCP understands XPath 3.0 braced URI literals. E.g. given a namespace "my.name.space" and an element with local name "lastname", this works:

  -document_selector collection()[descendant::Q{my.name.space}lastname[. eq "Smith"]]

The XPath parser is still a bit buggy, I'd say, even with the options put in an options file per Geert. Using single quotes instead of double quotes in the preceding example causes MLCP to throw a syntax error.
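
For anyone trying this, here is a minimal sketch of writing that selector to an options file. The file name export-options.txt is made up for illustration, and the layout (option name on one line, its value on the next) is my understanding of the options-file syntax in the mlcp documentation, so treat it as an assumption to verify against your version.

```shell
# Write the option name and its value on separate lines.
# The quoted heredoc delimiter ('EOF') stops the shell from expanding
# or re-splitting anything inside the XPath.
cat > export-options.txt <<'EOF'
-document_selector
collection()[descendant::Q{my.name.space}lastname[. eq "Smith"]]
EOF

# The file would then be passed to mlcp, connection options elided:
#   mlcp.sh export -options_file export-options.txt ...
```

Because the XPath never passes through shell quoting, the spaces and double quotes inside the predicate reach mlcp unchanged.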

David

On Thu, 2 Apr 2015, Markus Flatscher wrote:

Follow-up question: can mlcp's EXPORT with -document_selector handle
namespaces? Looking at the documentation, I can see namespace functionality
for import transforms and for aggregate records, but don't see how one
would pass a namespace into a -document_selector.

Markus

On Thu, Apr 2, 2015 at 10:42 AM, Markus Flatscher <[email protected]> wrote:

And I can confirm that Geert's suggestion works like a charm for complex
XPaths including comparison operators.

Thanks, Geert!

Markus

On Thu, Apr 2, 2015 at 10:39 AM, David Sewell <[email protected]> wrote:

Actually, I need to retract my confirmation of an error in
-document_selector. The problem was not with MLCP's XPath parser but with
my XPath (although the error diagnostic from MLCP didn't indicate that). I
originally tried

  -document_selector '/root[descendant::*:hid[starts-with(., "AK")]'

but left off a right bracket at the end of the expression. This works:

  -document_selector '/root[descendant::*:hid[starts-with(., "AK")]]'

returning the expected document set. So at least some types of predicate
expressions are working OK.

David


On Thu, 2 Apr 2015, Geert Josten wrote:

I recommend putting the arguments in a text file and pointing to it with
the -options_file argument. It looks like your XPath gets chunked on the
spaces.

Cheers

On 4/2/15, 4:26 PM, "David Sewell" <[email protected]> wrote:

I can also confirm that MLCP's -document_selector filter doesn't like XPath
predicates. For example, trying to return documents matching a particular
metadata value only:

 -document_selector '/root[descendant::*:hid[starts-with(., "AK")]'

fails with 'ERROR contentpump.ContentPump: Unrecognized argument: "AK")]'

The older XQSync program *is* able to use more complex XPath to select
documents. Its version of the above filter,

 -DINPUT_QUERY="collection()[descendant::*:hid[starts-with(., 'AK')]]/base-uri()"

works perfectly.

Is there a reason for the limitation in MLCP?

David

On Mon, 30 Mar 2015, Markus Flatscher wrote:

According to the mlcp docs, the value passed into -document_selector
"specifies an XPath expression used to select which documents are exported
from the database."

However, using mlcp-1.2-2, this only seems to work for simple XPaths. Any
comparison operators or even parentheses in the XPath throw errors for me:

mlcp export [...] -document_selector '/root[child/child = "1"]'


 (ERROR contentpump.ContentPump: Unrecognized argument: =)

Does anyone have a working example for an mlcp export with
-document_selector and a complex XPath expression?

Thanks,

Markus



--
David Sewell, Editorial and Technical Manager
ROTUNDA, The University of Virginia Press
PO Box 400314, Charlottesville, VA 22904-4314 USA
Email: [email protected]   Tel: +1 434 924 9973
Web: http://rotunda.upress.virginia.edu/


_______________________________________________
General mailing list
[email protected]
Manage your subscription at:
http://developer.marklogic.com/mailman/listinfo/general


--
Markus Flatscher
Consultant | Avalon Consulting, LLC
[email protected]

LinkedIn: http://www.linkedin.com/company/avalon-consulting-llc
Google+: http://www.google.com/+AvalonConsultingLLC
Twitter:    https://twitter.com/avalonconsult





