It looks like MLCP understands XPath 3.0 braced URI literals. E.g. given a
namespace "my.name.space" and an element with local name "lastname", this works:
-document_selector collection()[descendant::Q{my.name.space}lastname[. eq "Smith"]]
The XPath parser is still a bit buggy, I'd say, even with the options put in an
options file per Geert. Using single quotes instead of double quotes in the
preceding example causes MLCP to throw a syntax error.
David
On Thu, 2 Apr 2015, Markus Flatscher wrote:
Follow-up question: can mlcp's EXPORT with -document_selector handle
namespaces? Looking at the documentation, I can see namespace functionality
for import transforms and for aggregate records, but don't see how one
would pass a namespace into a -document_selector.
Markus
On Thu, Apr 2, 2015 at 10:42 AM, Markus Flatscher <[email protected]> wrote:
And I can confirm that Geert's suggestion works like a charm for complex
XPaths including comparison operators.
Thanks, Geert!
Markus
On Thu, Apr 2, 2015 at 10:39 AM, David Sewell <[email protected]> wrote:
Actually, I need to retract my confirmation of an error in
-document_selector. The problem was not with MLCP's XPath parser but with
my XPath (although the error diagnostic from MLCP didn't indicate that). I
originally tried
-document_selector '/root[descendant::*:hid[starts-with(., "AK")]'
but left off a right bracket at the end of the expression. This works:
-document_selector '/root[descendant::*:hid[starts-with(., "AK")]]'
returning the expected document set. So at least some types of predicate
expressions are working OK.
David
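Since MLCP's diagnostic did not point at the missing bracket, a quick pre-flight check on the selector can catch this kind of slip before the export runs. The following is a minimal sketch; check_brackets is a hypothetical helper name, and it only compares counts of "[" and "]", so it would be fooled by brackets inside string literals.

```shell
# check_brackets EXPR: succeed only if EXPR contains as many "[" as "]".
# Naive by design: it counts characters and does not parse the XPath.
check_brackets() {
  opens=$(printf '%s' "$1" | tr -cd '[')    # keep only the "[" characters
  closes=$(printf '%s' "$1" | tr -cd ']')   # keep only the "]" characters
  [ "${#opens}" -eq "${#closes}" ]
}

# The expression that was first tried (one "]" short):
check_brackets '/root[descendant::*:hid[starts-with(., "AK")]' \
  || echo "unbalanced brackets"

# The corrected expression:
check_brackets '/root[descendant::*:hid[starts-with(., "AK")]]' \
  && echo "balanced"
```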
On Thu, 2 Apr 2015, Geert Josten wrote:
I recommend putting the arguments in a text file, and pointing to it with
the -options_file argument. It looks like your XPath gets chunked on the
spaces.
Cheers
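A minimal sketch of that suggestion, assuming the mlcp convention that an options file holds each option name and each value on its own line, and that the flag is spelled -options_file as in the mlcp documentation. The file name, host, port, credentials and output path are hypothetical placeholders, not values from this thread.

```shell
# Write the export options to a file. Because the XPath sits on a line of
# its own, the shell never gets a chance to split it on spaces.
# The heredoc delimiter is quoted ('EOF') so nothing inside is expanded.
cat > export-options.txt <<'EOF'
-host
localhost
-port
8000
-username
user
-password
secret
-output_file_path
/tmp/mlcp-export
-document_selector
/root[descendant::*:hid[starts-with(., "AK")]]
EOF

# Then run the export (not executed here; requires an mlcp installation):
#   mlcp.sh export -options_file export-options.txt
```

The string literal inside the XPath uses double quotes, matching the form reported to work elsewhere in this thread.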
On 4/2/15, 4:26 PM, "David Sewell" <[email protected]> wrote:
I can also confirm that MLCP's -document_selector filter doesn't like XPath
predicates. For example, trying to return documents matching a particular
metadata value only:
-document_selector '/root[descendant::*:hid[starts-with(., "AK")]'
fails with 'ERROR contentpump.ContentPump: Unrecognized argument: "AK")]'
The older XQSync program *is* able to use more complex XPath to select
documents. Its version of the above filter,
-DINPUT_QUERY="collection()[descendant::*:hid[starts-with(., 'AK')]]/base-uri()"
works perfectly.
Is there a reason for the limitation in MLCP?
David
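The stray "AK")] in that error message is what one would expect if the selector were split at its single space somewhere between the command line and MLCP's argument parser. Whether the mlcp wrapper script actually re-expands its arguments without quotes is an assumption here, not something established in this thread; the sketch below only demonstrates the shell behaviour that would produce such a fragment. print_args and selector are hypothetical names.

```shell
# Disable pathname expansion so the "*" and "[" in the XPath stay literal
# when the variable is expanded without quotes below.
set -f

# print_args shows each argument it receives on its own line, in angle brackets.
print_args() { printf '<%s>\n' "$@"; }

# The selector from the failing command.
selector='/root[descendant::*:hid[starts-with(., "AK")]'

# Quoted expansion: the XPath arrives as a single argument.
print_args -document_selector "$selector"

# Unquoted expansion: the shell splits the value at the space after "(.,",
# so the final argument is the fragment named in the error message.
print_args -document_selector $selector
```

A script that forwards its arguments with "$@" preserves the quoting; one that forwards them with an unquoted $* or $@ does not.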
On Mon, 30 Mar 2015, Markus Flatscher wrote:
According to the mlcp docs, the value passed into -document_selector
"specifies an XPath expression used to select which documents are exported
from the database."
However, using mlcp-1.2-2, this only seems to work for simple XPaths. Any
comparison operators or even parentheses in the XPath throw errors for me:
mlcp export [...] -document_selector '/root[child/child = "1"]'
(ERROR contentpump.ContentPump: Unrecognized argument: =)
Does anyone have a working example for an mlcp export with
-document_selector and a complex XPath expression?
Thanks,
Markus
--
David Sewell, Editorial and Technical Manager
ROTUNDA, The University of Virginia Press
PO Box 400314, Charlottesville, VA 22904-4314 USA
Email: [email protected] Tel: +1 434 924 9973
Web: http://rotunda.upress.virginia.edu/
_______________________________________________
General mailing list
[email protected]
Manage your subscription at:
http://developer.marklogic.com/mailman/listinfo/general
--
Markus Flatscher
Consultant | Avalon Consulting, LLC
[email protected]
LinkedIn: http://www.linkedin.com/company/avalon-consulting-llc
Google+: http://www.google.com/+AvalonConsultingLLC
Twitter: https://twitter.com/avalonconsult