Actually, I need to retract my confirmation of an error in -document_selector. The problem was not with MLCP's XPath parser but with my XPath (although the error diagnostic from MLCP didn't indicate that). I originally tried

  -document_selector '/root[descendant::*:hid[starts-with(., "AK")]'

but left off a right bracket at the end of the expression. This works:

  -document_selector '/root[descendant::*:hid[starts-with(., "AK")]]'

returning the expected document set. So at least some types of predicate expressions are working OK.
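
Since MLCP's diagnostic did not point at the missing bracket, a quick check before running the export might have caught it. The following is only a minimal sketch: check_brackets is a hypothetical helper, not part of MLCP, and it simply compares the number of '[' and ']' characters, so it would be fooled by brackets inside string literals.

```shell
# Hypothetical helper (not part of MLCP): succeed only if the
# expression has as many '[' as ']' characters. Naive: brackets
# inside string literals are counted too.
check_brackets() {
  opens=$(printf '%s' "$1" | tr -cd '[')    # keep only '[' characters
  closes=$(printf '%s' "$1" | tr -cd ']')   # keep only ']' characters
  [ "${#opens}" -eq "${#closes}" ]          # compare the two counts
}
```

It could be used as a guard, for example: check_brackets "$selector" || echo "unbalanced brackets in selector"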

David

On Thu, 2 Apr 2015, Geert Josten wrote:

I recommend putting the arguments in a text file, and pointing to it with
the -options_file argument. It looks like your XPath gets chunked on the
spaces.
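
A rough sketch of that approach, assuming the options file takes one option or value per line. The file name, host, port, credentials and output path below are placeholders rather than values from this thread, and the sketch assumes MLCP passes the double quotes inside the selector through unchanged.

```shell
# Write a hypothetical options file, one option or value per line.
# The quoted heredoc delimiter keeps the shell from touching the XPath.
cat > export-options.txt <<'EOF'
-host
localhost
-port
8000
-username
admin
-password
admin
-output_file_path
/tmp/export
-document_selector
/root[descendant::*:hid[starts-with(., "AK")]]
EOF

# Then, assuming mlcp.sh is on the PATH (not run here):
#   mlcp.sh export -options_file export-options.txt
```

Because the selector sits on a line of its own in the file, the shell never gets a chance to split it on spaces.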

Cheers

On 4/2/15, 4:26 PM, "David Sewell" <[email protected]> wrote:

I can also confirm that MLCP's -document_selector filter doesn't like
XPath predicates. For example, trying to return documents matching a
particular metadata value only:

 -document_selector '/root[descendant::*:hid[starts-with(., "AK")]'

fails with 'ERROR contentpump.ContentPump: Unrecognized argument: "AK")]'

The older XQSync program *is* able to use more complex XPath to select
documents. Its version of the above filter,

 -DINPUT_QUERY="collection()[descendant::*:hid[starts-with(., 'AK')]]/base-uri()"

works perfectly.

Is there a reason for the limitation in MLCP?

David

On Mon, 30 Mar 2015, Markus Flatscher wrote:

According to the mlcp docs, the value passed into -document_selector
"specifies an XPath expression used to select which documents are
exported from the database."

However, using mlcp-1.2-2, this only seems to work for simple XPaths.
Any comparison operators or even parentheses in the XPath throw errors
for me:

mlcp export [...] -document_selector '/root[child/child = "1"]'

(ERROR contentpump.ContentPump: Unrecognized argument: =)

Does anyone have a working example for an mlcp export with
-document_selector and a complex XPath expression?

Thanks,

Markus



--
David Sewell, Editorial and Technical Manager
ROTUNDA, The University of Virginia Press
PO Box 400314, Charlottesville, VA 22904-4314 USA
Email: [email protected]   Tel: +1 434 924 9973
Web: http://rotunda.upress.virginia.edu/

_______________________________________________
General mailing list
[email protected]
Manage your subscription at:
http://developer.marklogic.com/mailman/listinfo/general

