Dear Lucas,

Judging from your command, I think your input file contains an XML start tag
"<row _id>" and a corresponding end tag "</row _id>". Unfortunately, XML tag
names may not contain spaces (see also:
https://www.w3.org/TR/2008/REC-xml-20081126/#NT-Name).

MLCP tries to parse the XML file and reports an unexpected character, ">".
It takes "_id" to be an attribute name on the "row" element, as in
<row _id="1234">, so the next character after "_id" is expected to be an
equals sign.
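
To make the parsing point concrete, here is a minimal sketch using Python's
standard xml.etree.ElementTree rather than MLCP itself. The element name
"row" is taken from your command and the value "1234" is only an example;
the helper name is_well_formed is made up for illustration, and the error
MLCP prints will be worded differently.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as well-formed XML, else False."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# A space inside the tag name: the parser reads "row" as the element name
# and "_id" as an attribute name, then fails because no "=" follows it.
bad = '<row _id>1234</row _id>'

# The same token used as a real attribute is accepted.
good = '<row _id="1234">some content</row>'

print("space in tag name :", is_well_formed(bad))
print("_id as attribute  :", is_well_formed(good))
```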

I would advise you to ask for the file to be delivered in accordance with
the XML specification, rather than trying to fix the document yourself.
Otherwise, I fear, you will be forced to use sed, or something similar, to
replace the malformed XML tags throughout the entire document each and every
time you receive a new version.
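
If you do end up repairing the file yourself, the substitution could look
roughly like the sketch below. It is only an illustration under assumptions
I cannot verify from here: that the malformed tags are spelled exactly
"<row _id>" and "</row _id>", that "row_id" is an acceptable replacement
name, and that the file is UTF-8 text. The function names are hypothetical.

```python
import re

# Matches only the bare tags "<row _id>" and "</row _id>" (assumed spelling).
# A well-formed tag such as <row _id="1234"> is left alone.
_BAD_TAG = re.compile(r'<(/?)row _id>')


def repair_line(line: str) -> str:
    """Rewrite '<row _id>' and '</row _id>' as '<row_id>' and '</row_id>'."""
    return _BAD_TAG.sub(r'<\1row_id>', line)


def repair_file(src_path: str, dst_path: str) -> None:
    """Copy src_path to dst_path line by line, repairing the tags on the way."""
    with open(src_path, encoding='utf-8') as src, \
            open(dst_path, 'w', encoding='utf-8') as dst:
        for line in src:
            dst.write(repair_line(line))


print(repair_line('<row _id>1234</row _id>'))
```

After such a repair the element name would contain no space, so it could be
passed to -uri_id without any quoting or escaping. You would still have to
rerun it for every new delivery, which is why I would prefer a fix at the
source.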

Kind regards,



*Martijn Sintemaartensdijk*






*A:* Einsteinbaan 12, 3439 NJ Nieuwegein

*T:* (+31) 06 40 59 09 36

*E:* [email protected]

*W:* www.dikw.nl





On 21 March 2017 at 19:02, Lucas Davenport <[email protected]> wrote:

> I am a newb, so forgive me if I missed this answer while searching.
>
> I am testing ML 8 for a project at work and we have a requirement to load
> large amounts of historical data. I've read the mlcp documentation and can
> successfully import some test data, but the problem I am facing is the
> archive data has a space in the record identifier.
>
> My command is:
>  mlcp.sh import -host localhost -port 8006 -username dataload -password
> dataload -mode local -input_file_path ../xml/MD2014aggregate.xml
> -input_file_type aggregates -aggregate_record_element row -uri_id "row _id"
> -output_uri_prefix /traffic/MD -output_uri_suffix .xml -output_collections
> published
>
> This produces the following error:
> 17/03/21 13:49:20 ERROR contentpump.ContentPump: Unrecognized argument:
> \_id
>
> I've escaped both the space and the underscore (row\ _id and row\ \_id)
> and still get the same error. I've also wrapped it in single quotes and
> double quotes.
>
> I'm trying to keep from having to use sed to remove the space between row
> and _id in the entire file.
>
> Is there a way to make mlcp see the URI_ID literally as "row _id"?
>
> Thanks in advance.
>
> _______________________________________________
> General mailing list
> [email protected]
> Manage your subscription at:
> http://developer.marklogic.com/mailman/listinfo/general
>
>