Dear Lucas,

Judging from your command, I think your input file contains an XML start tag "<uri _id>" and a corresponding end tag "</uri _id>". Unfortunately, XML tag names may not contain spaces (see also: https://www.w3.org/TR/2008/REC-xml-20081126/#NT-Name).
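If you want to confirm this quickly, a minimal sketch along the following lines should do. It uses Python's standard xml.etree.ElementTree parser rather than MLCP itself, and the three fragments are made-up stand-ins, since I have not seen your actual file. The helper name is_well_formed is my own.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as well-formed XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# Hypothetical fragments; the real structure of the input file is an assumption.
with_space = "<uri _id>1234</uri _id>"       # space inside the tag name
as_attribute = '<uri _id="1234">text</uri>'  # _id written as an attribute of uri
renamed = "<uri_id>1234</uri_id>"            # space removed from the tag name

for label, fragment in [
    ("with space", with_space),
    ("as attribute", as_attribute),
    ("renamed", renamed),
]:
    print(f"{label}: well-formed = {is_well_formed(fragment)}")
```

Only the first fragment should be rejected. Any conforming XML parser is expected to behave the same way, which is why quoting or escaping the name on the command line is unlikely to help.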
MLCP tries to parse the XML file and reports an unexpected character, ">". It takes "_id" to be an attribute name on the tag "uri", as in <uri _id="1234">, so it expects the next character after "_id" to be an equals sign.

I would advise you to request that the output file be delivered in accordance with the XML specification, rather than trying to fix the document yourself. Otherwise, I fear, you will be forced to use sed, or something similar, to replace the malformed XML tags throughout the entire document each and every time you receive a new version.

Kind regards,

Martijn Sintemaartensdijk
A: Einsteinbaan 12, 3439 NJ Nieuwegein
T: (+31) 06 40 59 09 36
E: [email protected]
W: www.dikw.nl

On 21 March 2017 at 19:02, Lucas Davenport <[email protected]> wrote:

> I am a newb, so forgive me if I missed this answer while searching.
>
> I am testing ML 8 for a project at work and we have a requirement to load
> large amounts of historical data. I've read the mlcp documentation and can
> successfully import some test data, but the problem I am facing is the
> archive data has a space in the record identifier.
>
> My command is:
> mlcp.sh import -host localhost -port 8006 -username dataload -password
> dataload -mode local -input_file_path ../xml/MD2014aggregate.xml
> -input_file_type aggregates -aggregate_record_element row -uri_id "row _id"
> -output_uri_prefix /traffic/MD -output_uri_suffix .xml -output_collections
> published
>
> This produces the following error:
> 17/03/21 13:49:20 ERROR contentpump.ContentPump: Unrecognized argument:
> \_id
>
> I've escaped both the space and the underscore (row\ _id and row\ \_id)
> and still get the same error. I've also wrapped it in single quotes and
> double quotes.
>
> I'm trying to keep from having to use sed to remove the space between row
> and _id in the entire file.
>
> Is there a way to make mlcp see the URI_ID literally as "row _id"?
>
> Thanks in advance.
>
> _______________________________________________
> General mailing list
> [email protected]
> Manage your subscription at:
> http://developer.marklogic.com/mailman/listinfo/general
