As Mike indicated, there is currently no way to extend mlcp (Content Pump) with 
the ability to validate documents as they arrive, or do any other arbitrary 
processing as part of a loading task. We’re looking at how we might support 
this in a future release, so additional input would be appreciated. 

Information Studio supports schema validation today. In an Information Studio 
flow you can configure a transformation step to validate against a Schemas 
database as documents arrive. See the docs at 
<http://docs.marklogic.com/guide/infostudio/loadingContent>. Unlike mlcp, 
Information Studio cannot distribute the actual loading across many nodes, but 
it provides a convenient user interface and the ability to build custom 
collectors and transformation steps. If you’re set on mlcp, I think your best 
bet is either pre-processing in some other tool or implementing a CPF (or 
lower-level trigger-based) approach.
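
For the pre-processing route, here is a minimal sketch of the idea, assuming 
the documents sit in a local directory and the schema is a single XSD file on 
disk. It uses the JDK's standard javax.xml.validation API. The class name 
PreloadValidator and its filter method are made up for illustration and are 
not part of mlcp. It copies only the files that validate into a separate 
directory.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical pre-load filter (not part of mlcp): validates each *.xml file
// against an XSD and copies only the valid ones to a staging directory.
class PreloadValidator {

    // Returns true if the file validates against the compiled schema.
    static boolean isValid(Schema schema, Path xml) throws IOException {
        // Validator instances are not thread-safe, so create one per file.
        Validator validator = schema.newValidator();
        try {
            validator.validate(new StreamSource(xml.toFile()));
            return true;
        } catch (SAXException e) {
            // Thrown for malformed XML as well as for schema violations.
            System.err.println("INVALID " + xml + ": " + e.getMessage());
            return false;
        }
    }

    // Copies valid files from inputDir to validDir; returns the number rejected.
    static int filter(Path xsd, Path inputDir, Path validDir)
            throws IOException, SAXException {
        Schema schema = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(xsd.toFile());
        Files.createDirectories(validDir);
        int rejected = 0;
        try (DirectoryStream<Path> files =
                     Files.newDirectoryStream(inputDir, "*.xml")) {
            for (Path file : files) {
                if (isValid(schema, file)) {
                    Files.copy(file, validDir.resolve(file.getFileName()),
                            StandardCopyOption.REPLACE_EXISTING);
                } else {
                    rejected++;
                }
            }
        }
        return rejected;
    }
}
```

You would call PreloadValidator.filter(xsd, inputDir, validDir) and then point 
mlcp at validDir. Note that this validates against a schema file on disk, not 
against the Schemas database, so the two would need to be kept in sync.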

Justin

Justin Makeig
Director, Product Management
MarkLogic Corporation
[email protected]
Phone: +1 650 655 2387
www.marklogic.com



On Nov 12, 2012, at 10:24 AM, Michael Blakeley <[email protected]> wrote:

> I don't see anything relevant at 
> http://docs.marklogic.com/guide/ingestion/content-pump - but mlcp is designed 
> to work with Hadoop. Possibly you could validate the XML in Hadoop tasks? 
> Also, mlcp is open source, so you could always patch it to do what you want.
> 
> RecordLoader would do this using a CONTENT_MODULE_URI written in XQuery, and 
> invoked via XCC or HTTP requests. See 
> http://marklogic.github.com/recordloader/ for details.
> 
> Since we know from your other email that you are thinking of using CPF, you 
> might also consider using the CPF validation pipeline.
> 
> -- Mike
> 
> On 12 Nov 2012, at 01:39 , sini narayanan <[email protected]> wrote:
> 
>> Hi,
>> 
>> I have a requirement to use Content Pump to load files into the MarkLogic 
>> database. While loading content, I need to make sure that each input XML 
>> file conforms to the schema. Is it possible to perform strict schema 
>> validation on the XML files while inserting them through Content Pump?
>> 
>> Please help…
>> 
>> 
>> 
>> Thanks,
>> 
>> Sini
>> 
>> _______________________________________________
>> General mailing list
>> [email protected]
>> http://developer.marklogic.com/mailman/listinfo/general