Hi Santhosh, If you look in the CPF documentation (http://developer.marklogic.com/pubs/4.0/books/cpf.pdf), chapter 9 describes the default conversion option.
You cannot just specify an xsd file and have it transform your document to that schema. You have to write code to tell it how to transform it. If you install the default conversion option (even without the conversion license), you can still convert html documents (to xhtml and simplified docbook formats). That will give you a pretty good idea of the output it creates with word and pdf documents.

-Danny

From: [email protected] [mailto:[email protected]] On Behalf Of Santhosh Raj
Sent: Sunday, April 26, 2009 10:13 PM
To: General Mark Logic Developer Discussion
Subject: RE: [MarkLogic Dev General] How to convert pdf / doc files to xml when storying into marklogic - reg.,

Hi Danny,

Thanks for your reply, I can understand your point. What I need to know is, if I am using the conversion license:

1) How/where can I specify an xsd that the generated (converted) xml file should follow?
2) Under what name will the xml file be stored?
3) Please send me some sample files that you have converted (i.e., pdf/doc, xhtml, xml files):
   Original pdf/doc file
   Generated xhtml file
   Generated docbook xml file

Thanks in advance.
Santhosh Rajasekaran

Danny Sokolsky <[email protected]>
Sent by: [email protected]
04/24/2009 09:32 PM
Please respond to: General Mark Logic Developer Discussion <[email protected]>
To: General Mark Logic Developer Discussion <[email protected]>
cc:
Subject: RE: [MarkLogic Dev General] How to convert pdf / doc files to xml when storying into marklogic - reg.,

Hi Santhosh,

You need to have a conversion license to run the pdf or office conversion built-in XQuery functions; it is not included in the community license. Once you have the license, the conversion built-ins convert the binary files (pdf, word, and so on) to XHTML, and then there is a CPF process to clean up the XML and produce docbook. If you wanted to transform it at that point to some other structure, you could write some code to perform that transformation.
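
As a rough illustration of what "write some code to perform that transformation" can look like, here is a minimal sketch that maps converted XHTML onto a different element vocabulary. It is written in Python and runs outside MarkLogic purely to show the idea; inside the server the same mapping would be written in XQuery. The target element names (`article`, `title`, `para`), the `TAG_MAP` table, and the helper names are all hypothetical, not anything the conversion pipeline defines.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from XHTML element names to a target vocabulary.
TAG_MAP = {"h1": "title", "p": "para"}


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def xhtml_to_article(xhtml_text):
    """Build an <article> element from the h1 and p elements of an XHTML string."""
    root = ET.fromstring(xhtml_text)
    article = ET.Element("article")
    for elem in root.iter():  # document order, root included
        target = TAG_MAP.get(local_name(elem.tag))
        if target is None:
            continue  # elements without a mapping are skipped
        child = ET.SubElement(article, target)
        # itertext() flattens inline markup such as <b> into plain text
        child.text = "".join(elem.itertext()).strip()
    return article


sample = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    "<h1>Report</h1><p>First <b>bold</b> paragraph.</p>"
    "</body></html>"
)
print(ET.tostring(xhtml_to_article(sample), encoding="unicode"))
# prints <article><title>Report</title><para>First bold paragraph.</para></article>
```

The point of the sketch is that the target structure comes from explicit mapping rules in code, not from handing the converter an xsd; a schema could then be used to validate the result.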
-Danny

From: [email protected] [mailto:[email protected]] On Behalf Of Santhosh Raj
Sent: Friday, April 24, 2009 5:03 AM
To: General Mark Logic Developer Discussion
Subject: [MarkLogic Dev General] How to convert pdf / doc files to xml when storying into marklogic - reg.,

Hi all,

I have only the community version of MarkLogic Server. In the community version we can't convert pdf/doc files to xml format; they are stored as binary files.

1) If we want to convert a doc/pdf file to xml and then store it in MarkLogic, what should we do?
2) Can we specify an xsd (schema file) for the conversion to follow? If not, and MarkLogic itself uses a schema, which schema does it use?

It would be very useful if you could give me a sample doc/pdf, the schema file, the converted xml file, and the steps to follow to convert a pdf/doc file to xml.

Thanks and Regards,
Santhosh Rajasekaran
Tata Consultancy Services
Mailto: [email protected]
Website: http://www.tcs.com
____________________________________________
Experience certainty. IT Services Business Solutions Outsourcing
____________________________________________
=====-----=====-----=====
Notice: The information contained in this e-mail message and/or attachments to it may contain confidential or privileged information. If you are not the intended recipient, any dissemination, use, review, distribution, printing or copying of the information contained in this e-mail message and/or attachments to it are strictly prohibited. If you have received this communication in error, please notify us by reply e-mail or telephone and immediately and permanently delete the message and any attachments.
Thank you
_______________________________________________
General mailing list
[email protected]
http://xqzone.com/mailman/listinfo/general
