Hi Santhosh,

If you look in the CPF documentation 
(http://developer.marklogic.com/pubs/4.0/books/cpf.pdf), chapter 9 describes 
the default conversion option.

You cannot just specify an XSD file and have it transform your document to 
that schema. You have to write code to tell it how to transform it.
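
As a rough illustration of what "writing code to transform it" can look like, 
here is a minimal sketch that maps XHTML output onto a target structure. It is 
plain Python using only the standard library, not XQuery and not a MarkLogic 
API. The function name xhtml_to_article and the target elements <article>, 
<title> and <para> are hypothetical stand-ins for whatever your own schema 
requires.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def xhtml_to_article(xhtml_text):
    """Map a small XHTML document onto a hypothetical <article> structure."""
    root = ET.fromstring(xhtml_text)
    article = ET.Element("article")

    # Use the first <h1> as the article title, if there is one.
    h1 = root.find(f".//{{{XHTML_NS}}}h1")
    title = ET.SubElement(article, "title")
    title.text = "".join(h1.itertext()).strip() if h1 is not None else ""

    # Copy the text of every <p> into a <para>, dropping inline markup.
    for p in root.iter(f"{{{XHTML_NS}}}p"):
        para = ET.SubElement(article, "para")
        para.text = "".join(p.itertext()).strip()

    return ET.tostring(article, encoding="unicode")


# Made-up sample input, standing in for converted XHTML.
sample = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    "<h1>Quarterly Report</h1>"
    "<p>First paragraph.</p><p>Second <b>bold</b> one.</p>"
    "</body></html>"
)
result = xhtml_to_article(sample)
print(result)
```

The point is only that the mapping from source elements to target elements is 
something you spell out yourself. An XSD can then be used to validate the 
result, but it does not drive the transformation.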

If you install the default conversion option (even without the conversion 
license), you can still convert HTML documents (to XHTML and simplified DocBook 
formats). That will give you a pretty good idea of the output it creates for 
Word and PDF documents.

-Danny

From: [email protected] 
[mailto:[email protected]] On Behalf Of Santhosh Raj
Sent: Sunday, April 26, 2009 10:13 PM
To: General Mark Logic Developer Discussion
Subject: RE: [MarkLogic Dev General] How to convert pdf / doc files to xml when 
storying into marklogic - reg.,


Hi Danny,

Thanks for your reply; I understand your point. What I need to know is, if I 
am using the conversion license:

1) How and where can I specify an XSD that the generated (converted) XML file 
should follow?

2) Under what name will the XML file be stored?

3) Could you send me some sample files that you have converted (i.e., pdf/doc, 
XHTML, and XML files)?
        Original pdf/doc file
        Generated XHTML file
        Generated DocBook XML file

Thanks in advance.


Santhosh Rajasekaran


From: Danny Sokolsky <[email protected]>
Sent by: [email protected]
Sent: 04/24/2009 09:32 PM
Please respond to: General Mark Logic Developer Discussion <[email protected]>
To: General Mark Logic Developer Discussion <[email protected]>
Subject: RE: [MarkLogic Dev General] How to convert pdf / doc files to xml when 
storying into marklogic - reg.,

Hi Santhosh,

You need to have a conversion license to run the built-in XQuery functions for 
PDF or Office conversion; they are not included in the community license. Once 
you have the license, the conversion built-ins convert the binary files (PDF, 
Word, and so on) to XHTML, and then there is a CPF process to clean up the XML 
and produce DocBook. If you wanted to transform it at that point to some other 
structure, you could write some code to perform that transformation.

-Danny

From: [email protected] 
[mailto:[email protected]] On Behalf Of Santhosh Raj
Sent: Friday, April 24, 2009 5:03 AM
To: General Mark Logic Developer Discussion
Subject: [MarkLogic Dev General] How to convert pdf / doc files to xml when 
storying into marklogic - reg.,


Hi all,

       I have only the community version of MarkLogic Server. In the community 
version we can't convert pdf/doc files to XML format; they are stored as binary 
files.

1) If we want to convert a doc/pdf file to XML while storing it in MarkLogic, 
what should we do?

2) Can we specify an XSD (schema file) for the conversion? If not, and 
MarkLogic itself uses a schema, which schema does it use?

It would be very useful if you could give me a sample doc/pdf, the schema file, 
the converted XML file, and the steps to follow to convert a pdf/doc file to XML.

Thanks and Regards,
Santhosh Rajasekaran
Tata Consultancy Services
Mailto: [email protected]
Website: http://www.tcs.com
____________________________________________
Experience certainty.        IT Services
                      Business Solutions
                      Outsourcing
____________________________________________
=====-----=====-----=====
Notice: The information contained in this e-mail
message and/or attachments to it may contain
confidential or privileged information. If you are
not the intended recipient, any dissemination, use,
review, distribution, printing or copying of the
information contained in this e-mail message
and/or attachments to it are strictly prohibited. If
you have received this communication in error,
please notify us by reply e-mail or telephone and
immediately and permanently delete the message
and any attachments. Thank you

 _______________________________________________
General mailing list
[email protected]
http://xqzone.com/mailman/listinfo/general
