Hi Joydeep,

 

In MarkLogic Server 4.0, check out the 'Office OpenXML Extract' and
'WordprocessingML Process' pipelines in Content Processing.  

 

These are not enabled by default when you install Content Processing, so
you will have to attach them to your domain.  These two pipelines, along
with 'Status Change Handling', will then process Word 2007 documents
saved to the Server.

 

Office OpenXML Extract:  extracts the parts from a .docx package into a
directory named for the originating file.

WordprocessingML Process:  updates the document.xml extracted from each
.docx package by merging text split across runs (<w:r> elements), which
helps improve search results and cleans up the content for repurposing.

 

It's also easy to assemble Word documents on the server by using the
xdmp:zip* utilities.
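
Since a .docx file is just a zip package of XML parts, assembling one
comes down to zipping the right parts under the right names. The sketch
below shows that idea in plain Python with the standard zipfile module; it
is only an illustration and does not show the xdmp:zip* calls themselves.
The build_docx helper and the sample text are hypothetical, and the three
parts listed are what I understand to be a minimal package.

```python
import io
import zipfile

# Declares the content type of each part in the package.
CONTENT_TYPES = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="rels" '
    'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '<Override PartName="/word/document.xml" ContentType="application/'
    'vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>'
    "</Types>"
)

# Points the package at its main document part.
RELS = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/'
    'officeDocument/2006/relationships/officeDocument" Target="word/document.xml"/>'
    "</Relationships>"
)

# The document body itself: one paragraph, one run, one text node.
DOCUMENT = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">'
    "<w:body><w:p><w:r><w:t>Assembled from parts</w:t></w:r></w:p></w:body>"
    "</w:document>"
)


def build_docx(parts):
    """Zip a mapping of part name -> XML string into .docx bytes (hypothetical helper)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        for name, xml in parts.items():
            package.writestr(name, xml)
    return buffer.getvalue()


docx_bytes = build_docx(
    {
        "[Content_Types].xml": CONTENT_TYPES,
        "_rels/.rels": RELS,
        "word/document.xml": DOCUMENT,
    }
)

# Read the package back to confirm which parts it contains.
with zipfile.ZipFile(io.BytesIO(docx_bytes)) as package:
    part_names = package.namelist()
    body = package.read("word/document.xml").decode("utf-8")
print(part_names)
```

On the server, the same parts would be stored or generated as XML nodes
and zipped with the xdmp:zip* functions rather than with Python.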

 

Hope this helps,

Pete

 

From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of
Joydeep_Sinha
Sent: Wednesday, November 12, 2008 10:44 AM
To: [email protected]
Cc: Vivek_Nagasundara; Thangavelu_Senniyappan
Subject: [MarkLogic Dev General] Important: Resolution of an Issue
in Marklogic
Importance: High

 

Hi All,

 

I am from Satyam Computer Services Limited, and we generally build
solutions on top of MarkLogic. Currently we are using MarkLogic to
upload docx files (MS Office 2007 format), but we are unaware of
MarkLogic's capabilities for converting them to XHTML/XML components.
Please confirm how we can ingest docx files into MarkLogic and how the
latest version of MarkLogic supports the latest Office formats.

 

It would be great if you could provide us with the exact XQuery for
handling this, or tell us what changes are required to allow ingestion
of Office 2007 formats into MarkLogic and their retrieval from it.

 

A quick resolution would be greatly appreciated.

 

Thanks and Regards,

Joydeep Sinha

Onsite Co-ordinator  - IDMF PoC

Media and Entertainment - Solution Offerings

Satyam Computer Services Limited.

Mobile - (001)-6103020388

 

 


_______________________________________________
General mailing list
[email protected]
http://xqzone.com/mailman/listinfo/general