Pete,

Thanks, that's good to know, and it resolved my problem.

I attached the Office Open XML Extract pipeline to the Default
Documents domain, which now has the following pipelines attached:


Conversion Processing
Conversion Processing (Basic)
DocBook Conversion
HTML Conversion
MS Office Conversion
*Office OpenXML Extract*
PDF Conversion
PDF Conversion (Paged Text, No Rendering)
Status Change Handling

Weirdly, .doc and .xls now also work.
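
To illustrate the unzip step Pete describes below, here is a minimal
Python sketch. It is not MarkLogic code and not how the pipeline is
implemented; it only shows that a .docx is a zip package whose XML
members could be mapped to URIs under a *_parts directory. The function
names and the exact "_docx_parts/" naming scheme are assumptions made
for illustration.

```python
import io
import zipfile


def list_parts(package_bytes, uri):
    """Return hypothetical part URIs for each XML member of an OOXML package."""
    # Assumed naming scheme: dots in the document URI become underscores,
    # followed by "_parts/" and the member path inside the zip.
    base = uri.replace(".", "_") + "_parts/"
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        return [base + name for name in package.namelist()
                if name.endswith(".xml")]


def make_sample_package():
    """Build a tiny stand-in for a .docx so the sketch runs without a real file."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("[Content_Types].xml", "<Types/>")
        package.writestr("word/document.xml", "<document/>")
        package.writestr("word/media/image1.png", b"\x89PNG")
    return buffer.getvalue()


if __name__ == "__main__":
    for part in list_parts(make_sample_package(), "/AuthoringGuide.docx"):
        print(part)
```

With a real file, the bytes of the .docx would be passed in place of
make_sample_package(). Binary members such as images are skipped here,
since only the XML components are of interest.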

Thanks again,
Jakob.


PS: I'll be back in a separate mail with some more questions regarding
the Word add-in.




On Wed, Feb 22, 2012 at 15:46, Pete Aven <[email protected]> wrote:
> Conversion is currently for Office 2003 documents and earlier.
>
> With 2007/2010 we work with the XML directly.  The Office Open XML Extract 
> pipeline will unzip the .docx and .pptx, and create the *_parts directory 
> containing their XML components.
>
> Hope this helps,
> Pete
>
> -----Original Message-----
> From: [email protected] 
> [mailto:[email protected]] On Behalf Of Jakob Fix
> Sent: Wednesday, February 22, 2012 9:37 AM
> To: General Mark Logic Developer Discussion
> Subject: [MarkLogic Dev General] cpf pipeline question
>
> Hi,
>
> I'm experimenting with the conversion option in MarkLogic (v5.0).
> CPF is installed and enabled, and conversion is set to true.
> Import of docx and pptx is via WebDAV.
>
> However, conversion doesn't appear to take place.
> I set logging to "finest", so I see lots of "skipping" lines but no
> outright errors:
>
> 2012-02-22 15:31:17.416 Fine: TaskServer: Documents: on-any-property skipping /AuthoringGuide.docx
>
> Uploaded documents are visible via QC's "Explore", their type is "binary", 
> and the properties don't show any errors, e.g.:
> <prop:properties xmlns:prop="http://marklogic.com/xdmp/property">
>  <cpf:processing-status xmlns:cpf="http://marklogic.com/cpf">done</cpf:processing-status>
>  <cpf:property-hash xmlns:cpf="http://marklogic.com/cpf">d41d8cd98f00b204e9800998ecf8427e</cpf:property-hash>
>  <cpf:last-updated xmlns:cpf="http://marklogic.com/cpf">2012-02-22T15:23:04.949+01:00</cpf:last-updated>
>  <cpf:state xmlns:cpf="http://marklogic.com/cpf">http://marklogic.com/states/converted</cpf:state>
>  <cpf:self xmlns:cpf="http://marklogic.com/cpf">/AuthoringGuide.docx</cpf:self>
>  <prop:last-modified>2012-02-22T15:23:04+01:00</prop:last-modified>
> </prop:properties>
>
> So, no _toc.xml file or _parts directory is created with XML inside.
>
> Could somebody please tell me what else to check?
>
> Thanks,
> Jakob.
> _______________________________________________
> General mailing list
> [email protected]
> http://developer.marklogic.com/mailman/listinfo/general
