René wrote:
We are using the “XMLmind Word To XML” converter.

We would like to pass more attributes from Word on to the DITA files, starting with the style of each heading that was converted to a topic.

Although the XHTML file(s) produced by the initial conversion are not available for inspection for all output formats, if we choose e.g. “Multi-page styled (X)HTML” as the output, then Word styles end up as attributes such as class="p-Heading2", class="p-Heading3", etc.

That's right, but p-Heading2, p-Heading3, etc, are *stock* MS-Word styles and not *custom* MS-Word styles (that is, styles defined by DNV and not by Microsoft). See below.





Hence the information is available.

*1 w2x options*

We thought that “Use text file containing w2x options:” could be used to specify such a mapping. However, the dropdown showing the paragraph style mapping is empty, both initially and after selecting a Word file with the “+” button.

Is this a bug?

--> Some clarifications first:

1) The "+" button is used to load DOCX files in addition to the one specified in the "Input DOCX file" field. This is a way to tell w2x to load more *custom* styles, so that you can specify more mappings from MS-Word styles to DITA elements.

2) The "MS-Word style to XML element map" screen only lists *custom* styles, not *stock* styles like Heading2, Heading3, etc.

Why? Because stock styles like Heading2, Heading3, etc., are already recognized and processed accordingly by the various ".xed" files.

See "3.1.3.1. Dialog box allowing to add or modify an entry of the MS-Word style to XML element map", https://www.xmlmind.com/w2x/_distrib/doc/w2x_app_help/style_mapping_editor.html
---
...
If you want paragraphs or text runs found in the input DOCX file and having a given *custom* MS-Word style to be converted to a specific XML element, ...

...

It's possible to add to the combobox custom styles coming from other DOCX files by clicking ["+" icon].
---

So no bug there. Just not what you expected.



--> When you convert a DOCX file to DITA:

1) The DOCX file is converted to a styled XHTML file. This is the "Convert" step, https://www.xmlmind.com/w2x/_distrib/doc/manual/index.html#convert_step

2) This styled XHTML file is converted to a "semantic", non-styled, XHTML file by means of ".xed" scripts. This is the "Edit" step, https://www.xmlmind.com/w2x/_distrib/doc/manual/index.html#edit_step

3) All MS-Word styles, whether stock or custom, are stripped from the "semantic", non-styled, XHTML file. This is done by "xed/remove-styles.xed".

4) The "semantic", non-styled, XHTML file is converted to DITA by means of some XSLT stylesheets. This is the "Transform" step, https://www.xmlmind.com/w2x/_distrib/doc/manual/index.html#transform_step

If you do not want certain MS-Word style names to be removed (for example, in order to convert them to DITA @outputclass attributes), you may specify something like:

---
-p edit.remove-styles.preserved-classes "p-Heading2 p-Heading3"
---

See https://www.xmlmind.com/w2x/_distrib/doc/manual/webhelp/edit_step.html#__IDX144__

Of course, after doing this, in order to actually add outputclass attributes to DITA elements, you'll have to customize the XSLT stylesheet which converts the "semantic", non-styled, XHTML file to a DITA topic. This XSLT stylesheet is "xslt/topic.xslt".






*2 xed /xslt*

Is the way forward instead to edit C:\Program Files (x86)\XMLmind_Word_To_XML\xed\main.xed to refer to a modified headings.xed, and possibly to update the related topic.xslt?

Yes.

Replace "headings.xed" by "myheadings.xed"

---
-pu edit.do.headings "C:\Users\René\myheadings.xed"
---

See https://www.xmlmind.com/w2x/_distrib/doc/manual/webhelp/stock_xed_scripts.html

Replace "topic.xslt" by "mytopic.xslt"

---
-t "C:\Users\René\mytopic.xslt"
---

See https://www.xmlmind.com/w2x/_distrib/doc/manual/webhelp/w2x_command.html



You can put all the above options in a single ".txt" file and pass this ".txt" file to w2x-app by means of the "Use text file containing w2x options" button. See https://www.xmlmind.com/w2x/_distrib/doc/w2x_app_help/converting_docx_to_xml.html
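
As a sketch, the three options discussed in this message could be collected in a file such as "my_options.txt" (a hypothetical name; the ".xed" and ".xslt" paths are the illustrative ones used above and would have to point to your own customized copies):

---
-p edit.remove-styles.preserved-classes "p-Heading2 p-Heading3"
-pu edit.do.headings "C:\Users\René\myheadings.xed"
-t "C:\Users\René\mytopic.xslt"
---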

For example, when we convert our own "XMLmind Word To XML Manual" DOCX to a DITA bookmap, we use the following "map_options.txt":

---
-p edit.prune.preserve p-ProgramListing
-p edit.inlines.convert " c-Code code ! c-Abbrev abbr "
-p edit.blocks.convert "p-Term dt g:id='dl' g:container='dl' ! p-Definition dd g:id='dl' g:container='dl' ! p-ProgramListing span g:id='pre' g:container='pre'"
-pu edit.after.blocks customize/notes.xed
-p transform.root-topic-id manual
-p transform2.topic-path manual_map_topics
-p transform2.section-depth 6
-o map
-t customize/custom_topic.xslt
---

where "customize/notes.xed" is:

---
namespace "http://www.w3.org/1999/xhtml";
namespace html = "http://www.w3.org/1999/xhtml";
namespace g = "urn:x-mlmind:namespace:group";

for-each /html/body//p[get-class("p-Note")] {
    delete-text("note:\s*", "i");
    if content-type() <= 1 and not(@id) {
        delete();
    } else {
        remove-class("p-Note");
        set-attribute("g:id", "note_group_member");
        set-attribute("g:container", "div class='role-note'");
    }
}

group();
---

where "customize/custom_topic.xslt" is:

---
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:h="http://www.w3.org/1999/xhtml"
  exclude-result-prefixes="h">

<xsl:import href="w2x:xslt/topic.xslt"/>

<xsl:template match="h:div[@class = 'role-note']">
  <note>
    <xsl:call-template name="processCommonAttributes"/>
    <xsl:apply-templates/>
  </note>
</xsl:template>

<xsl:template match="h:code">
  <tt>
    <xsl:call-template name="processCommonAttributes"/>
    <xsl:apply-templates/>
  </tt>
</xsl:template>

</xsl:stylesheet>
---

--
XMLmind Word To XML Support List
w2x-support@xmlmind.com
http://www.xmlmind.com/mailman/listinfo/w2x-support
