René wrote:
> We are using the “XMLMind Word to XML”-converter.
> We would like to pass on more attributes from word to the dita-files, to
> start, with primarily which style was in the heading converted to topic.
> Though the XHTML file(s) in the initial conversion for all output
> formats is not available for inspection, if we choose e.g. “Multi-page
> styled (X)HTML” as output then Word styles end up as attributes
> class=”p-Heading2”, class=”p-Heading3” etc.
That's right, but p-Heading2, p-Heading3, etc, are *stock* MS-Word
styles and not *custom* MS-Word styles (that is, styles defined by DNV
and not by Microsoft). See below.
> Hence the information is available.
> *1 w2x options*
> We thought one option was to employ “Use text file containing w2x
> options:” could be used to specify such mapping. However the dropdown
> showing paragraph style mapping is empty initially and also after
> selecting a Word file with the “+”-button:
> Is this a bug?
--> Some clarifications first:
1) The "+" button is used to load DOCX files in addition to the one
specified in the "Input DOCX file" field. This is a way to tell w2x to
load more *custom* styles so that you can specify more
style-to-DITA-element mappings.
2) The "MS-Word style to XML element map" screen lists only *custom*
styles, not *stock* styles like Heading2, Heading3, etc.
Why? Because stock styles like Heading2, Heading3, etc. are already
recognized and processed accordingly by the various ".xed" files.
See "3.1.3.1. Dialog box allowing to add or modify an entry of the
MS-Word style to XML element map",
https://www.xmlmind.com/w2x/_distrib/doc/w2x_app_help/style_mapping_editor.html
---
...
If you want paragraphs or text runs found in the input DOCX file and
having a given *custom* MS-Word style to be converted to a specific XML
element, ...
...
It's possible to add to the combobox custom styles coming from other
DOCX files by clicking ["+" icon].
---
So no bug there. Just not what you expected.
--> When you convert a DOCX file to DITA:
1) The DOCX file is converted to a styled XHTML file. This is the
"Convert" step,
https://www.xmlmind.com/w2x/_distrib/doc/manual/index.html#convert_step
2) This styled XHTML file is converted to a "semantic", non-styled,
XHTML file by means of ".xed" scripts. This is the "Edit" step,
https://www.xmlmind.com/w2x/_distrib/doc/manual/index.html#edit_step
3) All MS-Word styles, whether stock or custom, are stripped from the
"semantic", non-styled, XHTML file. This is done by "xed/remove-styles.xed".
4) The semantic, non-styled, XHTML file is converted to DITA by
means of XSLT stylesheets. This is the "Transform" step,
https://www.xmlmind.com/w2x/_distrib/doc/manual/index.html#transform_step
If you do not want certain MS-Word style names to be removed (for example,
in order to convert them to DITA @outputclass attributes), you can specify,
for example:
---
-p edit.remove-styles.preserved-classes "p-Heading2 p-Heading3"
---
See
https://www.xmlmind.com/w2x/_distrib/doc/manual/webhelp/edit_step.html#__IDX144__
Of course, after doing this, in order to add the outputclass attributes
to some DITA elements, you'll have to customize the XSLT stylesheet
which converts the "semantic", non-styled, XHTML file to a
DITA topic. This XSLT stylesheet is "xslt/topic.xslt".
> *2 xed /xslt*
> Is the way forward instead to edit C:\Program Files
> (x86)\XMLmind_Word_To_XML\xed\main.xed to refer to a modified
> headings.xed and possibly update the related topic.xslt ?
Yes.
Replace "headings.xed" with "myheadings.xed":
---
-pu edit.do.headings "C:\Users\René\myheadings.xed"
---
See
https://www.xmlmind.com/w2x/_distrib/doc/manual/webhelp/stock_xed_scripts.html
Replace "topic.xslt" with "mytopic.xslt":
---
-t "C:\Users\René\mytopic.xslt"
---
See https://www.xmlmind.com/w2x/_distrib/doc/manual/webhelp/w2x_command.html
You can add all the above options to a single ".txt" file and pass this
".txt" file to w2x-app by means of the "Use text file containing w2x
options" button. See
https://www.xmlmind.com/w2x/_distrib/doc/w2x_app_help/converting_docx_to_xml.html
For example, when we convert our own "XMLmind Word To XML Manual" DOCX
to a DITA bookmap, we use the following "map_options.txt":
---
-p edit.prune.preserve p-ProgramListing
-p edit.inlines.convert " c-Code code ! c-Abbrev abbr "
-p edit.blocks.convert "p-Term dt g:id='dl' g:container='dl' !
p-Definition dd g:id='dl' g:container='dl' ! p-ProgramListing span
g:id='pre' g:container='pre'"
-pu edit.after.blocks customize/notes.xed
-p transform.root-topic-id manual
-p transform2.topic-path manual_map_topics
-p transform2.section-depth 6
-o map
-t customize/custom_topic.xslt
---
where "customize/notes.xed" is:
---
namespace "http://www.w3.org/1999/xhtml";
namespace html = "http://www.w3.org/1999/xhtml";
namespace g = "urn:x-mlmind:namespace:group";

for-each /html/body//p[get-class("p-Note")] {
    delete-text("note:\s*", "i");
    if content-type() <= 1 and not(@id) {
        delete();
    } else {
        remove-class("p-Note");
        set-attribute("g:id", "note_group_member");
        set-attribute("g:container", "div class='role-note'");
    }
}

group();
where "customize/custom_topic.xslt" is:
---
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:h="http://www.w3.org/1999/xhtml"
                exclude-result-prefixes="h">

  <xsl:import href="w2x:xslt/topic.xslt"/>

  <xsl:template match="h:div[@class = 'role-note']">
    <note>
      <xsl:call-template name="processCommonAttributes"/>
      <xsl:apply-templates/>
    </note>
  </xsl:template>

  <xsl:template match="h:code">
    <tt>
      <xsl:call-template name="processCommonAttributes"/>
      <xsl:apply-templates/>
    </tt>
  </xsl:template>

</xsl:stylesheet>
---
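Coming back to your heading styles: as a minimal sketch, the three
options suggested earlier in this message could be collected in a single
options file, here hypothetically named "heading_options.txt". The file
name is only a placeholder, and "myheadings.xed" and "mytopic.xslt" are
the example paths used above; both custom files still have to be written
before this can work.
---
-p edit.remove-styles.preserved-classes "p-Heading2 p-Heading3"
-pu edit.do.headings "C:\Users\René\myheadings.xed"
-t "C:\Users\René\mytopic.xslt"
---
Such a file would then be passed to w2x-app using "Use text file
containing w2x options".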
--
XMLmind Word To XML Support List
w2x-support@xmlmind.com
http://www.xmlmind.com/mailman/listinfo/w2x-support