Re: [NTG-context] question for the xml-experts

2009-02-20 Thread Thomas A. Schmitz


On Feb 19, 2009, at 3:10 PM, luigi scarso wrote:


see
http://codespeak.net/lxml/tutorial.html#namespaces


Luigi,

thanks so much for your patient replies. I have now begun to play with
Python's lxml. It offers a lot, maybe too much for a beginner. One
advantage I see for my immediate needs is that it lets me use
Python's regular expressions and control structures, which may make the
code easier to maintain and adapt than the rather clumsy xslt syntax;
it may be a big help for the rather messy OpenOffice xml that I want
to process.


I had already tried w2latex a while ago. I found it very limited and  
lacking documentation, so I haven't pursued this track.


Again, thanks for getting me started!

Thomas
___
If your question is of interest to others as well, please add an entry to the 
Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://tex.aanhet.net
archive  : https://foundry.supelec.fr/projects/contextrev/
wiki : http://contextgarden.net
___


Re: [NTG-context] question for the xml-experts

2009-02-20 Thread luigi scarso
On Fri, Feb 20, 2009 at 4:09 PM, Thomas A. Schmitz
thomas.schm...@uni-bonn.de wrote:

 On Feb 19, 2009, at 3:10 PM, luigi scarso wrote:

 see
 http://codespeak.net/lxml/tutorial.html#namespaces

 Luigi,

 thanks so much for your patient replies. I have now begun to play with
 python's lxml. It offers a lot, maybe too much for a beginner. One advantage
 for my immediate needs that I see is that it offers the possibility to use
 Python's regular expressions and control structures, so this may make coding
 easier to maintain and adapt that in the rather clumsy xslt syntax; it may
 be a big help for the rather messy OpenOffice xml that I want to process.

also


Python 2.5.2 (r252:60911, Jul 31 2008, 17:28:52)
[GCC 4.2.3 (Ubuntu 4.2.3-2ubuntu7)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> URI_OFFICE = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
>>> URI_STYLE = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"
>>> URI_TEXT = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
>>> URI_TABLE = "urn:oasis:names:tc:opendocument:xmlns:table:1.0"
>>> URI_DRAW = "urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"
>>> URI_FO = "urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0"
>>> URI_XLINK = "http://www.w3.org/1999/xlink"
>>> URI_DC = "http://purl.org/dc/elements/1.1/"
>>> URI_META = "urn:oasis:names:tc:opendocument:xmlns:meta:1.0"
>>> URI_NUMBER = "urn:oasis:names:tc:opendocument:xmlns:datastyle:1.0"
>>> URI_PRESENTATION = "urn:oasis:names:tc:opendocument:xmlns:presentation:1.0"
>>> URI_SVG = "urn:oasis:names:tc:opendocument:xmlns:svg-compatible:1.0"
>>> URI_CHART = "urn:oasis:names:tc:opendocument:xmlns:chart:1.0"
>>> URI_DR3D = "urn:oasis:names:tc:opendocument:xmlns:dr3d:1.0"
>>> URI_MATH = "http://www.w3.org/1998/Math/MathML"
>>> URI_FORM = "urn:oasis:names:tc:opendocument:xmlns:form:1.0"
>>> URI_SCRIPT = "urn:oasis:names:tc:opendocument:xmlns:script:1.0"
>>> URI_OOO = "http://openoffice.org/2004/office"
>>> URI_OOOW = "http://openoffice.org/2004/writer"
>>> URI_OOOC = "http://openoffice.org/2004/calc"
>>> URI_DOM = "http://www.w3.org/2001/xml-events"
>>> URI_XFORMS = "http://www.w3.org/2002/xforms"
>>> URI_XSD = "http://www.w3.org/2001/XMLSchema"
>>> URI_XSI = "http://www.w3.org/2001/XMLSchema-instance"
>>> URI_FIELD = "urn:openoffice:names:experimental:ooxml-odf-interop:xmlns:field:1.0"
>>>
>>> NSMAP_OO = {
...     "office":       URI_OFFICE,
...     "style":        URI_STYLE,
...     "text":         URI_TEXT,
...     "table":        URI_TABLE,
...     "draw":         URI_DRAW,
...     "fo":           URI_FO,
...     "xlink":        URI_XLINK,
...     "dc":           URI_DC,
...     "meta":         URI_META,
...     "number":       URI_NUMBER,
...     "presentation": URI_PRESENTATION,
...     "svg":          URI_SVG,
...     "chart":        URI_CHART,
...     "dr3d":         URI_DR3D,
...     "math":         URI_MATH,
...     "form":         URI_FORM,
...     "script":       URI_SCRIPT,
...     "ooo":          URI_OOO,
...     "ooow":         URI_OOOW,
...     "oooc":         URI_OOOC,
...     "dom":          URI_DOM,
...     "xforms":       URI_XFORMS,
...     "xsd":          URI_XSD,
...     "xsi":          URI_XSI,
...     "field":        URI_FIELD,
... }
>>>
>>> from lxml import etree
>>> tree = etree.parse(file('t.xml'))
>>> foo = tree.getroot()
>>> [child.tag for child in foo.iterdescendants(tag='{%s}span' % URI_TEXT)]
['{urn:oasis:names:tc:opendocument:xmlns:text:1.0}span']
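[Editorial sketch] The session above builds the tag name by hand with '{%s}span' % URI_TEXT. A small helper can make that less error-prone. The helper name qname and the inline sample document are assumptions for illustration, not part of the original mail. The sketch uses the standard library's xml.etree.ElementTree so that it runs without extra packages; to my knowledge lxml.etree accepts the same calls used here.

```python
import xml.etree.ElementTree as etree

URI_TEXT = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def qname(uri, local):
    """Build the '{uri}local' form that ElementTree-style APIs use for names."""
    return "{%s}%s" % (uri, local)


# Hypothetical stand-in for t.xml from the thread.
SAMPLE = (
    '<foo xmlns:text="%s">'
    '<text:span text:style-name="T3">foo</text:span>'
    '</foo>' % URI_TEXT
)

root = etree.fromstring(SAMPLE)

# Tags are matched in their expanded form, not by the prefix in the file.
spans = list(root.iter(qname(URI_TEXT, "span")))

# Namespaced attributes are looked up the same way as tags.
style_names = [s.get(qname(URI_TEXT, "style-name")) for s in spans]
print(style_names)
```

The point of the helper is only that the prefix "text:" never appears in the Python code; the URI is what identifies the element.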





Have a look at
http://opendocumentfellowship.com/projects/odfpy
too.




-- 
luigi


Re: [NTG-context] question for the xml-experts

2009-02-19 Thread luigi scarso
 Yes, you're right of course.
 I have a similar situation here: the xml
 produced by ooo is too messy, so I want to preprocess it to something that
 is easier to maintain and modify (e.g., I will, at some point, add index
 entries and a TOC); that's why I use xslt here. But I still produce xml
 which I process with mkiv.
So you have
xml --( xslt )--> xml --( mkiv )--> pdf
where the first xml is normative, while the second is not.

In your situation I would prefer
xml --( xslt )--> tex --( mkiv )--> pdf
because there is not much difference between the stylesheets for
xml --( xslt )--> xml
and
xml --( xslt )--> tex
and there is a clear separation of roles: xml carries the semantics,
tex the presentation.


The chain
xml --( xslt )--> xml --( mkiv )--> pdf
can be reasonable
if the first xml comes out of a db extraction
(you must be quick and make the correct queries, so this xml is
typically in row-major fashion, i.e. like a table),
and the second xml is book-oriented and simple.



BTW, always choose whatever is right for your needs.

 Just to give me an idea: how would you
 transform this:

 <text:span text:style-name="T3">foo</text:span>

 to this

 <emph>foo</emph>

 with lxml? lxml seems to object to the : in the tag, even though it's
 declared in the document.
I will have a look at it.

-- 
luigi


Re: [NTG-context] question for the xml-experts

2009-02-19 Thread luigi scarso
On Thu, Feb 19, 2009 at 9:54 AM, Thomas A. Schmitz
thomas.schm...@uni-bonn.de wrote:

 On Feb 17, 2009, at 11:07 PM, luigi scarso wrote:

 (sorry x my laziness)
 If I have a good xml , then mkiv is a good choice. As far I know, mkiv
 ~ xslt by lpeg, so
 traditional
 xml--( xslt )-->tex--( mkiv )-->pdf
 is  like
 xml--( mkiv )-->pdf
 Note that in the last chain one mixes xml+tex: if xml become complex,
 this can end in a messy situation.


 Yes, you're right of course. I have a similar situation here: the xml
 produced by ooo is too messy, so I want to preprocess it to something that
 is easier to maintain and modify (e.g., I will, at some point, add index
 entries and a TOC); that's why I use xslt here. But I still produce xml
 which I process with mkiv.

 But some  documents  need heavy preprocessing:
 for example, I have one that come from  java classes serialization,
 and I need the power of python (lxml) to do a clean work .
 Also, if xml changes , I 've found that lxml is more flexible than xslt.
 In this case I have
 xml--( lxml )-->tex--( mkiv )-->pdf

 The fact is that python and lua are not so differents,
 so I've to manage two languages
 (python+lua) and tex;
 with 'traditional' workflow you have to manage 3 languages
 xslt,lua and tex
 and subdivide responsability is not so easy as the former .

 Interesting. I have tried to play around with python-lxml, but am having
 some problems to understand it. Just to give me an idea: how would you
 transform this:

 <text:span text:style-name="T3">foo</text:span>

 to this

 <emph>foo</emph>

 with lxml? lxml seems to object to the : in the tag, even though it's
 declared in the document.

 Thomas

t.xml:
<foo xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0">
<text:span text:style-name="T3">foo</text:span>
</foo>


# python
Python 2.5.2 (r252:60911, Jul 31 2008, 17:28:52)
[GCC 4.2.3 (Ubuntu 4.2.3-2ubuntu7)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from lxml import etree
>>> tree = etree.parse(file('t.xml'))
>>> foo = tree.getroot()
>>> foo.tag
'foo'
>>>
>>> [child.tag for child in foo.iterdescendants()]
['{urn:oasis:names:tc:opendocument:xmlns:text:1.0}span']
>>> print foo.iterdescendants.__doc__
iterdescendants(self, tag=None)

Iterate over the descendants of this element in document order.

As opposed to ``el.iter()``, this iterator does not yield the element
itself.  The generated elements can be restricted to a specific tag
name with the 'tag' keyword.

>>>
>>> FOO = etree.Element("FOO")
>>> emph = etree.Element("emph")
>>> [child.tag for child in foo.iterdescendants(tag='{urn:oasis:names:tc:opendocument:xmlns:text:1.0}span')]
['{urn:oasis:names:tc:opendocument:xmlns:text:1.0}span']
>>> span = [child for child in foo.iterdescendants(tag='{urn:oasis:names:tc:opendocument:xmlns:text:1.0}span')][0]
>>> emph.text = span.text
>>> FOO.append(emph)
>>> etree.tostring(FOO)
'<FOO><emph>foo</emph></FOO>'
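[Editorial sketch] The session above converts a single span by hand. The same idea can be wrapped in a function that handles every text:span and only turns selected styles into emph. The function name simplify and the set EMPH_STYLES are hypothetical, and treating T3 as the emphasis style is an assumption taken from the example in the thread. The sketch uses the standard library's xml.etree.ElementTree; to my knowledge lxml.etree accepts the same calls.

```python
import xml.etree.ElementTree as etree

URI_TEXT = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
SPAN = "{%s}span" % URI_TEXT
STYLE_NAME = "{%s}style-name" % URI_TEXT

# Assumption: these automatic style names stand for emphasis in the source.
EMPH_STYLES = {"T3"}


def simplify(root):
    """Collect every emphasised text:span as a plain <emph> under a new <FOO>."""
    out = etree.Element("FOO")
    for span in root.iter(SPAN):
        if span.get(STYLE_NAME) in EMPH_STYLES:
            emph = etree.SubElement(out, "emph")
            emph.text = span.text
    return out


# Hypothetical input: one emphasised span, one span with another style.
SAMPLE = (
    '<foo xmlns:text="%s">'
    '<text:span text:style-name="T3">foo</text:span>'
    '<text:span text:style-name="T9">bar</text:span>'
    '</foo>' % URI_TEXT
)

result = etree.tostring(simplify(etree.fromstring(SAMPLE)), encoding="unicode")
print(result)  # <FOO><emph>foo</emph></FOO>
```

This only copies the span text; a real converter would also have to carry over the surrounding text and tails, which is left out here to keep the sketch short.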



http://codespeak.net/lxml/tutorial.html
http://codespeak.net/lxml/api.html


-- 
luigi


Re: [NTG-context] question for the xml-experts

2009-02-19 Thread Thomas A. Schmitz


On Feb 19, 2009, at 11:39 AM, luigi scarso wrote:




FOO = etree.Element("FOO")
emph = etree.Element("emph")
[child.tag for child in foo.iterdescendants(tag='{urn:oasis:names:tc:opendocument:xmlns:text:1.0}span')]

['{urn:oasis:names:tc:opendocument:xmlns:text:1.0}span']
span = [child for child in foo.iterdescendants(tag='{urn:oasis:names:tc:opendocument:xmlns:text:1.0}span')][0]

emph.text = span.text
FOO.append(emph)
etree.tostring(FOO)

'<FOO><emph>foo</emph></FOO>'






Excuse me for being dense: you mean all namespaces have to be  
explicitly expanded?


Thomas
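[Editorial sketch] On the question of whether namespaces must be written out in full: in ElementTree-style APIs the tag is stored in expanded '{uri}name' form, but searches can take a prefix-to-URI mapping so the query itself stays short. The mapping name NS and the sample document are assumptions for illustration. The sketch uses the standard library's xml.etree.ElementTree; to my knowledge lxml.etree's findall takes the same namespaces argument.

```python
import xml.etree.ElementTree as etree

# Prefix map used only for lookups; the prefix here need not match the file's.
NS = {"text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0"}

# Hypothetical stand-in for t.xml from the thread.
SAMPLE = (
    '<foo xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0">'
    '<text:span text:style-name="T3">foo</text:span>'
    '</foo>'
)

root = etree.fromstring(SAMPLE)

# The path uses the short prefix; the library expands it via NS.
spans = root.findall(".//text:span", NS)
texts = [s.text for s in spans]
print(texts)

# The stored tag is still the expanded form.
print(spans[0].tag)
```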


Re: [NTG-context] question for the xml-experts

2009-02-19 Thread luigi scarso
 Yes, you're right of course. I have a similar situation here: the xml
 produced by ooo is too messy, so I want to preprocess it to something that
 is easier to maintain and modify (e.g., I will, at some point, add index
 entries and a TOC); that's why I use xslt here. But I still produce xml
 which I process with mkiv.
also this
http://www.hj-gym.dk/~hj/writer2latex/


-- 
luigi


Re: [NTG-context] question for the xml-experts

2009-02-17 Thread luigi scarso
On Sun, Feb 15, 2009 at 6:17 PM, Thomas A. Schmitz
thomas.schm...@uni-bonn.de wrote:
 Luigi and Khaled,

 thanks a lot for your replies! Luigi: I had a look at python lxml; it looks
 very powerful and interesting, and I will try and see if can make use of it.
 Why do you translate your xml sources into tex instead of using the mkiv
 mechanism for processing xml, is it because of speed?
(sorry for my laziness)
If I have good xml, then mkiv is a good choice. As far as I know, mkiv
is roughly xslt implemented with lpeg, so the traditional
xml--( xslt )-->tex--( mkiv )-->pdf
is like
xml--( mkiv )-->pdf
Note that in the last chain one mixes xml and tex: if the xml becomes complex,
this can end in a messy situation.




But some documents need heavy preprocessing:
for example, I have one that comes from the serialization of java classes,
and I need the power of python (lxml) to do a clean job.
Also, when the xml changes, I have found lxml to be more flexible than xslt.
In this case I have
xml--( lxml )-->tex--( mkiv )-->pdf

The fact is that python and lua are not so different,
so I effectively have to manage two languages
(python+lua) plus tex;
with the 'traditional' workflow you have to manage three languages,
xslt, lua and tex,
and dividing the responsibilities is not as easy as in the former case.
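[Editorial sketch] To make the xml --( python )--> tex step concrete, here is a minimal recursive converter from a small, already simplified xml tree to ConTeXt source. The element names doc, p and emph, the function name to_tex and the sample input are all hypothetical, and the sketch does no escaping of TeX special characters. It uses the standard library's xml.etree.ElementTree; to my knowledge the same code works with lxml.etree.

```python
import xml.etree.ElementTree as etree


def to_tex(elem):
    """Recursively turn a small, simplified XML tree into ConTeXt source."""
    # An element's content is its own text, then each child followed by the
    # text that trails that child (its "tail").
    inner = (elem.text or "") + "".join(
        to_tex(child) + (child.tail or "") for child in elem
    )
    if elem.tag == "emph":
        return "{\\em " + inner + "}"
    if elem.tag == "p":
        # A blank line ends a paragraph in TeX.
        return inner.strip() + "\n\n"
    # Unknown elements just pass their content through.
    return inner


# Hypothetical simplified input.
SAMPLE = "<doc><p>Some <emph>foo</emph> text.</p><p>Second paragraph.</p></doc>"

tex = to_tex(etree.fromstring(SAMPLE))
print(tex)
```

Adding a new rule means adding one more branch on elem.tag, which is roughly the flexibility argument made above for a general-purpose language over xslt.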

BTW, I have no tests that say one is quicker than the other.

-- 
luigi


Re: [NTG-context] question for the xml-experts

2009-02-15 Thread luigi scarso
If you know python
http://wiki.services.openoffice.org/wiki/PyUNO_bridge
http://opendocumentfellowship.com/projects/odfpy
For xml the choice is
http://codespeak.net/lxml/

A native xml db, with XQuery and python binding
http://www.oracle.com/technology/products/berkeley-db/xml/index.html



And this is my experience:
I program in TeX (with context), lua / python (they are
similar) and xslt.
For every project, if I can, I use lxml to manage the xml sources, because
it includes xslt but not vice versa.
The goal is to translate xml into tex in the quickest way, and let mkiv
do the hard work.
I don't have a good feeling about xslt, because it is not as powerful as
lxml, and it is clearly no competitor to TeX.

If I need storage, dbxml is good, and XQuery+lxml is powerful enough.

OO also has a docbook exporter
http://www.docbook.org/
docbook is rich and comes with a good collection of xsl stylesheets to
translate xml to html,
but maybe it is ... too much.

-- 
luigi


Re: [NTG-context] question for the xml-experts

2009-02-15 Thread Khaled Hosny
You may consider giving dbcontext a look, it is written in python and
seems to use xsl to translate DocBook's xml into TeX files to be typeset
by ConTeXt. http://dblatex.sourceforge.net/doc/pt02.html

Regards,
 Khaled

On Sat, Feb 14, 2009 at 06:40:51PM +0100, Thomas A. Schmitz wrote:
 Hi all,

 this is not a question about direct technical details, but more of a  
 conceptual problem, and I would love to have your input and ideas on  
 this. I will be editing several edited volumes in my field (humanities, 
 classics). From experience, I know that it's impossible to make scholars 
 in the humanities adhere to standards. Each and every one of them will 
 turn in a paper (most of them written in half a dozen different versions 
 of Word) with its own idiosyncracies. At my last conference, I asked them 
 to please use Unicode for their Greek passages, and I got blank looks and 
 the question What the hell is Unicode?

 So: I want to extract the content of these papers and process it with  
 ConTeXt. I thought the easiest route might be convert them to OpenOffice 
 odt and then use the content.xml as a starting point. Since the 
 formatting will be unusable anyways, it doesn't make sense to process the 
 odt directly; instead, I want to transform the xml via xslt to a 
 simplified format and then process that with ConTeXt. I have just 
 discovered the tool xalan ( http://xml.apache.org/xalan-c/index.html ) 
 which allows me to use an xslt style sheet and direct the output to a new 
 file. I will then need to clean up these xml files and write a mkiv xml 
 setup for them.

 So for those who know much more about this sort of workflow: does that  
 make sense? Is there any better way to achieve these results, i.e., have 
 the content of a couple of papers in Word and/or rtf format and typeset 
 it in a consistent ConTeXt environment? Is there any tool better than 
 xslt to convert the OpenOffice xml than xslt (anything in lua that can 
 parse xml)? Anything better than xalan to convert xm - xml? I'm just 
 beginning to plan this, so I'd be most grateful for any pointers.

 Thanks for reading this long message, all best

 Thomas

-- 
 Khaled Hosny
 Arabic localizer and member of Arabeyes.org team




Re: [NTG-context] question for the xml-experts

2009-02-15 Thread Thomas A. Schmitz

Luigi and Khaled,

thanks a lot for your replies! Luigi: I had a look at python lxml; it
looks very powerful and interesting, and I will try and see if I can
make use of it. Why do you translate your xml sources into tex instead
of using the mkiv mechanism for processing xml? Is it because of speed?


Khaled: I have to see if I can tweak the OpenOffice docbook converter  
to keep more of the formatting; in its default state, it drops too  
much important stuff...


Right now, I have followed Patrick's advice. I've installed saxon9 and  
am writing a xslt stylesheet to translate the openoffice xml into a  
cleaner and easier to handle format. I'm making progress... Maybe we  
should put something like this on the wiki and make it a collaborative  
effort - I can only write rules for stuff that occurs in my documents,  
and that is of course only a subset of what OpenOffice has, so it  
would be good to add rules as people find interesting features.
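[Editorial sketch] One rule such a cleanup usually needs is deciding which automatic style names (such as T3) actually mean italics. In content.xml these are, as far as I know, declared as style:style elements whose style:text-properties child carries fo:font-style="italic". The function name italic_style_names and the sample document are hypothetical, and real files may declare italics in other ways (for instance through parent styles), which this sketch ignores. It uses the standard library's xml.etree.ElementTree; to my knowledge lxml.etree accepts the same calls.

```python
import xml.etree.ElementTree as etree

NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "style": "urn:oasis:names:tc:opendocument:xmlns:style:1.0",
    "fo": "urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0",
}


def italic_style_names(root):
    """Return the names of styles that directly set fo:font-style to italic."""
    names = set()
    for style in root.findall(".//style:style", NS):
        props = style.find("style:text-properties", NS)
        if props is None:
            continue
        if props.get("{%s}font-style" % NS["fo"]) == "italic":
            names.add(style.get("{%s}name" % NS["style"]))
    return names


# Hypothetical, heavily trimmed content.xml with one italic and one bold style.
SAMPLE = (
    '<office:document-content xmlns:office="%(office)s" '
    'xmlns:style="%(style)s" xmlns:fo="%(fo)s">'
    '<office:automatic-styles>'
    '<style:style style:name="T3" style:family="text">'
    '<style:text-properties fo:font-style="italic"/>'
    '</style:style>'
    '<style:style style:name="T4" style:family="text">'
    '<style:text-properties fo:font-weight="bold"/>'
    '</style:style>'
    '</office:automatic-styles>'
    '</office:document-content>' % NS
)

italic = italic_style_names(etree.fromstring(SAMPLE))
print(italic)
```

The resulting set could then drive which text:span elements become emphasis in the simplified format.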


All best

Thomas





[NTG-context] question for the xml-experts

2009-02-14 Thread Thomas A. Schmitz

Hi all,

this is not a question about direct technical details, but more of a  
conceptual problem, and I would love to have your input and ideas on  
this. I will be editing several edited volumes in my field  
(humanities, classics). From experience, I know that it's impossible  
to make scholars in the humanities adhere to standards. Each and every  
one of them will turn in a paper (most of them written in half a dozen  
different versions of Word) with its own idiosyncrasies. At my last
conference, I asked them to please use Unicode for their Greek
passages, and I got blank looks and the question "What the hell is
Unicode?"


So: I want to extract the content of these papers and process it with
ConTeXt. I thought the easiest route might be to convert them to
OpenOffice odt and then use the content.xml as a starting point. Since
the formatting will be unusable anyway, it doesn't make sense to
process the odt directly; instead, I want to transform the xml via
xslt to a simplified format and then process that with ConTeXt. I have
just discovered the tool xalan ( http://xml.apache.org/xalan-c/index.html )
which allows me to use an xslt style sheet and direct the output
to a new file. I will then need to clean up these xml files and write
a mkiv xml setup for them.
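[Editorial sketch] An .odt file is a zip archive, and content.xml is one of its members, so the starting point described above can be reached without unpacking by hand. The function name read_content_xml is hypothetical, and the in-memory archive below is only a stand-in for a real document. The sketch uses only the Python standard library.

```python
import io
import zipfile
import xml.etree.ElementTree as etree


def read_content_xml(odt):
    """Return the parsed root of content.xml from an .odt path or file object."""
    with zipfile.ZipFile(odt) as archive:
        with archive.open("content.xml") as member:
            return etree.parse(member).getroot()


# Build a tiny stand-in archive in memory instead of using a real .odt file.
URI_OFFICE = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr(
        "content.xml",
        '<office:document-content xmlns:office="%s"/>' % URI_OFFICE,
    )
buffer.seek(0)

root = read_content_xml(buffer)
print(root.tag)
```

With a real file, read_content_xml("paper.odt") would be the call, where paper.odt is a placeholder name.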


So for those who know much more about this sort of workflow: does that
make sense? Is there any better way to achieve these results, i.e.,
have the content of a couple of papers in Word and/or rtf format and
typeset it in a consistent ConTeXt environment? Is there any tool
better than xslt to convert the OpenOffice xml (anything in
lua that can parse xml)? Anything better than xalan to convert xml ->
xml? I'm just beginning to plan this, so I'd be most grateful for any
pointers.


Thanks for reading this long message, all best

Thomas


Re: [NTG-context] question for the xml-experts

2009-02-14 Thread Wolfgang Schuster

Hi Thomas,

why don't you take a look at the OpenOffice export function? I saw it's
possible to convert a document to xhtml, and this could be a start for
you.


Wolfgang




Re: [NTG-context] question for the xml-experts

2009-02-14 Thread Patrick Gundlach
Hi Thomas,

 process the odt directly; instead, I want to transform the xml via
 xslt to a simplified format and then process that with ConTeXt. I have
 just discovered the tool xalan (
 http://xml.apache.org/xalan-c/index.html ) which allows me to use an
 xslt style sheet and direct the output  to a new file. I will then
 need to clean up these xml files and write  a mkiv xml setup for them.

 So for those who know much more about this sort of workflow: does that
 make sense?

Yes, it does. At my company we clean up (and reorganize) XML data with
XSLT all the time. We are happy users of saxon 9
(http://saxon.sourceforge.net/) which is an xslt 2.0 engine. Learning
XSLT is not trivial (but not too hard either), but once you get an
understanding of it nobody can stop you using XSLT for 'everything'.


Patrick
-- 
ConTeXt wiki and more: http://contextgarden.net


Re: [NTG-context] question for the xml-experts

2009-02-14 Thread Thomas A. Schmitz


On Feb 14, 2009, at 7:25 PM, Wolfgang Schuster wrote:


Hi Thomas,

why don't you take a look at the OpenOffice export function, I saw  
it's
possible to convert a document to xhtml and this could be a start  
for you.


Wolfgang


Hi Wolfgang,

thanks for the suggestion! I had, in fact, tried the export functions
(docbook and xhtml), but both drop too much formatting: all italics
etc. are silently dropped, and dynamic references are replaced with
their values. So unless I can manage to hack the export xslt files,
this doesn't seem possible.


All best

Thomas


Re: [NTG-context] question for the xml-experts

2009-02-14 Thread Thomas A. Schmitz


On Feb 14, 2009, at 7:31 PM, Patrick Gundlach wrote:


Yes, it does. At my company we clean up (and reorganize) XML data with
XSLT all the time. We are happy users of saxon 9
(http://saxon.sourceforge.net/) which is an xslt 2.0 engine. Learning
XSLT is not trivial (but not too hard either), but once you get an
understanding of it nobody can stop you using XSLT for 'everything'.


Patrick


Great, I will look into saxon and xslt!

Best

Thomas