Re: [dev-biblio] xslt, citeproc-writer

2006-06-27 Thread pt

I disagree, Bruce, about not storing rendered content in the XML. I think it
needs to be stored in a rendered form.

If you don't, it will be very hard to write things like OpenDocument-to-HTML
transforms in arbitrary languages, because you would need to run citeproc to
format citations and bibliographies.

Related to this is an interoperability question. It is important not to
focus only on interop with OOo2; after all, with a free software package it
is easy to get users to upgrade. Consider the interop problems with MS Word.

Have you considered an approach where citations are stored as rendered text
(or footnotes/endnotes) in place, with a link of some kind back to the
bibliographic database, and with the citation details stored as an item in
the database? That is, you would have (1) a Work, with (2) a particular
expression (is that what you call it?), with (3) a citation by page or line
number or whatever. That is three items in the database, and only one simple
link in the document text. Seems to me that this would fit very well with
your RDF approach, Bruce. And an approach like this might mean that you could
build a solution that could work with OpenXML docs as well.





On 6/28/06, Bruce D'Arcus [EMAIL PROTECTED] wrote:


bib project questions ...

Re: David Wilson's idea that citeproc give pre-rendered citation and
bibliography chunks (first/subsequent, etc.) and save it in the XML,
described here:

http://wiki.services.openoffice.org/wiki/Citeproc_Writer_Interaction

I've thought about this some, and agree with the first part, but not
that the rendered content should be saved in the XML. I think perhaps
we can modify the bibliographic class to store that pre-formatted
content (or create a new ReferenceList class?), so that it's just
stored in memory, rather than saved?





I've updated the wiki to reflect this.


So process is something like:

Citation passes list of ids to ReferenceList
ReferenceList requests formatted citation and bib chunks from citeproc
Citation requests formatted citation from ReferenceList
Bibliography = ReferenceList to ODF

I think this is how MS is doing it in Word 2007.
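
The four steps listed above could be sketched roughly as follows. This is
only an illustration in Python: the class and method names are hypothetical,
`fake_citeproc` is a stand-in for the real citeproc (which is an XSLT 2.0
stylesheet, not a Python function), and `Veer1996a` is just an example id.

```python
def fake_citeproc(ids):
    """Hypothetical stand-in for citeproc: one citation and one bib chunk per id."""
    return {
        ref_id: {
            "citation": f"({ref_id})",
            "bibliography": f"{ref_id}. Full entry.",
        }
        for ref_id in ids
    }


class ReferenceList:
    """Keeps pre-formatted chunks in memory; nothing is written to the XML."""

    def __init__(self, formatter=fake_citeproc):
        self.formatter = formatter
        self.chunks = {}  # id -> {"citation": ..., "bibliography": ...}

    def register(self, ids):
        # Steps 1-2: a citation passes its ids; only unseen ids go to the formatter.
        missing = [i for i in ids if i not in self.chunks]
        if missing:
            self.chunks.update(self.formatter(missing))

    def formatted_citation(self, ref_id):
        # Step 3: a citation requests its formatted text.
        return self.chunks[ref_id]["citation"]

    def bibliography(self):
        # Step 4: the bibliography is the reference list, in insertion order.
        return [chunk["bibliography"] for chunk in self.chunks.values()]


refs = ReferenceList()
refs.register(["Veer1996a"])
print(refs.formatted_citation("Veer1996a"))
print(refs.bibliography())
```

Because the chunks live only in the `ReferenceList` object, they would have
to be regenerated when a document is reopened; that is the trade-off against
saving them in the XML.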

Right now citeproc is XSLT 2.0. It'd be nice if we could just use it
more-or-less as is. Svante has suggested it's likely OOo might switch
to using Saxon (and thus get XSLT 2.0 for free) in the next major
release.

How feasible would it be to do this? Could we implement essentially
real-time citation processing using XSLT?

It's hard enough to get good C++ programmers, and I'd rather not have
them waste time reimplementing citeproc in that language when it
already works quite well.

Bruce

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]





--
Peter (pt) Sefton
Toowoomba 4350
Queensland, Australia
Phone: +61 4 1032 6955
Web: http://ptsefton.com
Email: [EMAIL PROTECTED]


Re: [dev-biblio] xslt, citeproc-writer

2006-06-27 Thread Bruce D'Arcus


Hey Peter,

On Jun 27, 2006, at 11:09 PM, pt wrote:

I disagree Bruce, about not storing rendered content in the XML. I think it
needs to be stored in a rendered form.

If you don't then it will make it very hard to write things like
OpenDocument to HTML transforms in random languages as you would need to run
citeproc to format citations and bibliographies.


Maybe I wasn't clear, but the citation always gets included in the 
content file; there's no other way to display it after all. And that 
can be easily transformed to HTML or OXML.


What David was suggesting (if I understood right) was that the 
bibliographic source file (bibliography.xml or whatever) would, beyond 
the raw metadata, also include pre-rendered chunks for all potential 
citation rendering options for a given style, plus the bibliographic 
entry.


My problem with that is it results in redundant and unnecessary 
content, and pollutes the source file.



Related to this is an interoperability question. It is important not to
focus only on interop with OOo2, after all with a free software package it
is easy to get users to upgrade. Consider the interop problems with MS Word.


Given what I say above, do you still see any interop problems?

Have you considered an approach where citations are stored as rendered text
(or footnote/endnotes) in place with a link of some kind back to the
bibliographic database with the citation details stored as an item in the
database.


That's exactly what the new ODF approach (and the MS OXML approach) 
does ;-)



That is you would have a (1) Work, with (2) particular expression
(is that what you call it) with (3) a citation by page or line number or
whatever. That is three items in the database - and only one simple link in
the document text. Seems to me that this would fit very well with your RDF
approach, Bruce. And an approach like this might mean that you could build a
solution that could work with OpenXML docs as well.


Am not quite following this bit. The plan is:

1)  content.xml holds the new citation fields, which are:
a) link to a source record, and
b) rendered citation

2)  the source metadata gets stored in a dedicated file within the 
wrapper; maybe bibliography/source.xml


1b gets generated from 2. This is exactly how MS is doing it, 
coincidentally, in OXML.
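
As a rough illustration of that plan, here is a small Python sketch. It is
not the ODF implementation: the dictionary layout, the field names, and the
toy `render_citation` formatter (standing in for citeproc) are all
assumptions made for the example. It shows 1b being generated from 2, and a
converter that needs only the stored rendered text.

```python
from html import escape

# Stand-in for the source metadata in bibliography/source.xml (fields are assumed).
SOURCES = {"Veer1996a": {"author": "Veer", "year": "1996a"}}


def render_citation(source, pages=None):
    """Toy formatter playing citeproc's role; real styles are far richer."""
    text = f"{source['author']} {source['year']}"
    if pages:
        text += f" pp. {pages[0]}-{pages[1]}"
    return f"({text})"


def make_citation_field(key, pages=None):
    # 1a: link to the source record; 1b: rendered citation, generated from 2.
    return {"source": key, "rendered": render_citation(SOURCES[key], pages)}


def field_to_html(citation_field):
    # A converter only needs the stored rendered text, not the formatter.
    return f'<span class="citation">{escape(citation_field["rendered"])}</span>'


field = make_citation_field("Veer1996a", pages=(23, 24))
print(field_to_html(field))
```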


What David was thinking about was funky citation styles (well, many of 
them, in fact; APA, Chicago, etc.) that distinguish first and 
subsequent citations. The way citeproc works now is, IT has to figure 
out this sort of positional information, and then inserts the right 
formatted version in the output.


The alternative, then, is to just have citeproc be rather dumb about 
it, and create the two representations for each citation, and have the 
new citation support figure out which to use.
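
A minimal sketch of that alternative, in Python and under stated
assumptions: the `FORMS` table stands in for the two pre-rendered
representations citeproc would supply, its placeholder strings are invented,
and `pick_forms` is a hypothetical name for the logic the citation support
would need.

```python
# Two pre-rendered forms per source, as a "dumb" citeproc might supply them.
# The placeholder strings are invented for illustration.
FORMS = {
    "Veer1996a": {"first": "FULL(Veer1996a)", "subsequent": "SHORT(Veer1996a)"},
    "Smith2000": {"first": "FULL(Smith2000)", "subsequent": "SHORT(Smith2000)"},
}


def pick_forms(cited_keys, forms):
    """Walk citations in document order and choose first/subsequent per key."""
    seen = set()
    chosen = []
    for key in cited_keys:
        position = "subsequent" if key in seen else "first"
        chosen.append(forms[key][position])
        seen.add(key)
    return chosen


print(pick_forms(["Veer1996a", "Smith2000", "Veer1996a"], FORMS))
```

The positional bookkeeping is then a single pass over the document's
citations, and citeproc itself never needs to know where a citation sits.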


Bruce




Re: [dev-biblio] xslt, citeproc-writer

2006-06-27 Thread pt

Thanks for the prompt response Bruce.

What I'm suggesting for consideration is that you take this example from
http://wiki.services.openoffice.org/wiki/Bibliographic_Project%27s_Developer_Page#New_Citation_Coding


<cite:biblioref cite:key="Veer1996a">
  <cite:detail cite:units="pages" cite:begin="23" cite:end="24"/>
</cite:biblioref>

And change it to something like this:

<cite:biblioref cite:key="Veer1996a-citation1">

</cite:biblioref>

Here the page-range details are stored in the bibliographic database
as a new item that relates to the key 'Veer1996a', so each page-range would
have its own record in the db.
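
To make the record layout concrete, here is a small Python sketch. The
field names (`kind`, `cites`, `units`, `begin`, `end`) and the `resolve`
helper are assumptions for illustration, not an existing schema.

```python
# Toy bibliographic database: one record for the work, one per cited page-range.
DATABASE = {
    "Veer1996a": {"kind": "work"},
    "Veer1996a-citation1": {
        "kind": "citation",
        "cites": "Veer1996a",
        "units": "pages",
        "begin": 23,
        "end": 24,
    },
}


def resolve(key, database=DATABASE):
    """Return (work record, citation detail or None) for a key found in a document."""
    record = database[key]
    if record["kind"] == "citation":
        return database[record["cites"]], record
    return record, None


work, detail = resolve("Veer1996a-citation1")
print(work, detail["begin"], detail["end"])
```

The document then carries only the single key, and everything else is
looked up in the database.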

(I'm assuming the rendered text will go inside the cite element?)

I'm still of the opinion that you would be better off building a
bibliographic database with the simplest possible hooks into the file
format; this could be implemented using a cross reference field or some
other field that exists already in OpenDocument.

What if the OOo bibliographic tool had its own web server? Then the inline
text would look like:

<a href="http://localhost:1234/cite-key/Veer1996a/1/">(Veer 1996a pp.
23-24)</a>

Where the database would have a record for 'Veer1996a' and a record for each
page-range cited.

Citeproc could change the rendered citation inside the a element
dynamically as has been proposed elsewhere.
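
A sketch of how such links might be built and read back, in Python. The
`/cite-key/<key>/<n>/` layout is taken from the example URL above; the
function names and the idea of numbering page-range records from 1 are
assumptions.

```python
from urllib.parse import urlsplit

BASE = "http://localhost:1234/cite-key"  # host and port from the example URL


def build_cite_url(key, citation_number):
    """Build the link stored in the document text for one page-range record."""
    return f"{BASE}/{key}/{citation_number}/"


def parse_cite_url(url):
    """Recover (source key, citation number) from a link found in a document."""
    parts = [p for p in urlsplit(url).path.split("/") if p]
    if len(parts) != 3 or parts[0] != "cite-key":
        raise ValueError(f"not a citation link: {url}")
    return parts[1], int(parts[2])


url = build_cite_url("Veer1996a", 1)
print(url)
print(parse_cite_url(url))
```

Since the link is an ordinary hyperlink, the same parsing would work
whether it was found in an OpenDocument file, a Word document, or HTML.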

I am suggesting these ideas because I think they would allow an integrated
tool in OOo to also interoperate with Word documents, HTML documents and
so on, in the kinds of real environments we see at our university, where we
have Windows, Mac and Linux users running different versions of Word and OOo
and LaTeX. To work with .doc files one would have to ship the database
separately like you do with EndNote now, but it could be embedded in the
file where OpenDocument 3 support is available.








--
Peter (pt) Sefton
Toowoomba 4350
Queensland, Australia
Phone: +61 4 1032 6955
Web: http://ptsefton.com
Email: