Embedding XMP in FO

2004-12-02 Thread Victor Röder
Hi Fop developers,

I'm using FOP to produce PDFs and I'm trying to embed XMP (a set of XML
metadata-describing vocabularies from Adobe) in FO. These XMP packets
should be serialized (nearly) one-to-one into a PDF metadata stream.
Here is a simple example:

<fo:page-sequence master-reference="default-page">
 <x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="XMP toolkit 2.9-9, framework 1.6">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
   <rdf:Description rdf:about="autodoc:tiff:c:/sample_path/sample.tiff"
     xmlns:xmpMM="http://ns.adobe.com/xap/1.0/mm/">
    <xmpMM:DocumentID>uuid:foo:id-0815</xmpMM:DocumentID>
   </rdf:Description>
  </rdf:RDF>
 </x:xmpmeta>
 <fo:flow flow-name="xsl-region-body">

 [...]

</fo:page-sequence>

This XMP metadata says that the generated PDF page sequence was produced
from sample.tiff.

Serialized to PDF it should look something like this
(ideally attached only to the first page of the TIFF page sequence):

7 0 obj
<< /Type /Metadata /Subtype /XML /Length 541 >>
stream
<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?><x:xmpmeta
xmlns:x="adobe:ns:meta/" x:xmptk="XMP toolkit 2.9-9, framework 1.6"><rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"><rdf:Description
rdf:about="autodoc:tiff:c:/sample_path/sample.tiff"
xmlns:xmpMM="http://ns.adobe.com/xap/1.0/mm/"><xmpMM:DocumentID>uuid:foo:id-0815</xmpMM:DocumentID></rdf:Description></rdf:RDF></x:xmpmeta><?xpacket
end="w"?>
endstream
endobj
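
As a rough illustration of the target output, the sketch below wraps an already serialized x:xmpmeta element in an xpacket and renders it as an uncompressed metadata stream object. The class and method names are made up for this example and are not part of FOP; the empty begin attribute simply mirrors the sample above. The point it shows is that /Length has to count the bytes of the packet, not its characters.

```java
import java.nio.charset.StandardCharsets;

/** Sketch only: names are hypothetical, this is not FOP code. */
final class XmpStreamSketch {

    /** Fixed packet id used in xpacket headers (same value as in the sample above). */
    static final String PACKET_ID = "W5M0MpCehiHzreSzNTczkc9d";

    /** Surrounds the serialized x:xmpmeta element with the xpacket header and trailer. */
    static String wrapInPacket(String xmpMeta) {
        return "<?xpacket begin=\"\" id=\"" + PACKET_ID + "\"?>\n"
                + xmpMeta + "\n"
                + "<?xpacket end=\"w\"?>";
    }

    /** Renders an indirect PDF object holding the packet as an uncompressed stream. */
    static String toPdfObject(int objectNumber, String packet) {
        // /Length is the number of bytes between "stream\n" and "\nendstream".
        int length = packet.getBytes(StandardCharsets.UTF_8).length;
        return objectNumber + " 0 obj\n"
                + "<< /Type /Metadata /Subtype /XML /Length " + length + " >>\n"
                + "stream\n"
                + packet + "\n"
                + "endstream\n"
                + "endobj\n";
    }
}
```

A real PDF writer would emit bytes rather than a String and would also have to reference the object from the document catalog or a page dictionary; that part is left out here.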

Now my challenge: how can I do this as simply as possible? I tried to extend
the FOP sources but keep getting lost while searching for how the FObjs of
the FOTree are turned into the list of PDF objects to be rendered.

The problem is not knowledge of XSL-FO or PDF. I just don't have an
overview of the FOP sources. Where do I have to start in order to embed
the so-called foreign XML? I tried XMLObj but only got an [ERROR] null.

Thanks in advance for your response!

Bye,
Victor



Re: Good news: Jeremias has been elected as an ASF member!

2004-12-02 Thread Luca Furini

 I have the great pleasure to announce that Jeremias Maerki has
 been elected as an ASF member at the last member's meeting
 during ApacheCon.

Congratulations!

Luca



Re: Knuth linebreaking questions

2004-12-02 Thread Luca Furini

Finn Bock wrote:

(starting from the second question)
 And why not adjust the spacing within the user specified min/max for
 START and END alignment?

Should the user desire adjusted spaces, wouldn't it be better for him to
specify justified alignment? :-)
Seriously, the recommendation (at 7.16.2 letter-spacing and 7.16.8
word-spacing) states that these spaces may also be influenced by
justification, but says nothing about start and end alignments.

 I'm still not sure why it would be ok to ignore any user specified
 min and max values of 'word-spacing' during START and END alignment.
 If a user specifies a length range, what would the reason be for not
 using it? Perhaps with additional DEFAULT_SPACE_WIDTH.

When alignment is start or end, each space always has its .optimum width,
so there is no need to look at the .minimum and .maximum: the user's most
preferred value is already used.
But the Knuth algorithm would not work if there were no elements with
adjustable width (glue with stretchability and/or shrinkability); the
actual value used is not very relevant, because the computed adjustment
ratio will not be applied.
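
A minimal sketch of the behaviour described here, with made-up names and arbitrary integer units: for start or end alignment the adjustment ratio is ignored and the optimum is used, while for justified text the ratio moves the space inside the min/max range. This is an illustration of the idea, not the actual LineLM code.

```java
/** Sketch only: hypothetical helper, not part of FOP. */
final class SpaceWidthSketch {

    /**
     * Resolves the width of one word space.
     * min/opt/max are the word-spacing range, ratio is the adjustment ratio
     * the line breaker computed for the line, justified selects the alignment.
     */
    static int resolve(int min, int opt, int max, double ratio, boolean justified) {
        if (!justified) {
            // Start/end alignment: the ratio is not applied, the optimum is kept.
            return opt;
        }
        // Justified: positive ratios stretch towards max, negative ones shrink towards min.
        double delta = ratio >= 0 ? ratio * (max - opt) : ratio * (opt - min);
        return (int) Math.round(opt + delta);
    }
}
```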

 Ok, performance is indeed a fine reason, but IMHO such quality vs.
 speed tradeoffs should eventually be made by the user rather than us.

Simon said the same:

# Note that in TeX such thresholds are user-adjustable parameters. I
# think they should eventually be so in FOP too, for those of us who
# have the most exquisite taste of line layout.

and I think it's a good idea; the algorithm should:

 1 find breaking points without hyphenation
 2 hyphenate
 3 find breaking points with hyphenation
 4 decide which ones are better

and point #4 uses the user-definable threshold; where should this constant
be stored? Inside the code of LineLM or in a configuration file?
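
The four steps can be sketched as follows. Everything here is an assumption made for illustration: the names, the use of a single demerits figure per layout, and in particular the decision rule in step 4 (use the hyphenated result only if it improves on the plain one by more than the threshold). The thread does not say how the threshold is meant to be applied.

```java
import java.util.function.Function;
import java.util.function.ToDoubleFunction;

/** Sketch only: hypothetical driver for the two passes, not FOP code. */
final class TwoPassSketch {

    /**
     * findBreaks maps "hyphenation allowed?" to a layout of type L;
     * demerits scores a layout (lower is better).
     */
    static <L> L breakParagraph(Function<Boolean, L> findBreaks,
                                ToDoubleFunction<L> demerits,
                                double threshold) {
        L plain = findBreaks.apply(false);      // step 1: no hyphenation
        L hyphenated = findBreaks.apply(true);  // steps 2 and 3: hyphenate, break again
        double gain = demerits.applyAsDouble(plain) - demerits.applyAsDouble(hyphenated);
        // Step 4 (assumed rule): hyphenation must pay for itself by more than the threshold.
        return gain > threshold ? hyphenated : plain;
    }
}
```

A cheaper variant would skip the second pass whenever the first one is already good enough, which is closer to how TeX uses its two tolerance values.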

Regards
Luca



Re: Good news: Jeremias has been elected as an ASF member!

2004-12-02 Thread Oleg Tkachenko
Bertrand Delacretaz wrote:
 I have the great pleasure to announce that Jeremias Maerki has been
 elected as an ASF member at the last member's meeting during ApacheCon.

Hey, that's good news. Congrats to Jeremias! Keep up your great work.
--
Oleg Tkachenko
http://blog.tkachenko.com
Multiconn Technologies, Israel


Re: Knuth linebreaking questions

2004-12-02 Thread Finn Bock
 And why not adjust the spacing within the user specified min/max for
 START and END alignment?

[Luca]
 Should the user desire adjusted spaces, wouldn't it be better for him to
 specify justified alignment? :-)
 Seriously, the recommendation (at 7.16.2 letter-spacing and 7.16.8
 word-spacing) states that these spaces may also be influenced by
 justification, but says nothing about start and end alignments.

I tend to read that to mean that word spacing may be pushed beyond the
specified range by justification. And I would think that unjustified
alignment still has the option of using the word-spacing range but
of course has to stay within the range.

 I'm still not sure why it would be ok to ignore any user specified
 min and max values of 'word-spacing' during START and END alignment.
 If a user specifies a length range, what would the reason be for not
 using it? Perhaps with additional DEFAULT_SPACE_WIDTH.

 When alignment is start or end, each space always has its .optimum width,
 so there is no need to look at the .minimum and .maximum: the user's most
 preferred value is already used.

Is there anything that prevents using a non-.optimum value within the
range if the result is judged to be better (with lower demerits)?

 Ok, performance is indeed a fine reason, but IMHO such quality vs.
 speed tradeoffs should eventually be made by the user rather than us.

 Simon said the same:
 # Note that in TeX such thresholds are user-adjustable parameters. I
 # think they should eventually be so in FOP too, for those of us who
 # have the most exquisite taste of line layout.
 and I think it's a good idea; the algorithm should:
  1 find breaking points without hyphenation
  2 hyphenate
  3 find breaking points with hyphenation
  4 decide which ones are better
 and point #4 uses the user-definable threshold; where should this constant
 be stored? Inside the code of LineLM or in a configuration file?

An extension attribute?

   <fo:block fox:knuth-threshold="5"> ... </fo:block>

I suspect that the other Knuth parameters should be specified the same
way. But it is not a high priority IMO.

regards,
finn


More questions on line breaking.

2004-12-02 Thread Finn Bock
Hi
Some more questions.
1) What is inactiveList doing? Nodes are added but never used.
2) If there is no shrink in a line (the case in START alignment) then
nodes are never removed from activeList until a forced break element is
found. Is that really the intention of the algorithm? It seems suspect
that a ratio of INFINITE_RATIO is also created when the break is too
wide to fit within a node.
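
For comparison, here is a small sketch of how the textbook Knuth-Plass formulation treats this case. It is not a description of what the FOP code does, and the names are invented for the example. In that formulation a line that is too long and has no shrinkability gets a ratio of minus infinity, so the active node it starts from is dropped without waiting for a forced break.

```java
/** Sketch only: textbook rules with hypothetical names, not the FOP implementation. */
final class ActiveNodeSketch {

    /** Adjustment ratio of a candidate line with the given natural width and glue totals. */
    static double adjustmentRatio(int natural, int stretch, int shrink, int lineWidth) {
        if (natural == lineWidth) {
            return 0.0;
        }
        if (natural < lineWidth) {
            // Too short: needs stretch; without any stretchability it cannot fit.
            return stretch > 0
                    ? (double) (lineWidth - natural) / stretch
                    : Double.POSITIVE_INFINITY;
        }
        // Too long: needs shrink; without any shrinkability it cannot fit.
        return shrink > 0
                ? (double) (lineWidth - natural) / shrink
                : Double.NEGATIVE_INFINITY;
    }

    /** A node stops being active once lines starting there can only be too long. */
    static boolean shouldDeactivate(double ratio, boolean forcedBreak) {
        return ratio < -1.0 || forcedBreak;
    }
}
```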

regards,
finn


Re: Knuth linebreaking questions

2004-12-02 Thread Simon Pepping
On Thu, Dec 02, 2004 at 12:16:55PM +0100, Finn Bock wrote:
 and point #4 uses the user-definable threshold; where should this constant
 be stored? Inside the code of LineLM or in a configuration file?
 
 An extension attribute?
 
    <fo:block fox:knuth-threshold="5"> ... </fo:block>
 
 I suspect that the other knuth parameters should be specified the same 
 way. But it is not a high priority IMO.

It is not a layout specification in the fo file, it is a fine-tuning
of the algorithm applied by a particular FO Processor. It should be in
the user configuration. It may be specified in the configuration file,
or it may be specified by the calling application in the configuration
object FOUserAgent.userConfig. In the configuration file it should be
something like:

<line-layout>
  <hyphenation-threshold>5</hyphenation-threshold>
  other parameters
</line-layout>

FOUserAgent should get appropriate methods to extract the layout
part of the configuration and pass it on to a client class,
e.g. LineLM. Cf. FOUserAgent.getUserRendererConfig().
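
To make the idea concrete, here is a self-contained sketch of a client reading such a value with a fallback default. It uses the JDK's DOM parser as a stand-in for whatever configuration object FOUserAgent would hand over; the class and method names are hypothetical, and only the element name hyphenation-threshold is taken from the fragment above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

/** Sketch only: plain JDK XML parsing standing in for FOP's user configuration. */
final class LineLayoutConfigSketch {

    /** Returns the configured hyphenation threshold, or defaultValue if it is absent. */
    static int hyphenationThreshold(String configXml, int defaultValue) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(configXml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName("hyphenation-threshold");
            if (nodes.getLength() == 0) {
                return defaultValue;
            }
            return Integer.parseInt(nodes.item(0).getTextContent().trim());
        } catch (Exception e) {
            // Malformed XML or a non-numeric value: report it as a configuration error.
            throw new IllegalArgumentException("Invalid line-layout configuration", e);
        }
    }
}
```

Keeping the default in the caller means a missing element never changes the layout behaviour.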

TeX's terms are pretolerance and tolerance for the two values of
maxAdjustment.

Regards, Simon

-- 
Simon Pepping
home page: http://www.leverkruid.nl



Re: Embedding XMP in FO

2004-12-02 Thread Jeremias Maerki
Hi Victor,

you haven't told us what branch you are on. Are you on the maintenance
branch or on FOP HEAD?

These hints are targeted at FOP HEAD code but some can be applied to the
maintenance branch, too, although the whole mechanism is quite different
there.

One thing you will have to do is create an ElementMapping class (plus
node subclasses, or you extend the existing ExtensionElementMapping) so
the FOTreeBuilder can create nodes for each XML element of the various
namespaces in the metadata part. A good example in your case is the
bookmarks support code you can find in the org.apache.fop.fo.extensions
package. There you have an example of an ElementMapping class
(ExtensionElementMapping) and node classes (Bookmarks/Outline). That's
the first step.
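
The general idea behind such a mapping can be sketched as a lookup from namespace URI and local name to a node factory. The code below is a simplified stand-in written for this illustration; it does not use FOP's ElementMapping API, and the Node type and all method names are hypothetical. Only the namespace URIs come from the XMP example in this thread.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Sketch only: a simplified stand-in, not FOP's ElementMapping class. */
final class ElementMappingSketch {

    /** Hypothetical node type; FOP's real node classes look different. */
    static final class Node {
        final String name;
        Node(String name) { this.name = name; }
    }

    private final Map<String, Supplier<Node>> makers = new HashMap<>();

    /** Registers a factory for one element of one namespace. */
    void register(String namespaceUri, String localName, Supplier<Node> maker) {
        makers.put(key(namespaceUri, localName), maker);
    }

    /** Creates a node for the element, or returns null if the element is unknown. */
    Node create(String namespaceUri, String localName) {
        Supplier<Node> maker = makers.get(key(namespaceUri, localName));
        return maker == null ? null : maker.get();
    }

    private static String key(String namespaceUri, String localName) {
        return "{" + namespaceUri + "}" + localName;
    }

    /** Example registration for the elements used in the XMP sample. */
    static ElementMappingSketch forXmp() {
        ElementMappingSketch mapping = new ElementMappingSketch();
        mapping.register("adobe:ns:meta/", "xmpmeta", () -> new Node("x:xmpmeta"));
        String rdf = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
        mapping.register(rdf, "RDF", () -> new Node("rdf:RDF"));
        mapping.register(rdf, "Description", () -> new Node("rdf:Description"));
        mapping.register("http://ns.adobe.com/xap/1.0/mm/", "DocumentID",
                () -> new Node("xmpMM:DocumentID"));
        return mapping;
    }
}
```

A tree builder would call create() for every start tag it sees; since XMP allows arbitrary vocabularies, a generic fallback node for unknown elements may be more practical than one factory per element.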

The other issue is getting the metadata through to the PDFRenderer.
Since this is not an InstreamForeignObject or an ExternalGraphic, you
cannot use that infrastructure. But since the Bookmarks support is quite
similar to what you want to do, just follow these classes through the
code.

I'm not sure if attaching the metadata to a page-sequence is the right
approach. The page-sequence itself isn't represented in the resulting
PDF. You might also have multiple page-sequences in a document. So
you should probably attach it to fo:root where it applies to the whole
document. Attaching metadata to individual pages or objects is probably
trickier.

So for the fo:root attachment it is probably best to enhance the Root
class to catch the XMP data, just as it was done for Bookmarks. The FO
tree is one thing, you will also have to enhance the ...area.AreaTreeHandler
for the metadata. There's a class ...area.BookmarksData you might want
to look at. PDFRenderer.processOffDocumentItem() is ultimately responsible
for generating the outlines in the PDF. This can probably be used for the
metadata, too.

This should give you a few pointers into the code. I'm not sure if I got
everything right, because I'm not so familiar with this part, yet.
Hopefully, my colleagues will correct me if I wrote anything wrong.

I hope this helps nonetheless. We're looking forward to your patches for
FOP HEAD. ;)





Jeremias Maerki



Re: Good news: Jeremias has been elected as an ASF member!

2004-12-02 Thread J.Pietschmann
Bertrand Delacretaz wrote:
 I have the great pleasure to announce that Jeremias Maerki has been
 elected as an ASF member at the last member's meeting during ApacheCon.

Congratulations!
J.Pietschmann


Info on Avalon Framework

2004-12-02 Thread Jeremias Maerki
Gang,

you may have heard about what happened in Avalon-land. Looks like the
project has failed from a community POV. So, just for those who don't
know already, Avalon Framework which we use in FOP has been transferred
over to the Avalon Excalibur project (http://excalibur.apache.org/).

The source code is here: 
https://svn.apache.org/repos/asf/excalibur/trunk/framework/

The announcement:
http://www.mail-archive.com/users@avalon.apache.org/msg05033.html

My impression is that the package will remain active and maintained this
way, so actually this doesn't change anything for us.

Jeremias Maerki