0.95: Metadata is not valid XML document even if there are no metadata defined in FO source

2008-11-05 Thread Stepan RYBAR
Hello,

I have a problem with the metadata part of PDFs created by Apache FOP 0.95 (using
Sun Java SE 1.6.0_10 on MS Windows 2000). Although the resulting PDF looks OK in
Acrobat Reader, I am not able to process it with iText 2.1.3, which throws the
exception shown below. After several e-mails between the iText developers and
me, it looks like FOP 0.95 produces a PDF with a compressed metadata section
that is not valid XML, even if I do not specify any metadata in the FO source.
If I specify metadata according to your example at

http://xmlgraphics.apache.org/fop/0.95/metadata.html#xmp-example

the same exception occurs. So could you please verify whether the problem is
really caused by FOP and, if so, fix it? Thank you very much in advance.

Stepan Rybar

  Original message
 From: Paulo Soares
 Subject: Re: [iText-questions] iText 2.1.3: Invalid byte 1 of 1-byte UTF-8
 sequence Exception while processing PDF from Apache FOP 0.95
 Date: 05.11.2008 15:44:00
 
 The metadata can be compressed; the problem is that the metadata is not a
 valid XML document (or at least it wasn't in the PDF I looked at). iText should
 ignore the error and carry on, but it currently doesn't. In any case, the
 problem starts at FOP.

 Paulo

 
 From: 1T3XT info
 Sent: Wednesday, November 05, 2008 2:08 PM
 To: Post all your questions about iText here
 Subject: Re: [iText-questions] iText 2.1.3: Invalid byte 1 of 1-byte UTF-8
 sequence Exception while processing PDF from Apache FOP 0.95

 Stepan RYBAR wrote:
  Hello,

  thank you for the answer. But after some tests I guess that this is not
  caused by missing metadata.

 Think again.

  Even if there is no metadata in source.fo

 Look at your PDF document.
 In the root dictionary, there's a reference to XMP data.
 /Metadata 7 0 R
 If we have a look at object 7, we see:
 7 0 obj
 <<
 /Type /Metadata
 /Subtype /XML
 /Length 12 0 R
 /Filter /FlateDecode
 >>
 stream
 [binary FlateDecode-compressed stream data]

 Oops... What's this?
 This is a compressed XMP stream.
 That's not allowed in PDF!
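
To see what such a stream actually contains, the bytes between "stream" and "endstream" can be inflated with the JDK's zlib support. The following is a minimal sketch, not iText or FOP code: the class name FlateSketch and its helper methods are made up for illustration, and it assumes the raw stream bytes have already been extracted from the PDF file.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper, not part of iText or FOP.
public class FlateSketch {

    // Inflates zlib data, the format used by the PDF /FlateDecode filter.
    static byte[] inflate(byte[] compressed) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    break; // truncated input or a preset dictionary is required
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("not a valid zlib stream", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    // Counterpart used only to produce sample input for inflate().
    static byte[] deflate(byte[] raw) {
        Deflater deflater = new Deflater();
        deflater.setInput(raw);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            out.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        return out.toByteArray();
    }
}
```

If the inflated bytes start with an XMP packet, they can then be fed to an XML parser to check whether the packet is well-formed.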



#  Original message
# From: Stepan RYBAR
# Subject: Re: [iText-questions] iText 2.1.3: Invalid byte 1 of 1-byte UTF-8
# sequence Exception while processing PDF from Apache FOP 0.95
# Date: 05.11.2008 13:52:34
# 
# Hello,
#
# thank you for the answer. But after some tests I guess that this is not
# caused by missing metadata. Even if there is no metadata in source.fo, I can
# use iText to print the total number of pages to stdout. So it means that
# iText can read source.pdf without throwing any exception. See the attached
# source.fo (metadata commented out), source.pdf (creation of output commented
# out) and the Java code for iText, which passes without problem.
#
# Once I want to create an output file and save it without any modification (to
# rule out a mistake in the font encoding settings), there is an exception:
#
# ExceptionConverter:
# com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException:
# Invalid byte 1 of 1-byte UTF-8 sequence.
#
# Stepan Rybar
#
#   Original message
#  From: Paulo Soares
#  Subject: Re: [iText-questions] iText 2.1.3: Invalid byte 1 of 1-byte UTF-8
#  sequence Exception while processing PDF from Apache FOP 0.95
#  Date: 04.11.2008 17:23:03
#  
#  The metadata generated by FOP is broken. They have:
# 
#  <x:xmpmeta>
# 
#  instead of:
# 
#  <x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="3.1-702">
# 
#  The namespace is not defined and the parser complains about it.
# 
#  Paulo
#  
#  From: Stepan RYBAR
#  Sent: Tuesday, November 04, 2008 4:11 PM
#  To: [EMAIL PROTECTED]
#  Subject: [iText-questions] iText 2.1.3: Invalid byte 1 of 1-byte UTF-8
#  sequence Exception while processing PDF from Apache FOP 0.95
# 
#  Hello,
# 
#  I am trying to use iText to add page numbers to an existing PDF using the
#  PdfStamper class, but I have a problem. PDFs that I create with Apache FOP
#  0.95 cause the exception shown below, although they look OK in Adobe Acrobat
#  Reader 8.1.2. I guess this is a problem of a missing or wrong encoding
#  somewhere (but where?). Attached to this e-mail are the FO source, the
#  resulting PDF, and the Java code for iText. I am using Sun Java SE 1.6.0_10
#  on MS Windows 2000. Could you please point out where I am making a mistake?
# 
#  Thank you. Stepan
# 
#  L:\Documents\Capitol\vypisyKv1\_other>L:\RunFiles\jdk\jre\bin\java -cp .;iText-2.1.3.jar AddPageNumbersToExistingPageNumberPDF
#  ExceptionConverter:
#  

Re: 0.95: Metadata is not valid XML document even if there are no metadata defined in FO source

2008-11-05 Thread Jeremias Maerki
Ok, so we have two problems:
1. Compressed metadata stream: FOP doesn't compress the metadata stream
by default. The problem is that FOP 0.95 ships a demo configuration file
containing a filter list that causes all streams to be compressed.
Please remove all filterList elements, including their children, from
the configuration file. Some time ago I committed a fix to FOP Trunk that
makes this unnecessary:
http://svn.apache.org/viewvc?rev=695491&view=rev

2. The XMP is not valid RDF because the namespace declarations are
missing: if that happens, you're using a JAXP TraX implementation that
has a buggy serializer. I've just tried with a fresh Sun Java 1.6.0_10
installation on WinXP and didn't have that problem. I assume you've
installed a different JAXP implementation.
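
One way to check whether the serializer of the active TraX implementation drops namespace declarations is to push a single namespaced element through it as SAX events and inspect the result. This is only a sketch under assumptions: the class name SerializerCheck is hypothetical, and it does not reproduce how FOP itself writes the XMP packet. It only exercises whichever TransformerFactory the JVM picks up.

```java
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical diagnostic class, not part of FOP or iText.
public class SerializerCheck {

    static final String XMP_NS = "adobe:ns:meta/";

    // Serializes <x:xmpmeta> via SAX events using the active TraX implementation.
    static String serialize() {
        try {
            TransformerFactory tf = TransformerFactory.newInstance();
            if (!tf.getFeature(SAXTransformerFactory.FEATURE)) {
                throw new IllegalStateException("TraX implementation lacks SAX support");
            }
            TransformerHandler handler = ((SAXTransformerFactory) tf).newTransformerHandler();
            handler.getTransformer().setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            handler.setResult(new StreamResult(out));

            handler.startDocument();
            handler.startPrefixMapping("x", XMP_NS);
            handler.startElement(XMP_NS, "xmpmeta", "x:xmpmeta", new AttributesImpl());
            handler.endElement(XMP_NS, "xmpmeta", "x:xmpmeta");
            handler.endPrefixMapping("x");
            handler.endDocument();
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("serialization failed", e);
        }
    }

    // True if the serializer emitted the xmlns:x declaration.
    static boolean declaresNamespace() {
        return serialize().contains("xmlns:x=");
    }

    public static void main(String[] args) {
        System.out.println(serialize());
        System.out.println("namespace declared: " + declaresNamespace());
    }
}
```

If this prints "namespace declared: false", the serializer in use is a likely candidate for the missing declarations described in point 2.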


Re: 0.95: Metadata is not valid XML document even if there are no metadata defined in FO source

2008-11-05 Thread Stepan RYBAR
ad 1) I did it, and it looks like it works.

ad 2) "...you're using a JAXP TraX..." You are right. Although I am not aware
that I changed anything about JAXP (and would not even know how to), when I run
this on my second computer (MS Windows Vista 64-bit Enterprise, Sun Java 64-bit
1.6.0_10-b33), it works fine.

Thank You. Stepan


Re: 0.95: Metadata is not valid XML document even if there are no metadata defined in FO source

2008-11-05 Thread Jeremias Maerki
On 05.11.2008 18:32:20 Stepan RYBAR wrote:
 ad 1) I did it, and it looks like it works.
 
 ad 2) "...you're using a JAXP TraX..." You are right. Although I am not aware
 that I changed anything about JAXP (and would not even know how to), when I run
 this on my second computer (MS Windows Vista 64-bit Enterprise, Sun Java 64-bit
 1.6.0_10-b33), it works fine.

There are several ways a different implementation can get picked up:
http://xml.apache.org/xalan-j/faq.html#faq-N100EF

Or you might have specified something like
-Djavax.xml.transform.TransformerFactory=net.sf.saxon.TransformerFactoryImpl
(or some other implementation class)
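
To find out which implementation a given JVM actually resolves, a small diagnostic along these lines may help. It is a sketch only: the class name TraxInfo is hypothetical, and it should be run with the same classpath and JVM options as the FOP invocation so that it sees the same lookup results.

```java
import java.security.CodeSource;
import javax.xml.transform.TransformerFactory;

// Hypothetical diagnostic class, not part of FOP.
public class TraxInfo {

    // Class name of the TransformerFactory that JAXP resolves in this JVM.
    static String factoryClassName() {
        return TransformerFactory.newInstance().getClass().getName();
    }

    // Value of the -D override, or null if it was not set.
    static String overrideProperty() {
        return System.getProperty("javax.xml.transform.TransformerFactory");
    }

    // Where the factory class was loaded from; the JDK's built-in
    // implementation usually reports no code source.
    static String factoryLocation() {
        CodeSource source = TransformerFactory.newInstance()
                .getClass().getProtectionDomain().getCodeSource();
        return (source == null || source.getLocation() == null)
                ? "(no code source, probably the JDK itself)"
                : source.getLocation().toString();
    }

    public static void main(String[] args) {
        System.out.println("TransformerFactory in use: " + factoryClassName());
        System.out.println("Loaded from:               " + factoryLocation());
        System.out.println("System property override:  " + overrideProperty());
    }
}
```

A factory loaded from a jar on the classpath or in an endorsed directory, rather than from the JDK itself, would point to the kind of replacement described above.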

 
 Thank You. Stepan
 