0.95: Metadata is not valid XML document even if there are no metadata defined in FO source
Hello,

I have a problem with the metadata part of PDFs created by Apache FOP 0.95 (using Sun Java SE 1.6.0_10 on MS Windows 2000). Although the resulting PDF looks OK in Acrobat Reader, I am not able to process it using iText 2.1.3, which throws the exception shown below. After several e-mails between the iText developers and me, it looks like FOP 0.95 produces a PDF whose metadata section is compressed and, according to the iText developers, not valid XML, even if I do not specify any metadata in the FO source. If I specify metadata according to your example at http://xmlgraphics.apache.org/fop/0.95/metadata.html#xmp-example, the same exception occurs.

Could you please verify whether the problem is really caused by FOP and, if so, fix it?

Thank you very much in advance.
Stepan Rybar

Original message
From: Paulo Soares
Subject: Re: [iText-questions] iText 2.1.3: Invalid byte 1 of 1-byte UTF-8 sequence Exception while processing PDF from Apache FOP 0.95
Date: 05.11.2008 15:44:00

The metadata can be compressed; the problem is that the metadata is not a valid XML document (or at least it wasn't in the PDF I looked at). iText should ignore the error and carry on, but it currently doesn't. The problem starts at FOP in any case.

Paulo

From: 1T3XT info
Sent: Wednesday, November 05, 2008 2:08 PM
To: Post all your questions about iText here
Subject: Re: [iText-questions] iText 2.1.3: Invalid byte 1 of 1-byte UTF-8 sequence Exception while processing PDF from Apache FOP 0.95

Stepan RYBAR wrote:
# Hello, thank You answer. But after some tests I guess, that this is not caused by missing metadata.

Think again.

# Even if there are no metadata in source.fo

Look at your PDF document. In the root dictionary, there's a reference to XMP data:

/Metadata 7 0 R

If we have a look at object 7, we see:

7 0 obj
<< /Type /Metadata /Subtype /XML /Length 12 0 R /Filter /FlateDecode >>
stream
[compressed binary data]

Oops... What's this? This is a compressed XMP stream. That's not allowed in PDF!

# Original message
# From: Stepan RYBAR
# Subject: Re: [iText-questions] iText 2.1.3: Invalid byte 1 of 1-byte UTF-8 sequence Exception while processing PDF from Apache FOP 0.95
# Date: 05.11.2008 13:52:34
#
# Hello,
#
# thank you for the answer. But after some tests I guess that this is not caused by missing metadata. Even if there is no metadata in source.fo, I can use iText to print the total number of pages to stdout. This means that iText can read source.pdf without throwing any exception. See the attached source.fo (commented metadata), source.pdf (commented creation of output) and Java code for iText, which passes without problem.
#
# As soon as I try to create an output file and save it without any modification (to rule out a mistake in the font encoding settings), there is an exception:
#
# ExceptionConverter: com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException: Invalid byte 1 of 1-byte UTF-8 sequence.
#
# Stepan Rybar
#
# Original message
# From: Paulo Soares
# Subject: Re: [iText-questions] iText 2.1.3: Invalid byte 1 of 1-byte UTF-8 sequence Exception while processing PDF from Apache FOP 0.95
# Date: 04.11.2008 17:23:03
#
# The metadata generated by FOP is broken. They have:
#
# <x:xmpmeta>
#
# instead of:
#
# <x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="3.1-702">
#
# The namespace is not defined and the parser complains about it.
#
# Paulo
#
# From: Stepan RYBAR
# Sent: Tuesday, November 04, 2008 4:11 PM
# To: [EMAIL PROTECTED]
# Subject: [iText-questions] iText 2.1.3: Invalid byte 1 of 1-byte UTF-8 sequence Exception while processing PDF from Apache FOP 0.95
#
# Hello,
#
# I am trying to use iText to add page numbers to an existing PDF using the PdfStamper class. I have a problem: PDFs which I create using Apache FOP 0.95 cause the exception shown below, although they look OK in Adobe Acrobat Reader 8.1.2. I guess that this is a problem of a missing or wrong encoding somewhere (but where?). Attached to this e-mail are the FO source, the resulting PDF and the Java code for iText. I am using Sun Java SE 1.6.0_10 on MS Windows 2000. Please, can you point out where I am making a mistake?
#
# Thank you. Stepan
#
# L:\Documents\Capitol\vypisyKv1\_other>L:\RunFiles\jdk\jre\bin\java -cp .;iText-2.1.3.jar AddPageNumbersToExistingPageNumberPDF
# ExceptionConverter:
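
For readers who want to see why a Flate-compressed metadata stream trips an XML parser that is handed the raw bytes, here is a minimal sketch using only the JDK. It is not taken from FOP or iText. The class name `FlateXmpDemo`, its helper methods and the one-element sample packet are made up for illustration, and it assumes the stream is ordinary zlib data, as `/Filter /FlateDecode` indicates.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: shows that zlib-compressed XML must be inflated before parsing.
class FlateXmpDemo {

    // Compress bytes the way a /FlateDecode stream stores them (zlib wrapper + deflate).
    public static byte[] deflate(byte[] raw) {
        Deflater deflater = new Deflater();
        deflater.setInput(raw);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        while (!deflater.finished()) {
            out.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        return out.toByteArray();
    }

    // Undo the compression; rejects input that is not a complete zlib stream.
    public static byte[] inflate(byte[] compressed) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or unsupported stream");
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("not a valid zlib stream", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    // True if the bytes parse as namespace-aware, well-formed XML.
    public static boolean parses(byte[] bytes) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler()); // rethrows fatal errors, prints nothing
            builder.parse(new ByteArrayInputStream(bytes));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Stand-in packet, not FOP's real output.
        byte[] xml = "<x:xmpmeta xmlns:x=\"adobe:ns:meta/\"/>".getBytes(StandardCharsets.UTF_8);
        byte[] packed = deflate(xml);
        System.out.println("compressed bytes parse as XML: " + parses(packed));
        System.out.println("inflated bytes parse as XML:   " + parses(inflate(packed)));
    }
}
```

The point of the sketch is only that a consumer has to apply the stream's filter before parsing. Whether a compressed metadata stream is acceptable at all is a separate question, on which the two iText replies quoted above disagree.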
Re: 0.95: Metadata is not valid XML document even if there are no metadata defined in FO source
Ok, so we have two problems:

1. Compressed metadata stream: FOP doesn't compress the metadata stream by default. The problem is that FOP 0.95 ships a demo configuration file containing a filter list that causes all streams to be compressed. Please remove all filterList elements, including their children, from the configuration file. Some time ago I committed a fix to FOP Trunk which makes this unnecessary: http://svn.apache.org/viewvc?rev=695491&view=rev

2. The XMP is not valid RDF because the namespace declarations are missing: if that happens, you're using a JAXP TraX implementation that has a buggy serializer. I've just tried with a fresh Sun Java 1.6.0_10 installation on WinXP and didn't have that problem. I assume you've installed a different JAXP implementation.

On 05.11.2008 16:28:54 Stepan RYBAR wrote:
# Hello, I have a problem with the metadata part of PDFs created by Apache FOP 0.95 (using Sun Java SE 1.6.0_10 on MS Windows 2000). Although the resulting PDF looks OK in Acrobat Reader, I am not able to process it using iText 2.1.3. [...]
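
The second problem can be checked without FOP or iText. Below is a minimal sketch, using only the JDK, of how a namespace-aware parser reacts to an `x:xmpmeta` element with and without the `xmlns:x` declaration mentioned in the thread. The class name `XmpNamespaceCheck` and the method `isWellFormed` are illustrative assumptions, and the two sample strings are reduced stand-ins for a real XMP packet.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: reports whether a string parses as namespace-aware XML.
class XmpNamespaceCheck {

    public static boolean isWellFormed(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Namespace awareness is what makes an undeclared "x:" prefix an error.
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler()); // rethrows fatal errors, prints nothing
            builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String withoutDeclaration = "<x:xmpmeta></x:xmpmeta>";
        String withDeclaration = "<x:xmpmeta xmlns:x=\"adobe:ns:meta/\"></x:xmpmeta>";
        System.out.println("without xmlns:x: " + isWellFormed(withoutDeclaration));
        System.out.println("with xmlns:x:    " + isWellFormed(withDeclaration));
    }
}
```

Running the same check against the inflated metadata stream of a generated PDF would show whether the serializer in use dropped the declarations.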
Re: 0.95: Metadata is not valid XML document even if there are no metadata defined in FO source
ad 1) I did it, and it looks like it works.

ad 2) "...you're using a JAXP TraX..." You are right. Although I am not aware that I changed anything about JAXP (or even how I would), when I run this on my second computer (MS Windows Vista 64-bit Enterprise, Sun Java 64-bit 1.6.0_10-b33), it works fine.

Thank you.
Stepan

Original message
From: Jeremias Maerki
Subject: Re: 0.95: Metadata is not valid XML document even if there are no metadata defined in FO source
Date: 05.11.2008 17:51:56

Ok, so we have two problems: [...]
Re: 0.95: Metadata is not valid XML document even if there are no metadata defined in FO source
On 05.11.2008 18:32:20 Stepan RYBAR wrote:
# ad 1) I did it, and it looks like it works.
#
# ad 2) "...you're using a JAXP TraX..." You are right. Although I am not aware that I changed anything about JAXP (or even how I would), when I run this on my second computer (MS Windows Vista 64-bit Enterprise, Sun Java 64-bit 1.6.0_10-b33), it works fine.

There are multiple possibilities:
http://xml.apache.org/xalan-j/faq.html#faq-N100EF

Or you might have specified something like
-Djavax.xml.transform.TransformerFactory=net.sf.saxon.TransformerFactoryImpl
(or some other implementation class).

# Thank you.
# Stepan
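
To find out which TraX implementation a given JVM actually picks up, a small diagnostic along the following lines may help. It is a sketch using only standard JAXP and JDK calls. The class name `WhichTransformerFactory` and its method names are made up for illustration. It prints the value of the system property mentioned above (if set), the concrete factory class, and where that class was loaded from.

```java
import java.security.CodeSource;
import javax.xml.transform.TransformerFactory;

// Hypothetical diagnostic: reports the JAXP TransformerFactory in effect for this JVM.
class WhichTransformerFactory {

    public static final String PROPERTY = "javax.xml.transform.TransformerFactory";

    // Value of the -D override, or null if none was given.
    public static String overrideOrNull() {
        return System.getProperty(PROPERTY);
    }

    // Fully qualified name of the factory class JAXP resolves to.
    public static String implementationClass() {
        return TransformerFactory.newInstance().getClass().getName();
    }

    // Location (jar or runtime image) the factory class came from, if it is known.
    public static String loadedFrom() {
        CodeSource source = TransformerFactory.newInstance()
                .getClass().getProtectionDomain().getCodeSource();
        return source == null ? "JDK runtime (no code source)" : String.valueOf(source.getLocation());
    }

    public static void main(String[] args) {
        System.out.println("system property: " + overrideOrNull());
        System.out.println("implementation:  " + implementationClass());
        System.out.println("loaded from:     " + loadedFrom());
    }
}
```

Run it with the same classpath and JVM options as the FOP invocation. If the implementation is not the JDK's built-in one, an extra jar on the classpath or in an endorsed/extension directory, or a `-D` option, is a likely cause.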