Yes, the file has two images, but the code I attached does not detect
them, and that code is taken from the sample included in the iText book.
So I think there is something wrong in the code, but I can't see what
it is.
When you run the attached code on pdf-lopd.pdf, the RenderImage
procedure is called. With the 10.7 file the procedure is never called,
so I assume the parser finds no images and therefore never invokes the
IRenderListener. But, as you said, the images are in the PDF file.
To find the images in the PDF file we call this function.
It creates an instance of MyImageListener (the class is defined after
the function) and then processes the last page of the file.
// Namespaces this snippet relies on; "drawing" is our alias for System.Drawing.
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;
using drawing = System.Drawing;

public static IEnumerable<drawing.RectangleF> FindImages(MemoryStream pdfIn)
{
    pdfIn.Position = 0;
    var reader = new PdfReader(pdfIn);
    var parser = new PdfReaderContentParser(reader);
    var imgScan = new MyImageListener();

    // Only process images on the last page of the PDF file
    int nPage = reader.NumberOfPages;
    parser.ProcessContent(nPage, imgScan);
    reader.Close();
    pdfIn.Position = 0;

    if (!imgScan.HasImageAreas)
        return null;

    // Find the minimum and maximum Left and Bottom values over all rectangles
    // (minY and maxY are not used by the filter below)
    float minX = 1000000.0F;
    float maxX = 0.0F;
    float minY = 1000000.0F;
    float maxY = 0.0F;
    foreach (var r in imgScan.ImageRectangles)
    {
        minX = Math.Min(minX, r.Left);
        maxX = Math.Max(maxX, r.Left);
        minY = Math.Min(minY, r.Bottom);
        maxY = Math.Max(maxY, r.Bottom);
    }

    // Keep only rectangles at the leftmost or rightmost X position
    // that are at least 75 units wide
    var qryRects = from r in imgScan.ImageRectangles
                   where ((r.Left == minX) || (r.Left == maxX)) && r.Width >= 75.0F
                   orderby r.Bottom, r.Left
                   select r;
    return qryRects.ToList();
}
This is the IRenderListener implementation that we created to locate
the images in the PDF file:
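public class MyImageListener : IRenderListener
{
    // Rectangles collected from every image the parser reports
    private readonly List<drawing.RectangleF> imageAreas;

    public MyImageListener()
    {
        imageAreas = new List<drawing.RectangleF>();
    }

    public bool HasImageAreas
    {
        get { return imageAreas.Count != 0; }
    }

    public IEnumerable<drawing.RectangleF> ImageRectangles
    {
        get { return imageAreas; }
    }

    // Text events are ignored; only images matter here
    public void BeginTextBlock() { }
    public void EndTextBlock() { }
    public void RenderText(TextRenderInfo renderInfo) { }

    public void RenderImage(ImageRenderInfo renderInfo)
    {
        // The CTM maps the image's unit square onto the page.
        // Indices 6 and 7 hold the translation, indices 0 and 4 the
        // horizontal and vertical scale (valid when there is no rotation).
        var matrix = renderInfo.GetImageCTM();
        float left = matrix[6];
        // PDF y grows upward, so for an unrotated image this is the lower edge
        float top = matrix[7];
        float width = matrix[0];
        float height = matrix[4];
        var rect = new drawing.RectangleF(left, top, width, height);
        imageAreas.Add(rect);
    }
}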
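One caveat about RenderImage as written: taking the width and height
directly from matrix[0] and matrix[4] only describes the image box when
the CTM has no rotation or skew. The following is a minimal sketch of a
more general calculation, assuming the usual PDF convention that an image
fills the unit square mapped through the CTM [a b c d e f]. ImageBox and
FromCtm are hypothetical names used for illustration; they are not part
of iTextSharp, and this is not necessarily related to why the 10.7 file
reports no images.

```csharp
using System;

// Hypothetical helper (not an iTextSharp type): axis-aligned bounds of an
// image whose unit square is mapped onto the page by the CTM [a b c d e f].
public struct ImageBox
{
    public float Left, Bottom, Width, Height;

    public static ImageBox FromCtm(float a, float b, float c, float d, float e, float f)
    {
        // Unit-square corners (0,0), (1,0), (0,1), (1,1) after
        // x' = a*x + c*y + e and y' = b*x + d*y + f.
        float[] xs = { e, a + e, c + e, a + c + e };
        float[] ys = { f, b + f, d + f, b + d + f };

        float minX = xs[0], maxX = xs[0], minY = ys[0], maxY = ys[0];
        for (int i = 1; i < 4; i++)
        {
            minX = Math.Min(minX, xs[i]);
            maxX = Math.Max(maxX, xs[i]);
            minY = Math.Min(minY, ys[i]);
            maxY = Math.Max(maxY, ys[i]);
        }

        return new ImageBox
        {
            Left = minX,
            Bottom = minY,
            Width = maxX - minX,
            Height = maxY - minY
        };
    }
}
```

Inside RenderImage this would presumably be fed with matrix[0], matrix[1],
matrix[3], matrix[4], matrix[6] and matrix[7], in that order. For an
unrotated image it yields the same left, bottom, width and height as the
code above.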
2012/6/12 Leonard Rosenthol <lrose...@adobe.com>
> The 10.7 file has TWO images in it (unlike the other which only had 1).
>
> Leonard
>
> From: Daniel Oliveras <dolive...@gmail.com>
> Reply-To: Post here <itext-questions@lists.sourceforge.net>
> To: Post here <itext-questions@lists.sourceforge.net>
> Subject: Re: [iText-questions] Unable to locate images in a PDF document.
>
> Hi,
>
> My bad, I forgot to copy the code into the mail :-(
>
> That said, the problem is in “pdf-lopd 10.7.pdf”; the other file works
> fine with this code.
>
> This time I attach the code we use:
>
> In order to find the images in the PDF file we call this function.
> The function creates an object using the class MyImageListener (the
> definition of the class is after the function code) and then processes
> the file.
>
> public static IEnumerable<drawing.RectangleF> FindImages(MemoryStream pdfIn)
> {
> pdfIn.Position = 0;
> var reader = new PdfReader(pdfIn);
>
> var parser = new PdfReaderContentParser(reader);
> var imgScan = new MyImageListener();
>
> // Only process images on the last page of the PDF file
> int nPage = reader.NumberOfPages;
> parser.ProcessContent(nPage, imgScan);
>
> reader.Close();
> pdfIn.Position = 0;
>
> if (!imgScan.HasImageAreas)
> return null;
>
> // Find the minimum and maximum values of the X value of each rectangle
> float minX = 1000000.0F;
> float maxX = 0.0F;
> float minY = 1000000.0F;
> float maxY = 0.0F;
> foreach (var r in imgScan.ImageRectangles)
> {
> minX = Math.Min(minX, r.Left);
> maxX = Math.Max(maxX, r.Left);
>
> minY = Math.Min(minY, r.Bottom);
> maxY = Math.Max(maxY, r.Bottom);
> }
>
> var qryRects = from r in imgScan.ImageRectangles
> where ((r.Left == minX) || (r.Left == maxX)) &&
> r.Width >= 75.0F
> orderby r.Bottom, r.Left
> select r;
> return qryRects.ToList();
> }
>
> This is the class implementing IRenderListener that we created to locate
> the images on the PDF file
> /// <summary>
> /// IRenderListener class to detect images on a PDF Page
> /// </summary>
> public class MyImageListener : IRenderListener
> {
> private readonly List<drawing.RectangleF> imageAreas;
>
> public MyImageListener()
> {
> imageAreas = new List<drawing.RectangleF>();
> }
>
> public bool HasImageAreas
> {
> get { return imageAreas.Count() != 0; }
> }
>
> public IEnumerable<drawing.RectangleF> ImageRectangles
> {
> get { return imageAreas; }
> }
>
> public void BeginTextBlock() { }
>
> public void EndTextBlock() { }
>
> public void RenderText(TextRenderInfo renderInfo) { }
>
> public void RenderImage(ImageRenderInfo renderInfo)
> {
> var matrix = renderInfo.GetImageCTM();
>
> float left = matrix[6];
> float top = matrix[7];
> float width = matrix[0];
> float height = matrix[4];
>
> var rect = new drawing.RectangleF(left, top, width, height);
> imageAreas.Add(rect);
> }
> }
>
> 2012/6/11 Leonard Rosenthol <lrose...@adobe.com>
>
>> I don't know what your code is doing, since you didn't post it, but
>> Acrobat's Create Inventory report shows that there is one image in the
>> lopd.pdf file.
>>
>> Leonard
>>
>> From: Daniel Oliveras <dolive...@gmail.com>
>> Reply-To: Post here <itext-questions@lists.sourceforge.net>
>> To: Post here <itext-questions@lists.sourceforge.net>
>> Subject: [iText-questions] Unable to locate images in a PDF document.
>>
>> Hi,
>>
>>
>> We need to locate the images included in a PDF document that another
>> system generates and replace these images with signature fields. We do
>> it this way because we never know where the signature has to be placed
>> until we receive the PDF, so the images serve as position markers that
>> we then replace with the signature fields.
>>
>>
>> Attached to this mail are two PDF documents. The one named
>> “pdf-lopd.pdf” works fine: using iTextSharp we can find the images and
>> replace them. But in the document named “pdf-lopd 10.7.pdf” iTextSharp
>> finds no images.
>>
>>
>> As far as we know, the system that generates the PDF documents has
>> changed something internally; we believe it uses aspose.words to convert
>> a Word document into a PDF one.
>>
>>
>> NOTE: The images used are white rectangles. No image is seen if the
>> document is printed.
>>
>>
>>
>> ------------------------------------------------------------------------------
>> Live Security Virtual Conference
>> Exclusive live event will cover all the ways today's security and
>> threat landscape has changed and how IT managers can respond. Discussions
>> will include endpoint security, mobile security and the latest in malware
>> threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
>> _______________________________________________
>> iText-questions mailing list
>> iText-questions@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/itext-questions
>>
>> iText(R) is a registered trademark of 1T3XT BVBA.
>> Many questions posted to this list can (and will) be answered with a
>> reference to the iText book: http://www.itextpdf.com/book/
>> Please check the keywords list before you ask for examples:
>> http://itextpdf.com/themes/keywords.php
>>
>