Why not just use Ruby's FileUtils.compare_file to compare the *.pdf files?
BTW: other PDF readers that are 100% Ruby gems are:
http://www.rubydoc.info/gems/pdf-reader/1.3.3 PDF-Reader
https://github.com/prawnpdf/pdf-inspector PDF-Inspector
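
A minimal sketch of that file comparison, assuming both PDFs are already saved to disk. `FileUtils.compare_file` is part of Ruby's standard library; the helper name `same_pdf_bytes?` and the file names in the usage note are made up for illustration.

```ruby
require "fileutils"

# Returns true when the two files have byte-for-byte identical contents.
def same_pdf_bytes?(path_a, path_b)
  FileUtils.compare_file(path_a, path_b)
end
```

For example, `same_pdf_bytes?("expected.pdf", "actual.pdf")`. This is a strict byte-for-byte check: two PDFs that look the same can still differ if, say, they embed different creation timestamps, so a false result does not necessarily mean a visible difference.
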
On Wednesday, September 30, 2015 at 8:23:29 PM UTC-5, Gauri Kuwar wrote:
>
> Hello All,
> The code for reading through a PDF was really helpful.
> However, we need to compare two PDFs, both text and appearance, and save
> the differences, if any.
> Can anybody suggest how to automate this scenario using Ruby?
>
> Thanks
> Gauri
>
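
For the text half of that comparison, one possible approach is to extract each PDF to a text file (for example with pdftotext, as described further down the thread) and compare the lines. The sketch below is a naive set difference, not a true diff, and it does not address appearance at all. The method names `text_differences` and `save_differences` are hypothetical.

```ruby
# Naive line-level comparison of two text dumps (for example, the text files
# produced by pdftotext). This is a set difference, not a true diff:
# it ignores line order and repeated lines.
def text_differences(lines_a, lines_b)
  a = lines_a.map(&:rstrip)
  b = lines_b.map(&:rstrip)
  { only_in_first: a - b, only_in_second: b - a }
end

# Writes the differences to report_path and returns true when there were none.
def save_differences(diff, report_path)
  File.open(report_path, "w") do |f|
    diff[:only_in_first].each  { |line| f.puts "- #{line}" }
    diff[:only_in_second].each { |line| f.puts "+ #{line}" }
  end
  diff.values.all?(&:empty?)
end
```

Typical use would be `save_differences(text_differences(File.readlines("a.txt"), File.readlines("b.txt")), "diff.txt")`, where the file names are placeholders.
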
> On Wednesday, March 25, 2009 at 10:02:36 AM UTC-5, Wesley Chen wrote:
>>
>> Hi, Juuser,
>> Thank you very much for this post, but when I run the command:
>> system("pdftotext.exe -layout c:\\hello.pdf c:\\test.txt")
>> I get this warning message:
>> *Error (0): PDF file is damaged - attempting to reconstruct xref table...*
>> The resulting test.txt has an unexpected character at the end.
>> Please see the attached file.
>>
>> Is there any way to avoid it?
>>
>> Thanks.
>> Wesley Chen.
>>
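
One way to see exactly what the tool is complaining about is to capture its stderr and exit status instead of relying on `system` alone. pdftotext also writes a form feed character at each page break, so if the unexpected character at the end of test.txt is a form feed (an assumption, not confirmed by the thread), it can be stripped after reading; depending on the version, pdftotext also documents a `-nopgbrk` switch for this. The helper names below are hypothetical.

```ruby
require "open3"

# Runs a command and returns [stdout, stderr, success?] so that warnings
# printed by the tool can be logged instead of lost.
def run_captured(*cmd)
  out, err, status = Open3.capture3(*cmd)
  [out, err, status.success?]
end

# Removes form feed characters ("\f"), which pdftotext writes at page breaks.
def strip_form_feeds(text)
  text.delete("\f")
end
```

For instance, `run_captured("pdftotext.exe", "-layout", "c:\\hello.pdf", "c:\\test.txt")` would return the warning text as the second element.
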
>>
>> On Tue, Aug 12, 2008 at 12:17 AM, juuser <[email protected]> wrote:
>>
>>>
>>> Try something like this, so that the only thing you need is
>>> pdftotext.exe.
>>>
>>> First, try it on your command line to see if that works correctly:
>>> pdftotext.exe -layout input.pdf output.txt
>>>
>>> or omit the -layout switch...
>>>
>>> Now, in Ruby just call the exe with:
>>>
>>> raise "failed!" unless system("pdftotext.exe -layout input.pdf output.txt")
>>>
>>> Now you can read the output from the file, for example:
>>> data = File.readlines("output.txt")
>>>
>>> This should solve your problem with the different Ruby versions.
>>>
>>> Hope this helps.
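
The steps above can be wrapped in one helper. This sketch passes the arguments to `system` separately, which avoids quoting problems with paths that contain spaces. The method name `pdf_to_lines` and its `exe` parameter are hypothetical, and it assumes pdftotext can be found on the PATH.

```ruby
# Converts a PDF to text with pdftotext and returns the text as an array of lines.
# Raises if the command cannot be started or exits with a non-zero status.
def pdf_to_lines(pdf_path, txt_path, exe = "pdftotext")
  ok = system(exe, "-layout", pdf_path, txt_path)
  raise "pdftotext failed for #{pdf_path}" unless ok
  File.readlines(txt_path)
end
```

On Windows the call might look like `pdf_to_lines("input.pdf", "output.txt", "pdftotext.exe")`.
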
>>> On Aug 9, 3:04 am, Vijay <[email protected]> wrote:
>>> > Thanks Sameh and Juuser for your valuable and crystal-clear replies.
>>> > The code, which you provided worked like a gem. Now, we are able to
>>> > read 'pdf' files using Ruby.
>>> >
>>> > However, there is a small obstacle. The application that we're
>>> > trying to automate with Watir has a lot of modal dialogs, so we
>>> > are using Ruby 1.8.2, which, according to the instructions at
>>> > "http://wtr.rubyforge.org", is the only version that supports these
>>> > dialogs. The 'pdf-toolkit' code works perfectly with Ruby 1.8.5 (the
>>> > latest but one version of Ruby), but it does not work with Ruby 1.8.2.
>>> > It throws an error along the lines of "undefined method gem" from one
>>> > of its internal files.
>>> >
>>> > So, I was wondering whether it is possible to change the value of the
>>> > 'Path' environment variable through a DOS command so that it points to
>>> > the Ruby 1.8.5 installation on the same computer, and then run this
>>> > 'pdf_read' program. Is this (having two versions of Ruby installed on
>>> > the computer and switching between them whenever needed) possible?
>>> > Or is there any other way to get round this?
>>> >
>>> > Thanks again, Vijay.
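
One way round this that does not require changing 'Path' at all is to start the second Ruby installation by its full path from the main script. A rough sketch, assuming both versions are installed side by side; the method name `run_with_ruby` and the install path in the comment are hypothetical.

```ruby
# Runs a script under a specific Ruby interpreter, given by its full path,
# without touching the PATH environment variable.
# Returns true when the script exits with status 0, false otherwise.
def run_with_ruby(ruby_exe, script, *args)
  system(ruby_exe, script, *args) ? true : false
end

# Hypothetical usage; the install path is an assumption:
# run_with_ruby("C:\\ruby185\\bin\\ruby.exe", "pdf_read.rb", "c:\\file.pdf")
```

The Watir script can keep running under its own Ruby version and only hand the PDF step to the other interpreter.
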
>>> >
>>> > On Aug 4, 2:26 am, Sameh <[email protected]> wrote:
>>> >
>>> > > Hey Vijay,
>>> > > The process is a little complicated and not so straightforward.
>>> >
>>> > > First you will need to download pdftk. Download these files and
>>> > > extract them into the C:\windows\system32 folder:
>>> > > http://www.accesspdf.com/article.php/20041130153545577
>>> >
>>> > > Secondly you will need to download and install xpdf:
>>> > > http://pdf-toolkit.rubyforge.org/
>>> > > Extract those files into the C:\windows\system32 folder as well.
>>> >
>>> > > Then you will need the PDF::Toolkit gem. This can be found here:
>>> > > http://rubyforge.org/projects/pdf-toolkit/
>>> >
>>> > > Basically this will convert the pdf to a textfile and you can do what
>>> > > you like with it. In the following example I have just read a file on
>>> > > my c:\ and displayed it using the 'puts' command.
>>> >
>>> > > require 'rubygems'
>>> > > require 'pdf/toolkit'
>>> >
>>> > > my_pdf = PDF::Toolkit.open("c:\\file.pdf")
>>> > > text = my_pdf.to_text.read
>>> > > puts text
>>> >
>>> > > I hope that helps a little.
>>> > > Cheers
>>> > > Sameh.
>>> >
>>> > > Vijay wrote:
>>> > > > Hello people,
>>> >
>>> > > > In our project, which we are trying to automate with Watir, we
>>> > > > need to read and check the contents of a 'PDF report' that comes
>>> > > > embedded in an 'IE' window like the following
>>> >
>>> > > > <HTML><HEAD></HEAD>
>>> >
>>> > > > <BODY leftMargin=0 topMargin=0 scroll=no><EMBED
>>> > > > src=http://192.1.2.24:10041/servlets/elite/shared/attachment/N/RECEIPT_08012008_014938.pdf
>>> > > > width="100%" height="100%" type=application/pdf fullscreen="yes"></BODY></HTML>
>>> >
>>> > > > Can the contents of this file be read in Ruby after saving it to
>>> > > > the hard drive? When we use a 'File.read' statement, Ruby outputs
>>> > > > some junk values like the following,
>>> >
>>> > > > %PDF-1.4
>>> > > > 1 0 obj <</Type /Catalog /Pages 2 0 R >>
>>> > > > endobj
>>> > > > 2 0 obj <</Type /Pages /Count 1 /Kids [3 0 R] /MediaBox [0 0 792 612]>>
>>> > > > endobj
>>> > > > 3 0 obj <</Type /Page /Parent 2 0 R /Resources 4 0 R /Contents 6 0 R>>
>>> > > > endobj
>>> > > > 4 0 obj <</ProcSet 5 0 R /Font 100 0 R>>......
>>> >
>>> > > > Thanks for your time,
>>> > > > Vijay.
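
What `File.read` prints there is the raw PDF file format rather than the report's text, which is why a converter is needed. As a small sanity check before converting, the sketch below reads the saved file in binary mode and looks for the `%PDF-` signature seen in that output. The method name `pdf_file?` is hypothetical.

```ruby
# Returns true when the file starts with the "%PDF-" signature.
# Binary mode ("rb") matters on Windows, where text mode can alter the bytes.
def pdf_file?(path)
  File.open(path, "rb") { |f| f.read(5) } == "%PDF-"
end
```

This only confirms that the download produced a PDF (and not, say, an HTML error page); it does not extract any text.
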
>>>
>>>
>>
--
Watir General group: http://groups.google.com/group/watir-general