Re: How to extract whole text from a PDF file with the PDF

2021-12-14 Thread Paul McClernan via use-livecode
Ah, OK, thanks for the clarification. I hadn’t realized PDFium had been around as long as it has. I wouldn’t sweat the naming conflict; there are at least three “PDFKit” libraries, so you’re not alone. On Tue, Dec 14, 2021 at 11:01 AM Paul Dupuis via use-livecode <use-livecode@lists.runrev.com>

Re: How to extract whole text from a PDF file with the PDF

2021-12-14 Thread Paul Dupuis via use-livecode
On 12/14/2021 10:33 AM, Paul McClernan via use-livecode wrote: > I was fairly certain that XPDF external was/is based on this XPDF: > https://en.m.wikipedia.org/wiki/Xpdf > Which has both GPL and Proprietary Licensing options available. Nope. My company (Researchware) and I paid for the development

Re: How to extract whole text from a PDF file with the PDF

2021-12-14 Thread Paul McClernan via use-livecode
I was fairly certain that the XPDF external was/is based on this Xpdf: https://en.m.wikipedia.org/wiki/Xpdf, which has both GPL and proprietary licensing options available. The newer (> 9.6.3) PDF Widget is based on PDFium, an offshoot of Google’s Chromium project. I’m

RE: How to extract whole text from a PDF file with the PDF widget?

2021-12-13 Thread Ralph DiMola via use-livecode
o the NumberOfPages of control "PDF1") Ralph DiMola IT Director Evergreen Information Services rdim...@evergreeninfo.net -Original Message- From: use-livecode [mailto:use-livecode-boun...@lists.runrev.com] On Behalf Of Paul Dupuis via use-livecode Sent: Sunday, December 12, 20

Re: How to extract whole text from a PDF file with the PDF

2021-12-13 Thread Richard Gaskin via use-livecode
Richmond wrote: > On 12.12.21 21:33, Richard Gaskin wrote: >> Stam Kapetanakis wrote: >> > i presume the pdf widget in pro is the opensource xpdfReader but >> > don’t know for sure. >> >> If it is that would be problematic, as the open source edition of >> xpdfReader is licensed under GPL, and

Re: How to extract whole text from a PDF file with the PDF widget?

2021-12-12 Thread Monte Goulding via use-livecode
Both the page and character index are clamped to the number of pages and characters on a page so you could set both to very high numbers. Adding character counts to the documentPages property might be useful here too. Cheers Monte > On 13 Dec 2021, at 11:17 am, Paul Dupuis via use-livecode >
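The clamping behaviour described in this message can be illustrated with a small model. This is a Python sketch rather than LiveCode, and it is not the widget's implementation: `clamp_position`, `text_through`, and the list-of-strings page model are hypothetical, and the actual value format of the hilitedRange property is not shown in this thread.

```python
def clamp(value, low, high):
    """Restrict value to the inclusive range low..high."""
    return max(low, min(value, high))


def clamp_position(pages, page, char):
    """Clamp a 1-based (page, char) position to the document's bounds.

    Assumes the document has at least one page (hypothetical model).
    """
    page = clamp(page, 1, len(pages))
    char = clamp(char, 1, len(pages[page - 1]))
    return page, char


def text_through(pages, end_page, end_char):
    """Return the text from the start of the document through the
    clamped end position, with pages joined by newlines."""
    end_page, end_char = clamp_position(pages, end_page, end_char)
    whole_pages = pages[:end_page - 1]
    last_page = pages[end_page - 1][:end_char]
    return "\n".join(whole_pages + [last_page])


# Made-up document: one string per page.
pages = ["alpha", "beta gamma", "delta"]
HUGE = 10**9  # "very high number" for both indices

print(clamp_position(pages, HUGE, HUGE))  # (3, 5)
print(text_through(pages, HUGE, HUGE))
```

Because out-of-range indices collapse to the last page and last character, the caller does not need to know the page or character counts in advance to select everything.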

Re: How to extract whole text from a PDF file with the PDF widget?

2021-12-12 Thread Paul Dupuis via use-livecode
Thank you Monte, We've just started to make a map from XPDF APIs to the PDF Widget APIs, so I'll make sure that gets done soon and add any missing capabilities as requests to the LC Quality Center. With regard to the hilitedRange and hilitedRangeText properties, can you just advise on the

Re: How to extract whole text from a PDF file with the PDF widget?

2021-12-12 Thread Monte Goulding via use-livecode
Hi Folks Currently you can extract text in the widget by setting the hilitedRange and getting the hilitedRangeText. It wouldn’t be that hard to add extracted text to the documentPages property. The PDF widget was built to meet the requirements for a client rather than to match the features of

Re: How to extract whole text from a PDF file with the PDF

2021-12-12 Thread Paul Dupuis via use-livecode
On 12/12/2021 8:59 AM, Stam Kapetanakis via use-livecode wrote: Hi Torsten, i presume the pdf widget in pro is the opensource xpdfReader but don’t know for sure. It is not xpdfReader. The XPDF External AND the PDF Widget with LiveCode are based on the Google PDFium library. The first is C++

Re: How to extract whole text from a PDF file with the PDF

2021-12-12 Thread Richmond via use-livecode
The consequences are endless. On 12.12.21 21:33, Richard Gaskin via use-livecode wrote: Stam Kapetanakis wrote: > i presume the pdf widget in pro is the opensource xpdfReader but > don’t know for sure. If it is that would be problematic, as the open source edition of xpdfReader is licensed

Re: How to extract whole text from a PDF file with the PDF

2021-12-12 Thread Richard Gaskin via use-livecode
Stam Kapetanakis wrote: > i presume the pdf widget in pro is the opensource xpdfReader but > don’t know for sure. If it is that would be problematic, as the open source edition of xpdfReader is licensed under GPL, and LC no longer has an edition compatible with GPL. -- Richard Gaskin

Re: How to extract whole text from a PDF file with the PDF

2021-12-12 Thread Stam Kapetanakis via use-livecode
Hi Torsten, I presume the PDF widget in Pro is the open-source xpdfReader but don’t know for sure. I did post on how to extract text from PDF using the free xpdfReader and non-Pro LC: https://forums.livecode.com/viewtopic.php?f=8&t=35280&p=201036&hilit=Xpdfreader#p201036 I presume that with Pro this

Re: How to extract whole text from a PDF file with the PDF widget?

2021-12-11 Thread Paul Dupuis via use-livecode
I suspect it is for backward compatibility. When I turned over the XPDF external to LiveCode, I asked that they maintain it for a couple of years. I had expected we'd migrate our apps to the PDF widget by then, but business factors mean we're only now just starting a migration. That's why I

Re: How to extract whole text from a PDF file with the PDF widget?

2021-12-11 Thread matthias rebbe via use-livecode
Ah, I thought you were referring only to XPDF. By the way, do you have any idea why both the XPDF external and the PDF widget are maintained? Wouldn't it make sense to have only one PDF solution included? Or am I missing something? Regards, Matthias > On 11.12.2021 at 02:01, Paul Dupuis via

Re: How to extract whole text from a PDF file with the PDF widget?

2021-12-10 Thread Paul Dupuis via use-livecode
Yes, I am familiar with the XPDF external (based on Google's PDFium library), having designed it and paid Monte to code it and then turned it over to LiveCode. I was referring to the PDF Widget (also based on Google's PDFium), which should have a comparable property for fetching the text of a

Re: How to extract whole text from a PDF file with the PDF widget?

2021-12-10 Thread matthias rebbe via use-livecode
Paul, here on macOS the dictionary of LC 10 DP1 definitely lists the function XPDFViewer_Text(viewerName, pageNumber). By the way, checking this showed me that this function seems to be deprecated and instead the command XPDFViewer_Unicode viewerName, pageNumber, variableName should be

Re: How to extract whole text from a PDF file with the PDF widget?

2021-12-10 Thread Paul Dupuis via use-livecode
There must be an undocumented property for the text of a page - there was a function to return the full text of a page in the External (XPDF) and to get the full text of the PDF file, you just stepped through the pages (1..N) getting and concatenating the page text. Monte? LC 10.0.0
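The page-by-page approach described in this message (step through pages 1..N, fetching and concatenating each page's text) can be sketched generically. The sketch is in Python rather than LiveCode, and `page_text` is a hypothetical stand-in for whatever per-page text call is available; nothing here is the actual API of the external or the widget.

```python
from typing import Callable


def extract_all_text(page_count: int,
                     page_text: Callable[[int], str],
                     separator: str = "\n") -> str:
    """Concatenate the text of pages 1..page_count.

    page_text is a hypothetical callback that returns the text of a
    single page given its 1-based page number.
    """
    parts = []
    for page_number in range(1, page_count + 1):
        parts.append(page_text(page_number))
    return separator.join(parts)


# Made-up pages standing in for a real PDF.
fake_pages = {1: "First page.", 2: "Second page.", 3: "Third page."}
full_text = extract_all_text(len(fake_pages), lambda n: fake_pages[n])
print(full_text)
```

Passing the per-page call in as a parameter keeps the loop independent of which PDF component supplies the text.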

Re: How to extract whole text from a PDF file with the PDF widget?

2021-12-10 Thread matthias rebbe via use-livecode
Hi Torsten, I think the PDF widget does not support extracting text by code. At least the documentation does not show any information about this. You wrote that you have a business license. That would mean that you can use the Pro features of LiveCode. There is an external included in the

How to extract whole text from a PDF file with the PDF widget?

2021-12-10 Thread Torsten Holmer via use-livecode
Hi, I have a PDF file with text and pictures, but I just want the text. I can do it manually with Ctrl-A and Ctrl-Copy by viewing the file with Preview on macOS. I have a business licence and want to use the PDF widget but I cannot find a way to do it. Can someone help me out? Cheers,