Re: [gentoo-user] Trying to automate HTML --- pdf
On Sunday 27 January 2008, [EMAIL PROTECTED] wrote:

> Oh geez, I LOVE it! I will play with it, it just might do the trick.
> It's sure not what I had been expecting, but if it works reliably, it
> is just the ticket.

Java applets and Flash animations could cause problems, since they might need a few seconds to initialize even after the page is fully loaded (and thus after the stop button has already become inactive). Of course, if the pages you load don't use Java or Flash, this is not a problem, but there might be other pitfalls. For example, I've noticed that konqueror loads some complex pages in two or more stages, with a brief pause (during which the stop button is inactive) between one stage and the next. You can check that by running something like

    while true; do dcop konqueror-8364 'konqueror-mainwindow#1' actionIsEnabled stop; done

i.e., continuously checking the status of the stop button, and you'll see something like

    ... true true true true true true false false false false false false false false false false true true true true true true true ...

The status flips from true to false and back to true before the page is fully loaded, and only eventually settles to false. So, if the script runs the test during the short false interval, it might be fooled into thinking that the page has loaded. I have not investigated the cause of this behavior further (perhaps multiple-frame pages?), but these few facts alone should be enough to warrant extra attention and thorough testing before using the kludge.

> Sheesh. A bloomin' genius is what you are :-)

Thanks, glad you have at least a slightly better solution than before!

--
gentoo-user@lists.gentoo.org mailing list
Re: [gentoo-user] Trying to automate HTML --- pdf
On Mon, Jan 28, 2008 at 04:01:12PM +0100, Etaoin Shrdlu wrote:

> You can check that by running something like
>
>     while true; do dcop konqueror-8364 'konqueror-mainwindow#1' actionIsEnabled stop; done

That will bear protecting against. I have the basic program working, but it does need some fine tuning, and I will make it insist that the status stays good for a couple of seconds afterwards. One of the screwy things is that the widget names change as URLs are loaded. Mostly I add a lot of checks and put out an alert for manual intervention when something off happens.

So far none of the web sites use Flash or Java applets. That would be a real mess for printing alone.

--
... _._. ._ ._. . _._. ._. ___ .__ ._. . .__. ._ .. ._.
Felix Finch: scarecrow repairman rocket surgeon / [EMAIL PROTECTED]
GPG = E987 4493 C860 246C 3B1E 6477 7838 76E9 182E 8151 ITAR license #4933
I've found a solution to Fermat's Last Theorem but I see I've run out of room o
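The "stay good for a couple of seconds" idea could be sketched roughly as below. This is only an illustration, not the program described in the message: the function names, the NEEDED and INTERVAL parameters, and their defaults are made up for the example. The only dcop call used is the actionIsEnabled query already shown in this thread.

```shell
#!/bin/sh
# Sketch only: wait until the stop action has stayed disabled for several
# consecutive polls. Function and parameter names are illustrative.

# Prints "true" while konqueror reports the stop button as clickable.
stop_enabled() {
    dcop "$1" "$2" actionIsEnabled stop
}

# wait_until_settled APP WINDOW [NEEDED] [INTERVAL]
# Returns once NEEDED polls in a row have reported "false".
wait_until_settled() {
    app=$1
    window=$2
    needed=${3:-3}      # consecutive "false" readings required
    interval=${4:-1}    # seconds between polls
    quiet=0
    while [ "$quiet" -lt "$needed" ]; do
        if [ "$(stop_enabled "$app" "$window")" = false ]; then
            quiet=$((quiet + 1))
        else
            quiet=0     # still loading, or a new stage started: start over
        fi
        sleep "$interval"
    done
}

# Example (not run here):
#   wait_until_settled konqueror-8364 'konqueror-mainwindow#1'
```

Resetting the counter on every reading that is not "false" means a short false interval between two loading stages does not count as loaded. It does not help with applets that start after the button has gone quiet for good.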
[gentoo-user] Trying to automate HTML --- pdf
I am trying to automate converting a URL into a pdf file. These web pages include javascript and fancy formatting, so the simple-minded converters just don't cut it.

My next plan was to hack up a real browser so it would take two command line args, the URL and the print file, render the page, print it to the pdf file, and exit. From what I know of some of them, they would have to be configured in advance, and invocation would have to be strictly controlled so only one instance runs at a time, at least per user. I could probably create several firefox user sessions and have each of them running simultaneously, but multiple real users works for me too.

Firefox doesn't print to pdf, however. But konqueror does. By using the DCOP interface, I can even pass it commands to load a URL and print the page, although I have to settle for the configured print file name. But since I have to run individual sessions anyway, that's no big deal. The commands look like this:

    dcop konqueror-6352 'konqueror-mainwindow#1' openURL 'http://slashdot.org'
    dcop konqueror-6352 html-widget2 print true

There's a bit more than that, since widget names change, but a simple perl program handles it easily (so far!).

However, there's a problem. The openURL command returns without waiting for the web page to finish loading, and the print command does not wait for it to finish loading either. The print command does wait for printing to finish before returning, which is nice. This means I have to put in some arbitrary sleep of 30 seconds or so between openURL and print to have a good chance of a complete printed page, and even then, there is no guarantee it actually will be complete.

We have to send these pdf files to a bank, and it would not be good to send them incomplete pages, even if only one out of 100 or even 1000. There will be at least hundreds of these every day.

I started to look at sources but there is no konqueror-3.5.8.tar.gz or anything similar. No doubt most of the code is handled by Qt widgets and KDE libs.

Here are my questions:

0. Is there a better place to ask this? I tried a KDE mailing list and got no responses; there weren't even many views.

1. Is there either a DCOP command to wait for a URL to be loaded or a DCOP command like openURL which waits?

2. Is there a source file for konqueror which I could hack to take command line parameters without changing libraries or other code which would affect the rest of KDE? I don't have any problem with a hacked and renamed konqueror command.

3. Is there some other way of converting complicated web pages into pdf? If they don't understand javascript and style sheets and everything else that a real browser does, they are useless to me.

4. Are there other ways to do this that I haven't thought of?

--
... _._. ._ ._. . _._. ._. ___ .__ ._. . .__. ._ .. ._.
Felix Finch: scarecrow repairman rocket surgeon / [EMAIL PROTECTED]
GPG = E987 4493 C860 246C 3B1E 6477 7838 76E9 182E 8151 ITAR license #4933
I've found a solution to Fermat's Last Theorem but I see I've run out of room o
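Since the application id (konqueror-6352) and the widget names (html-widget2) change between runs, they could be looked up rather than hard-coded. The sketch below is in shell rather than the perl program mentioned above, and it rests on an assumption worth checking on the actual system: that `dcop` with no arguments lists the registered applications and `dcop APP` lists that application's objects, one per line. The function names are invented for the example.

```shell
#!/bin/sh
# Sketch only: look up the current konqueror DCOP names instead of
# hard-coding them. Assumes `dcop` alone lists registered applications and
# `dcop APP` lists that application's objects, one name per line.

# Print the first registered application whose name starts with "konqueror-".
find_konqueror() {
    dcop | grep '^konqueror-' | head -n 1
}

# Print the HTML widget objects of application $1 (names like html-widget2).
find_html_widgets() {
    dcop "$1" | grep '^html-widget'
}

# Example (not run here):
#   app=$(find_konqueror)
#   find_html_widgets "$app"
```

With several konqueror instances running under one user, taking the first match is not enough; starting each instance separately and recording its process id would be a safer way to pick the right one.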
Re: [gentoo-user] Trying to automate HTML --- pdf
On Sun, 27 Jan 2008 09:06:15 -0800, [EMAIL PROTECTED] wrote:

> 1. Is there either a DCOP command to wait for a URL to be loaded or a
> DCOP command like openURL which waits?

I can't see one, but it sounds like it would be useful enough to file a bug report requesting one. A DCOP command to tell whether the page has finished loading would be suitable.

> 2. Is there a source file for konqueror which I could hack to take
> command line parameters without changing libraries or other code
> which would affect the rest of KDE? I don't have any problem with a
> hacked and renamed konqueror command.

Konqueror is part of kdebase, so you'll find the source somewhere in there.

--
Neil Bothwick

Where the system is concerned, you're not allowed to ask `Why?'
Re: [gentoo-user] Trying to automate HTML --- pdf
On Sunday 27 January 2008, [EMAIL PROTECTED] wrote:

> dcop konqueror-6352 'konqueror-mainwindow#1' openURL 'http://slashdot.org'
> dcop konqueror-6352 html-widget2 print true
>
> There's a bit more than that, since widget names change, but a simple
> perl program handles it easily (so far!).
>
> However, there's a problem. The openURL command returns without
> waiting for the web page to finish loading, and the print command
> does not wait for it to finish loading. The print command does wait
> for printing to finish before returning, which is nice.
> [cut]
> 1. Is there either a DCOP command to wait for a URL to be loaded or a
> DCOP command like openURL which waits?

I know of no direct method, and I can't answer your other questions either. However, the following (admittedly *really* kludgy and quick-and-dirty) method *seems* to work:

    dcop konqueror-6352 'konqueror-mainwindow#1' openURL 'http://my.url'
    while true; do
        # check if the stop button is clickable
        stat=$(dcop konqueror-6352 'konqueror-mainwindow#1' actionIsEnabled stop)
        if [ "$stat" = true ]; then
            # stop button is active, so page is still loading
            sleep 5
        else
            # stop button is not active, page has loaded
            break
        fi
    done
    # do what you want here

As I said above, I did some tests and this seems to work. However, I'm not claiming that it's the solution to your problem, nor that it will always work as expected. Therefore, I strongly suggest you test it thoroughly before using it.

Hope that helped.
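For unattended use, the same poll could be given an upper bound so that a page which never finishes loading does not block the whole batch. What follows is a hedged variation on the loop above, not part of the original suggestion: the wait_for_load name, the TIMEOUT and POLL parameters, and their defaults are invented for the example. Unlike the loop above, it counts only an explicit "false" as loaded, so an empty answer (for instance if konqueror has gone away) ends in a timeout rather than a print.

```shell
#!/bin/sh
# Sketch only: the stop-button poll with an upper bound on the wait.
# TIMEOUT and POLL are illustrative parameters, not konqueror settings.

# wait_for_load APP WINDOW [TIMEOUT] [POLL]
# Returns 0 once the stop action is reported disabled, 1 on timeout.
wait_for_load() {
    app=$1
    window=$2
    timeout=${3:-120}   # give up after roughly this many seconds
    poll=${4:-5}        # seconds between checks, as in the loop above
    waited=0
    while true; do
        status=$(dcop "$app" "$window" actionIsEnabled stop 2>/dev/null)
        if [ "$status" = false ]; then
            return 0    # stop button is not active, page has loaded
        fi
        if [ "$waited" -ge "$timeout" ]; then
            return 1    # still loading (or no answer) at the deadline
        fi
        sleep "$poll"
        waited=$((waited + poll))
    done
}

# Example (not run here):
#   if wait_for_load konqueror-6352 'konqueror-mainwindow#1'; then
#       dcop konqueror-6352 html-widget2 print true
#   else
#       echo "page did not finish loading, needs a manual look" >&2
#   fi
```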
Re: [gentoo-user] Trying to automate HTML --- pdf
On Sun, Jan 27, 2008 at 06:56:59PM +0100, Etaoin Shrdlu wrote:

> However, the following (admittedly *really* kludgy and quick-and-dirty)
> method *seems* to work:

Oh geez, I LOVE it! I will play with it, it just might do the trick. It's sure not what I had been expecting, but if it works reliably, it is just the ticket.

Sheesh. A bloomin' genius is what you are :-)

--
... _._. ._ ._. . _._. ._. ___ .__ ._. . .__. ._ .. ._.
Felix Finch: scarecrow repairman rocket surgeon / [EMAIL PROTECTED]
GPG = E987 4493 C860 246C 3B1E 6477 7838 76E9 182E 8151 ITAR license #4933
I've found a solution to Fermat's Last Theorem but I see I've run out of room o