Re: Web page source using wget?
Thanks Hrvoje, using

    http://.../InventoryStatus.asp?cboSupplier=4541-134289&status=all&action-select=Query

in IE worked like a charm. I didn't have to follow links. I am now trying to automate this using wget 1.8.2 (Windows). There are two steps involved:

1). Log in to the customer's web site. I was able to create the following link after I looked at the form section in the source, as explained to me earlier by Hrvoje:

    wget http://customer.website.com?UserAccount=USER&AccessCode=PASSWORD&Locale=English (United States)&TimeZone=(GMT-5:00) Eastern Standard Time (USA & Canada)&action-Submit=Login

2). Execute:

    wget http://customer.website.com/InventoryStatus.asp?cboSupplier=4541-134289&status=all&action-select=Query

I tried different ways to get this working, but so far have been unsuccessful. Any ideas?

Thanks,
Suhas

----- Original Message -----
From: Hrvoje Niksic [EMAIL PROTECTED]
To: Suhas Tembe [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Tuesday, October 07, 2003 6:12 PM
Subject: Re: Web page source using wget?

Suhas Tembe [EMAIL PROTECTED] writes:

> It does look a little complicated... This is how it looks:
>
> <form action="InventoryStatus.asp" method="post">
> [...]
> <select name="cboSupplier">
> <option value="4541-134289">454A</option>
> <option value="4542-134289" selected>454B</option>
> </select>

Those are the important parts. It's not hard to submit this form. With Wget 1.9, you can even use the POST method, e.g.:

    wget http://.../InventoryStatus.asp --post-data \
        'cboSupplier=4541-134289&status=all&action-select=Query' \
        -O InventoryStatus1.asp

    wget http://.../InventoryStatus.asp --post-data \
        'cboSupplier=4542-134289&status=all&action-select=Query' \
        -O InventoryStatus2.asp

It might even work to simply use GET, and retrieve

    http://.../InventoryStatus.asp?cboSupplier=4541-134289&status=all&action-select=Query

without the need for `--post-data' or `-O', but that depends on the ASP script that does the processing. The harder part is to automate this process for *any* values in the drop-down list.
You might need to use an intermediary Perl script that extracts all the `<option value=...>` entries from the HTML source of the page with the drop-down. Then, from the output of the Perl script, you call Wget as shown above. It's doable, but it takes some work. Unfortunately, I don't know of a (command-line) tool that would make this easier.
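The extract-then-loop approach Hrvoje describes above can also be sketched with standard shell tools instead of Perl. This is only a sketch under assumptions: `dropdown.html` stands in for a saved copy of the page (its content is modeled on the form quoted in this thread), the host name is the placeholder used throughout the thread, and the real wget call is left commented out.

```shell
# Stand-in for the saved page that contains the drop-down.
cat > dropdown.html <<'EOF'
<select name="cboSupplier">
<option value="4541-134289">454A</option>
<option value="4542-134289" selected>454B</option>
</select>
EOF

# Pull every <option value="..."> out of the saved HTML.
suppliers=$(sed -n 's/.*<option value="\([^"]*\)".*/\1/p' dropdown.html)

# One fetch per extracted value.
for s in $suppliers; do
    # Real invocation would look like:
    # wget "http://customer.website.com/InventoryStatus.asp?cboSupplier=$s&status=all&action-select=Query" \
    #      -O "InventoryStatus-$s.html"
    echo "would fetch cboSupplier=$s"
done
```

The same idea scales to any number of drop-down entries, since the values are read from the page rather than hard-coded.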
Re: Web page source using wget?
I tried, but it doesn't seem to have worked. This is what I did:

    wget --save-cookies=cookies.txt http://customer.website.com?UserAccount=USER&AccessCode=PASSWORD&Locale=English (United States)&TimeZone=(GMT-5:00) Eastern Standard Time (USA & Canada)&action-Submit=Login

    wget --load-cookies=cookies.txt http://customer.website.com/supplyweb/smi/inventorystatus.asp?cboSupplier=4541-134289&status=all&action-select=Query --http-user=4542-134289

After executing the above two lines, it creates two files:

1). [EMAIL PROTECTED]: I can see that this file contains a message (among other things): "Your session has expired due to a period of inactivity."
2). [EMAIL PROTECTED]

Thanks,
Suhas

----- Original Message -----
From: Hrvoje Niksic [EMAIL PROTECTED]
To: Suhas Tembe [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Monday, October 13, 2003 11:37 AM
Subject: Re: Web page source using wget?

Suhas Tembe [EMAIL PROTECTED] writes:

> There are two steps involved:
> 1). Log in to the customer's web site. I was able to create the following link after I looked at the form section in the source, as explained to me earlier by Hrvoje:
>
>     wget http://customer.website.com?UserAccount=USER&AccessCode=PASSWORD&Locale=English (United States)&TimeZone=(GMT-5:00) Eastern Standard Time (USA & Canada)&action-Submit=Login

Did you add --save-cookies=FILE? By default Wget will use cookies, but will not save them to an external file, and they will therefore be lost.

> 2). Execute:
>
>     wget http://customer.website.com/InventoryStatus.asp?cboSupplier=4541-134289&status=all&action-select=Query

For this step, add --load-cookies=FILE, where FILE is the same file you specified to --save-cookies above.
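One likely problem with the two commands above, independent of cookies: the URLs contain `&`, spaces, and parentheses, all of which the shell (or cmd.exe) will mangle unless the URL is quoted, and spaces are not legal in a URL in any case. A hedged sketch of the same two URLs, quoted and percent-encoded (host, account, and access code are the placeholders from this thread; the actual wget invocations are commented out):

```shell
# Quote the whole URL so "&" does not end the command; %20/%28/%29/%26
# encode space, "(", ")", and a literal "&" inside a value.
LOGIN_URL='http://customer.website.com?UserAccount=USER&AccessCode=PASSWORD&Locale=English%20%28United%20States%29&TimeZone=%28GMT-5:00%29%20Eastern%20Standard%20Time%20%28USA%20%26%20Canada%29&action-Submit=Login'
QUERY_URL='http://customer.website.com/supplyweb/smi/inventorystatus.asp?cboSupplier=4541-134289&status=all&action-select=Query'

# Real invocations would be:
# wget --save-cookies=cookies.txt "$LOGIN_URL"
# wget --load-cookies=cookies.txt "$QUERY_URL"
echo "$QUERY_URL"
```

Without the quotes, the shell treats each `&` as a command separator, so the server never sees most of the parameters — which alone can produce a login page or an expired-session message.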
Re: Web page source using wget?
A slight correction: the first wget should read:

    wget --save-cookies=cookies.txt http://customer.website.com/supplyweb/general/default.asp?UserAccount=USER&AccessCode=PASSWORD&Locale=en-us&TimeZone=EST:-300&action-Submit=Login

I tried this link in IE, but it comes back to the same login screen. No error messages are displayed at this point. Am I missing something? I have attached the source for the login page.

Thanks,
Suhas

----- Original Message -----
From: Suhas Tembe [EMAIL PROTECTED]
To: Hrvoje Niksic [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Monday, October 13, 2003 11:53 AM
Subject: Re: Web page source using wget?

> I tried, but it doesn't seem to have worked. This is what I did:
>
>     wget --save-cookies=cookies.txt http://customer.website.com?UserAccount=USER&AccessCode=PASSWORD&Locale=English (United States)&TimeZone=(GMT-5:00) Eastern Standard Time (USA & Canada)&action-Submit=Login
>
>     wget --load-cookies=cookies.txt http://customer.website.com/supplyweb/smi/inventorystatus.asp?cboSupplier=4541-134289&status=all&action-select=Query --http-user=4542-134289
>
> After executing the above two lines, it creates two files:
>
> 1). [EMAIL PROTECTED]: I can see that this file contains a message (among other things): "Your session has expired due to a period of inactivity."
> 2). [EMAIL PROTECTED]
>
> Thanks,
> Suhas

[snip earlier quoted messages]
Re: Web page source using wget?
So, is there a way I can get to the page I want after logging into a secure server using wget? Can I keep the SSL connection open for the second retrieval to work?

The other thing I noticed is that the first URL (to log in) does not seem to work, because when I use that same URL in IE, it brings me back to the login screen (see attached source of the login page). I don't get logged in. I am not quite sure if it is the URL that is incorrect or it is something else.

Thanks,
Suhas

----- Original Message -----
From: Jens Rösner [EMAIL PROTECTED]
To: Suhas Tembe [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Monday, October 13, 2003 12:51 PM
Subject: Re: Web page source using wget?

Hi Suhas!

Well, I am by no means an expert, but I think that wget closes the connection after the first retrieval. The SSL server realizes this and decides that wget has no right to log in for the second retrieval, even though the cookie is there. I think that is correct behaviour for a secure server, isn't it? Does this make sense?

Jens

> A slight correction: the first wget should read:
>
>     wget --save-cookies=cookies.txt http://customer.website.com/supplyweb/general/default.asp?UserAccount=USER&AccessCode=PASSWORD&Locale=en-us&TimeZone=EST:-300&action-Submit=Login
>
> I tried this link in IE, but it comes back to the same login screen. No error messages are displayed at this point. Am I missing something? I have attached the source for the login page.
>
> Thanks,
> Suhas

[snip earlier quoted messages]
[Attachment: SupplyWEB login page source — an HTML page ("SupplyWEB Login", ISO-8859-1) containing a JavaScript block that defines locale symbols (date/time separators and formats) and field-validation helpers — setIcon(), login_UserAccount_validate(), login_AccessCode_validate(), login_Locale_validate() — which toggle the required/error icons next to the login form fields. Truncated in the archive.]
Re: Web page source using wget?
Cookies.txt looks like this:

    # HTTP cookie file.
    # Generated by Wget on 2003-10-13 13:19:26.
    # Edit at your own risk.

There is nothing after the 3rd line, so it doesn't look like a valid cookie file.

----- Original Message -----
From: Hrvoje Niksic [EMAIL PROTECTED]
To: Suhas Tembe [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Monday, October 13, 2003 12:57 PM
Subject: Re: Web page source using wget?

Suhas Tembe [EMAIL PROTECTED] writes:

> I tried, but it doesn't seem to have worked. This is what I did:
>
>     wget --save-cookies=cookies.txt http://customer.website.com?UserAccount=USER&AccessCode=PASSWORD&Locale=English (United States)&TimeZone=(GMT-5:00) Eastern Standard Time (USA & Canada)&action-Submit=Login

Hopefully you used quotes to protect the spaces in URLs from the shell? After the first command, does `cookies.txt' contain what looks like a valid cookie?
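A header-only cookies.txt like the one above typically means the server issued only a session cookie (one with no expiry time), which Wget discards at exit rather than saving. Later Wget releases added a `--keep-session-cookies` option for exactly this case — whether a given build supports it is an assumption to check with `wget --help`. A small sketch of the diagnosis, using a stand-in copy of the file shown above:

```shell
# Recreate the header-only cookies.txt quoted above.
cat > cookies.txt <<'EOF'
# HTTP cookie file.
# Generated by Wget on 2003-10-13 13:19:26.
# Edit at your own risk.
EOF

# Count actual cookie entries (lines that do not start with "#").
entries=$(grep -c '^[^#]' cookies.txt || true)
echo "cookie entries: $entries"

# Zero entries suggests a session cookie was dropped. If the build
# supports it, retry the login with:
# wget --save-cookies=cookies.txt --keep-session-cookies "LOGIN-URL"
```

If the count stays at zero even with session cookies kept, the login request itself is failing (as the "session has expired" page suggests), and the URL or its quoting is the next thing to check.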
Error: wget for Windows.
I am trying to use wget for Windows and get this message:

    The ordinal 508 could not be located in the dynamic link library LIBEAY32.dll.

This is the command I am using:

    wget http://www.website.com --http-user=username --http-passwd=password

I have the LIBEAY32.dll file in the same folder as wget. What could be wrong?

Thanks in advance.
Suhas
Re: Web page source using wget?
Thanks everyone for the replies so far. The problem I am having is that the customer is using ASP & JavaScript. The URL stays the same as I click through the links. So, using `wget URL' for the page I want may not work (I may be wrong). Any suggestions on how I can tackle this?

Thanks,
Suhas

----- Original Message -----
From: Hrvoje Niksic [EMAIL PROTECTED]
To: Suhas Tembe [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Monday, October 06, 2003 5:19 PM
Subject: Re: Web page source using wget?

Suhas Tembe [EMAIL PROTECTED] writes:

> Hello Everyone,
>
> I am new to this wget utility, so pardon my ignorance. Here is a brief explanation of what I am currently doing:
>
> 1). I go to our customer's website every day & log in using a User Name & Password.
> 2). I click on 3 links before I get to the page I want.
> 3). I right-click on the page & choose "view source". It opens it up in Notepad.
> 4). I save the source to a file & subsequently perform various tasks on that file.
>
> As you can see, it is a manual process. What I would like to do is automate this process of obtaining the source of a page using wget. Is this possible? Maybe you can give me some suggestions.

It's possible; in fact, it's what Wget does in its most basic form. Disregarding authentication, the recipe would be:

1) Write down the URL.
2) Type `wget URL' and you get the source of the page in a file named SOMETHING.html, where SOMETHING is the file name that the URL ends with.

Of course, you will also have to specify the credentials to the page, and Tony explained how to do that.
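Hrvoje's two-step recipe above can be sketched as follows. The host name is the placeholder used in this thread, the credentials are assumptions, and the actual fetch is left commented out; the point is only how the output file gets its name.

```shell
# Step 1: write down the URL of the page you want.
URL='http://customer.website.com/InventoryStatus.asp'

# Step 2: `wget URL` saves the page source under the name the URL ends
# with; basic-auth credentials go in --http-user / --http-passwd.
# wget --http-user=USER --http-passwd=PASSWORD "$URL"

# The saved file is named after the last URL component:
OUT=${URL##*/}
echo "$OUT"
```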
Re: Web page source using wget?
Got it! Thanks! So far so good. After logging in, I was able to get to the page I am interested in.

There was one thing that I forgot to mention in my earlier posts (I apologize)... this page contains a drop-down list of our customer's locations. At present, I choose one location from the drop-down list & click submit to get the data, which is displayed in a report format. I right-click, choose "view source", & save the source to a file. I then choose the next location from the drop-down list & click submit again. I again do a "view source" & save the source to another file, and so on for all their locations.

I am not quite sure how to automate this process. How can I do this non-interactively, especially the submit portion of the page? Is this possible using wget?

Thanks,
Suhas

----- Original Message -----
From: Hrvoje Niksic [EMAIL PROTECTED]
To: Suhas Tembe [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Tuesday, October 07, 2003 5:02 PM
Subject: Re: Web page source using wget?

Suhas Tembe [EMAIL PROTECTED] writes:

> Thanks everyone for the replies so far. The problem I am having is that the customer is using ASP & JavaScript. The URL stays the same as I click through the links.

The URL staying the same is usually a sign of the use of frames, not of ASP and JavaScript. Instead of looking at the URL entry field, try using "copy link to clipboard" instead of clicking on the last link. Then use Wget on that.
Re: Web page source using wget?
It does look a little complicated... This is how it looks:

    <form action="InventoryStatus.asp" method="post" name="select" onsubmit="return select_validate();" style="margin:0">
    <div style="margin-top:10px">
    <table border="1" bordercolor="#d9d9d9" bordercolordark="#ff" bordercolorlight="#d9d9d9" cellpadding="3" cellspacing="0" width="100%">
    <tr>
    <td style="font-weight:bold;color:black;background-color:#CC;text-align:right" width="20%"><nobr>Supplier&nbsp;</nobr></td>
    <td style="color:black;background-color:#F0;text-align:left" colspan="2"><nobr><select name="cboSupplier"><option value="4541-134289">454A</option>
    <option value="4542-134289" selected>454B</option></select>
    <img id="cboSupplier_icon" name="cboSupplier_icon" src="../images/required.gif" alt="*"></nobr></td>
    </tr>
    <tr>
    <td style="font-weight:bold;color:black;background-color:#CC;text-align:right" width="20%"><nobr>Quantity Status&nbsp;</nobr></td>
    <td style="color:black;background-color:#F0;text-align:left" colspan="2">
    <table border="0" cellpadding="0" cellspacing="0"><tr><td><table border="0"><tr>
    <td width="1"><input id="choice_IDAMCB3B" name="status" type="radio" value="over"></td>
    <td style="color:black;background-color:#F0;text-align:left"><span onclick="choice_IDAMCB3B.checked=true;">Over</span></td>
    <td width="1"><input id="choice_IDARCB3B" name="status" type="radio" value="under"></td>
    <td style="color:black;background-color:#F0;text-align:left"><span onclick="choice_IDARCB3B.checked=true;">Under</span></td>
    <td width="1"><input id="choice_IDAWCB3B" name="status" type="radio" value="both"></td>
    <td style="color:black;background-color:#F0;text-align:left"><span onclick="choice_IDAWCB3B.checked=true;">Both</span></td>
    <td width="1"><input id="choice_IDA1CB3B" name="status" type="radio" value="all" checked></td>
    <td style="color:black;background-color:#F0;text-align:left"><span onclick="choice_IDA1CB3B.checked=true;">All</span></td>
    </tr></table></td>
    <td><img id="status_icon" name="status_icon" src="../images/blank.gif" alt=""></td>
    </tr></table>
    </td>
    </tr>
    <tr>
    <td style="font-weight:bold;color:black;background-color:#CC">&nbsp;</td>
    <td colspan="2" style="font-weight:bold;color:black;background-color:#CC;text-align:left"><input type="submit" name="action-select" value="Query" onclick="doValidate = true;"></td>
    </tr>
    </table>
    </div>
    </form>

I don't see any specific URL that would get the relevant data after I hit submit. Maybe I am missing something...

Thanks,
Suhas

----- Original Message -----
From: Hrvoje Niksic [EMAIL PROTECTED]
To: Suhas Tembe [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Tuesday, October 07, 2003 5:24 PM
Subject: Re: Web page source using wget?

Suhas Tembe [EMAIL PROTECTED] writes:

> this page contains a drop-down list of our customer's locations. At present, I choose one location from the drop-down list & click submit to get the data, which is displayed in a report format. I right-click, choose "view source", & save the source to a file. I then choose the next location from the drop-down list & click submit again. I again do a "view source" & save the source to another file, and so on for all their locations.

It's possible to automate this, but it requires some knowledge of HTML. Basically, you need to look at the <form>...</form> part of the page and find the <select> tag that defines the drop-down. Assuming that the form looks like this:

    <form action="http://foo.com/customer" method="GET">
    <select name="location">
    <option value="ca">California</option>
    <option value="ma">Massachusetts</option>
    ...
    </select>
    </form>

you'd automate getting the locations by doing something like:

    for loc in ca ma ...
    do
        wget "http://foo.com/customer?location=$loc"
    done

Wget will save the respective sources in files named customer?location=ca, customer?location=ma, etc. But this was only an example. The actual process depends on what's in the form, and it might be considerably more complex than this.
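Since the form quoted above declares method="post", Hrvoje's loop idea can be combined with Wget 1.9's --post-data, one request per drop-down value. A sketch under assumptions: only the two supplier codes visible in the quoted form are used, the host is the thread's placeholder, and the real wget call is commented out.

```shell
# One POST per supplier code from the form's <option> values.
for s in 4541-134289 4542-134289; do
    # Real invocation would be:
    # wget http://customer.website.com/InventoryStatus.asp \
    #      --post-data "cboSupplier=$s&status=all&action-select=Query" \
    #      -O "InventoryStatus-$s.html"
    echo "POST cboSupplier=$s&status=all&action-select=Query"
done
```

The body string is single-argument-quoted for the same reason as the GET URLs: an unquoted `&` would end the command at the shell.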
Web page source using wget?
Hello Everyone,

I am new to this wget utility, so pardon my ignorance. Here is a brief explanation of what I am currently doing:

1). I go to our customer's website every day & log in using a User Name & Password.
2). I click on 3 links before I get to the page I want.
3). I right-click on the page & choose "view source". It opens it up in Notepad.
4). I save the source to a file & subsequently perform various tasks on that file.

As you can see, it is a manual process. What I would like to do is automate this process of obtaining the source of a page using wget. Is this possible? Maybe you can give me some suggestions.

Thanks in advance.
Suhas