Re: Web page "source" using wget?
Cookies.txt looks like this:

# HTTP cookie file.
# Generated by Wget on 2003-10-13 13:19:26.
# Edit at your own risk.

There is nothing after the 3rd line, so it doesn't look like a valid cookie file.

----- Original Message -----
From: "Hrvoje Niksic" <[EMAIL PROTECTED]>
To: "Suhas Tembe" <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Monday, October 13, 2003 12:57 PM
Subject: Re: Web page "source" using wget?

> "Suhas Tembe" <[EMAIL PROTECTED]> writes:
>
> > I tried, but it doesn't seem to have worked. This is what I did:
> >
> > wget --save-cookies=cookies.txt
> > http://customer.website.com?UserAccount=USER&AccessCode=PASSWORD&Locale=English
> > (United States)&TimeZone=(GMT-5:00) Eastern Standard Time (USA &
> > Canada)&action-Submit=Login
>
> Hopefully you used quotes to protect the spaces in the URL from the
> shell?
>
> After the first command, does `cookies.txt' contain what looks like a
> valid cookie?
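Hrvoje's question about quoting is the crux: unquoted, the shell splits this URL at every space and treats each `&` as a background-job separator, so wget never sees the full query string. A minimal sketch of a safer construction — the host and account values are the placeholders already used in the thread, and the percent-encoding of spaces is an assumption about what the server expects:

```shell
# Hypothetical stand-ins for the real account details.
USER_ACCOUNT="USER"
ACCESS_CODE="PASSWORD"
# Spaces (and, to be safe, parentheses) are percent-encoded so the
# URL contains no characters the shell or server could mangle.
LOCALE="English%20(United%20States)"
TZ_PARAM="(GMT-5:00)%20Eastern%20Standard%20Time%20(USA%20%26%20Canada)"

URL="http://customer.website.com/?UserAccount=${USER_ACCOUNT}&AccessCode=${ACCESS_CODE}&Locale=${LOCALE}&TimeZone=${TZ_PARAM}&action-Submit=Login"

# Double-quoting the URL keeps the shell from splitting it at spaces
# or interpreting the '&' separators. Dry run: print, don't fetch.
echo wget --save-cookies=cookies.txt "$URL"
```

The same double-quoting applies to any wget invocation in this thread whose URL contains `&`, spaces, or parentheses.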
Re: Web page "source" using wget?
So, is there a way I can get to the page I want after logging into a secure server using wget? Can I keep the SSL connection open so that the second retrieval works?

The other thing I noticed is that the first URL (the login) does not seem to work: when I use that same URL in IE, it brings me back to the login screen (see attached "source" of the login page). I don't get logged in. I am not quite sure whether it is the URL that is incorrect or something else.

Thanks,
Suhas

----- Original Message -----
From: "Jens Rösner" <[EMAIL PROTECTED]>
To: "Suhas Tembe" <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Monday, October 13, 2003 12:51 PM
Subject: Re: Web page "source" using wget?

> Hi Suhas!
>
> Well, I am by no means an expert, but I think that wget
> closes the connection after the first retrieval.
> The SSL server notices this and decides that wget has no right to log in
> for the second retrieval, even though the cookie is there.
> I think that is correct behaviour for a secure server, isn't it?
>
> Does this make sense?
> Jens
>
> > A slight correction: the first wget should read:
> >
> > wget --save-cookies=cookies.txt
> > http://customer.website.com/supplyweb/general/default.asp?UserAccount=USER&AccessCode=PASSWORD&Locale=en-us&TimeZone=EST:-300&action-Submit=Login
> >
> > I tried this link in IE, but it comes back to the same login screen.
> > No error messages are displayed at this point. Am I missing something?
> > I have attached the "source" for the login page.
> >
> > Thanks,
> > Suhas
> >
> > ----- Original Message -----
> > From: "Suhas Tembe" <[EMAIL PROTECTED]>
> > To: "Hrvoje Niksic" <[EMAIL PROTECTED]>
> > Cc: <[EMAIL PROTECTED]>
> > Sent: Monday, October 13, 2003 11:53 AM
> > Subject: Re: Web page "source" using wget?
> >
> > I tried, but it doesn't seem to have worked.
This is what I did:
> >
> > wget --save-cookies=cookies.txt
> > http://customer.website.com?UserAccount=USER&AccessCode=PASSWORD&Locale=English
> > (United States)&TimeZone=(GMT-5:00) Eastern Standard Time
> > (USA & Canada)&action-Submit=Login
> >
> > wget --load-cookies=cookies.txt
> > http://customer.website.com/supplyweb/smi/inventorystatus.asp?cboSupplier=4541-134289&status=all&action-select=Query
> > --http-user=4542-134289
> >
> > After executing the above two lines, it creates two files:
> > 1). "[EMAIL PROTECTED]": I can see that this file contains a
> > message (among other things): "Your session has expired due to a
> > period of inactivity"
> > 2). "[EMAIL PROTECTED]"
> >
> > Thanks,
> > Suhas
> >
> > ----- Original Message -----
> > From: "Hrvoje Niksic" <[EMAIL PROTECTED]>
> > To: "Suhas Tembe" <[EMAIL PROTECTED]>
> > Cc: <[EMAIL PROTECTED]>
> > Sent: Monday, October 13, 2003 11:37 AM
> > Subject: Re: Web page "source" using wget?
> >
> > > "Suhas Tembe" <[EMAIL PROTECTED]> writes:
> > >
> > > > There are two steps involved:
> > > > 1). Log in to the customer's web site. I was able to create the
> > > > following link after I looked at the section in the "source" as
> > > > explained to me earlier by Hrvoje.
> > > >
> > > > wget
> > > > http://customer.website.com?UserAccount=USER&AccessCode=PASSWORD&Locale=English
> > > > (United States)&TimeZone=(GMT-5:00) Eastern Standard Time
> > > > (USA & Canada)&action-Submit=Login
> > >
> > > Did you add --save-cookies=FILE? By default Wget will use cookies,
> > > but will not save them to an external file and they will therefore be
> > > lost.
> > >
> > > > 2). Execute: wget
> > > > http://customer.website.com/InventoryStatus.asp?cboSupplier=4541-134289&status=all&action-select=Query
> > >
> > > For this step, add --load-cookies=FILE, where FILE is the same file
> > > you specified to --save-cookies above.
Re: Web page "source" using wget?
A slight correction: the first wget should read:

wget --save-cookies=cookies.txt
http://customer.website.com/supplyweb/general/default.asp?UserAccount=USER&AccessCode=PASSWORD&Locale=en-us&TimeZone=EST:-300&action-Submit=Login

I tried this link in IE, but it comes back to the same login screen. No error messages are displayed at this point. Am I missing something? I have attached the "source" for the login page.

Thanks,
Suhas

----- Original Message -----
From: "Suhas Tembe" <[EMAIL PROTECTED]>
To: "Hrvoje Niksic" <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Monday, October 13, 2003 11:53 AM
Subject: Re: Web page "source" using wget?

> I tried, but it doesn't seem to have worked. This is what I did:
>
> wget --save-cookies=cookies.txt
> http://customer.website.com?UserAccount=USER&AccessCode=PASSWORD&Locale=English
> (United States)&TimeZone=(GMT-5:00) Eastern Standard Time
> (USA & Canada)&action-Submit=Login
>
> wget --load-cookies=cookies.txt
> http://customer.website.com/supplyweb/smi/inventorystatus.asp?cboSupplier=4541-134289&status=all&action-select=Query
> --http-user=4542-134289
>
> After executing the above two lines, it creates two files:
> 1). "[EMAIL PROTECTED]": I can see that this file contains a message
> (among other things): "Your session has expired due to a period of
> inactivity"
> 2). "[EMAIL PROTECTED]"
>
> Thanks,
> Suhas
>
> ----- Original Message -----
> From: "Hrvoje Niksic" <[EMAIL PROTECTED]>
> To: "Suhas Tembe" <[EMAIL PROTECTED]>
> Cc: <[EMAIL PROTECTED]>
> Sent: Monday, October 13, 2003 11:37 AM
> Subject: Re: Web page "source" using wget?
>
> > "Suhas Tembe" <[EMAIL PROTECTED]> writes:
> >
> > > There are two steps involved:
> > > 1). Log in to the customer's web site. I was able to create the
> > > following link after I looked at the section in the "source" as
> > > explained to me earlier by Hrvoje.
> > wget
> > http://customer.website.com?UserAccount=USER&AccessCode=PASSWORD&Locale=English
> > (United States)&TimeZone=(GMT-5:00) Eastern Standard Time (USA &
> > Canada)&action-Submit=Login
>
> Did you add --save-cookies=FILE? By default Wget will use cookies,
> but will not save them to an external file and they will therefore be
> lost.
>
> > 2). Execute: wget
> > http://customer.website.com/InventoryStatus.asp?cboSupplier=4541-134289&status=all&action-select=Query
>
> For this step, add --load-cookies=FILE, where FILE is the same file
> you specified to --save-cookies above.
Re: Web page "source" using wget?
I tried, but it doesn't seem to have worked. This is what I did:

wget --save-cookies=cookies.txt
http://customer.website.com?UserAccount=USER&AccessCode=PASSWORD&Locale=English (United States)&TimeZone=(GMT-5:00) Eastern Standard Time (USA & Canada)&action-Submit=Login

wget --load-cookies=cookies.txt
http://customer.website.com/supplyweb/smi/inventorystatus.asp?cboSupplier=4541-134289&status=all&action-select=Query
--http-user=4542-134289

After executing the above two lines, it creates two files:
1). "[EMAIL PROTECTED]": I can see that this file contains a message (among other things): "Your session has expired due to a period of inactivity"
2). "[EMAIL PROTECTED]"

Thanks,
Suhas

----- Original Message -----
From: "Hrvoje Niksic" <[EMAIL PROTECTED]>
To: "Suhas Tembe" <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Monday, October 13, 2003 11:37 AM
Subject: Re: Web page "source" using wget?

> "Suhas Tembe" <[EMAIL PROTECTED]> writes:
>
> > There are two steps involved:
> > 1). Log in to the customer's web site. I was able to create the
> > following link after I looked at the section in the "source" as
> > explained to me earlier by Hrvoje.
> >
> > wget
> > http://customer.website.com?UserAccount=USER&AccessCode=PASSWORD&Locale=English
> > (United States)&TimeZone=(GMT-5:00) Eastern Standard Time (USA &
> > Canada)&action-Submit=Login
>
> Did you add --save-cookies=FILE? By default Wget will use cookies,
> but will not save them to an external file and they will therefore be
> lost.
>
> > 2). Execute: wget
> > http://customer.website.com/InventoryStatus.asp?cboSupplier=4541-134289&status=all&action-select=Query
>
> For this step, add --load-cookies=FILE, where FILE is the same file
> you specified to --save-cookies above.
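The "session has expired" message suggests the login sets a *session* cookie, which wget's --save-cookies discards by default; newer wget releases add a --keep-session-cookies option for exactly this case (the 1.8.2 build used here predates it). A dry-run sketch of the two-step flow under that assumption, printing the commands instead of fetching:

```shell
# Sketch, assuming the server issues a session cookie at login and
# that a wget with --keep-session-cookies support is available.
COOKIES=cookies.txt
LOGIN_URL='http://customer.website.com/supplyweb/general/default.asp?UserAccount=USER&AccessCode=PASSWORD&action-Submit=Login'
QUERY_URL='http://customer.website.com/supplyweb/smi/inventorystatus.asp?cboSupplier=4541-134289&status=all&action-select=Query'

# Step 1: log in, persisting session cookies to the file.
echo wget --save-cookies="$COOKIES" --keep-session-cookies "$LOGIN_URL"
# Step 2: reuse the saved cookies for the report page.
echo wget --load-cookies="$COOKIES" "$QUERY_URL"
```

Single quotes around the URLs keep the shell away from the `&` separators, which addresses the other failure mode in this thread.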
Re: Web page "source" using wget?
Thanks Hrvoje, using
http://.../InventoryStatus.asp?cboSupplier=4541-134289&status=all&action-select=Query
in IE worked like a charm. I didn't have to follow links. I am now trying to automate this using wget 1.8.2 (Windows). There are two steps involved:

1). Log in to the customer's web site. I was able to create the following link after I looked at the section in the "source" as explained to me earlier by Hrvoje.

wget http://customer.website.com?UserAccount=USER&AccessCode=PASSWORD&Locale=English (United States)&TimeZone=(GMT-5:00) Eastern Standard Time (USA & Canada)&action-Submit=Login

2). Execute: wget http://customer.website.com/InventoryStatus.asp?cboSupplier=4541-134289&status=all&action-select=Query

I tried different ways to get this working, but so far have been unsuccessful. Any ideas?

Thanks,
Suhas

----- Original Message -----
From: "Hrvoje Niksic" <[EMAIL PROTECTED]>
To: "Suhas Tembe" <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Tuesday, October 07, 2003 6:12 PM
Subject: Re: Web page "source" using wget?

> "Suhas Tembe" <[EMAIL PROTECTED]> writes:
>
> > It does look a little complicated. This is how it looks:
> >
> > [...]
> >
> > 454A
> > 454B
>
> Those are the important parts. It's not hard to submit this form.
> With Wget 1.9, you can even use the POST method, e.g.:
>
> wget http://.../InventoryStatus.asp --post-data \
>   'cboSupplier=4541-134289&status=all&action-select=Query' \
>   -O InventoryStatus1.asp
> wget http://.../InventoryStatus.asp --post-data \
>   'cboSupplier=4542-134289&status=all&action-select=Query' \
>   -O InventoryStatus2.asp
>
> It might even work to simply use GET, and retrieve
> http://.../InventoryStatus.asp?cboSupplier=4541-134289&status=all&action-select=Query
> without the need for `--post-data' or `-O', but that depends on the
> ASP script that does the processing.
>
> The harder part is to automate this process for *any* values in the
> drop-down list.
> You might need to use an intermediary Perl script
> that extracts all the option values from the HTML source of the
> page with the drop-down. Then, from the output of the Perl script,
> you call Wget as shown above.
>
> It's doable, but it takes some work. Unfortunately, I don't know of a
> (command-line) tool that would make this easier.
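The extraction step Hrvoje describes doesn't strictly need Perl; sed can pull the option values out too. A sketch using stand-in HTML for the real drop-down markup (the supplier codes are the two seen elsewhere in the thread), echoing the wget calls rather than running them:

```shell
# Stand-in for the saved HTML source containing the drop-down.
cat > page.html <<'EOF'
<select name="cboSupplier">
<option value="4541-134289">454A</option>
<option value="4542-134289">454B</option>
</select>
EOF

# Extract each option's value attribute, one per line.
values=$(sed -n 's/.*<option value="\([^"]*\)".*/\1/p' page.html)

for v in $values; do
  # Dry run: print the wget call for each supplier code.
  echo wget "http://customer.website.com/InventoryStatus.asp?cboSupplier=${v}&status=all&action-select=Query"
done
```

This sed pattern assumes one option per line with a double-quoted value attribute, which matches the markup above but may need adjusting for the real page.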
Error: wget for Windows.
I am trying to use wget for Windows & get this message: "The ordinal 508 could not be located in the dynamic link library LIBEAY32.dll".

This is the command I am using:

wget http://www.website.com --http-user=username --http-passwd=password

I have the LIBEAY32.dll file in the same folder as wget. What could be wrong?

Thanks in advance.
Suhas
Re: Web page "source" using wget?
It does look a little complicated. This is how it looks: a "Supplier" drop-down (454A, 454B) and a "Quantity Status" drop-down (Over, Under, Both, All). I don't see any specific URL that would get the relevant data after I hit submit. Maybe I am missing something...

Thanks,
Suhas

----- Original Message -----
From: "Hrvoje Niksic" <[EMAIL PROTECTED]>
To: "Suhas Tembe" <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Tuesday, October 07, 2003 5:24 PM
Subject: Re: Web page "source" using wget?

> "Suhas Tembe" <[EMAIL PROTECTED]> writes:
>
> > this page contains a "drop-down" list of our customer's locations.
> > At present, I choose one location from the "drop-down" list & click
> > submit to get the data, which is displayed in a report format. I
> > "right-click" & then choose "view source" & save the "source" to a file.
> > I then choose the next location from the "drop-down" list, click
> > submit again. I again do a "view source" & save the source to
> > another file, and so on for all their locations.
>
> It's possible to automate this, but it requires some knowledge of
> HTML. Basically, you need to look at the <form>...</form> part of the
> page and find the <select> tag that defines the drop-down. Assuming
> that the form looks like this:
>
> <form action="http://foo.com/customer" method=GET>
>   <select name="location">
>     <option value="ca">California</option>
>     <option value="ma">Massachusetts</option>
>     ...
>   </select>
> </form>
>
> you'd automate getting the locations by doing something like:
>
> for loc in ca ma ...
> do
>   wget "http://foo.com/customer?location=$loc"
> done
>
> Wget will save the respective sources in files named
> "customer?location=ca", "customer?location=ma", etc.
>
> But this was only an example. The actual process depends on what's in
> the form, and it might be considerably more complex than this.
Re: Web page "source" using wget?
Got it! Thanks! So far so good: after logging in, I was able to get to the page I am interested in.

There was one thing that I forgot to mention in my earlier posts (I apologize)... this page contains a "drop-down" list of our customer's locations. At present, I choose one location from the "drop-down" list & click submit to get the data, which is displayed in a report format. I "right-click", choose "view source", & save the "source" to a file. I then choose the next location from the "drop-down" list, click submit again, do another "view source", & save that source to another file, and so on for all their locations.

I am not quite sure how to automate this process! How can I do this non-interactively, especially the "submit" portion of the page? Is this possible using wget?

Thanks,
Suhas

----- Original Message -----
From: "Hrvoje Niksic" <[EMAIL PROTECTED]>
To: "Suhas Tembe" <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Tuesday, October 07, 2003 5:02 PM
Subject: Re: Web page "source" using wget?

> "Suhas Tembe" <[EMAIL PROTECTED]> writes:
>
> > Thanks everyone for the replies so far..
> >
> > The problem I am having is that the customer is using ASP & JavaScript.
> > The URL stays the same as I click through the links.
>
> The URL staying the same is usually a sign of frames, not of ASP
> and JavaScript. Instead of looking at the URL entry field, try using
> "copy link to clipboard" instead of clicking on the last link. Then
> use Wget on that.
Re: Web page "source" using wget?
Thanks everyone for the replies so far..

The problem I am having is that the customer is using ASP & JavaScript. The URL stays the same as I click through the links. So, using "wget URL" for the page I want may not work (I may be wrong). Any suggestions on how I can tackle this?

Thanks,
Suhas

----- Original Message -----
From: "Hrvoje Niksic" <[EMAIL PROTECTED]>
To: "Suhas Tembe" <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Monday, October 06, 2003 5:19 PM
Subject: Re: Web page "source" using wget?

> "Suhas Tembe" <[EMAIL PROTECTED]> writes:
>
> > Hello Everyone,
> >
> > I am new to this wget utility, so pardon my ignorance... Here is a
> > brief explanation of what I am currently doing:
> >
> > 1). I go to our customer's website every day & log in using a User
> > Name & Password.
> > 2). I click on 3 links before I get to the page I want.
> > 3). I right-click on the page & choose "view source". It opens it up
> > in Notepad.
> > 4). I save the "source" to a file & subsequently perform various
> > tasks on that file.
> >
> > As you can see, it is a manual process. What I would like to do is
> > automate this process of obtaining the "source" of a page using
> > wget. Is this possible? Maybe you can give me some suggestions.
>
> It's possible; in fact, it's what Wget does in its most basic form.
> Disregarding authentication, the recipe would be:
>
> 1) Write down the URL.
>
> 2) Type `wget URL' and you get the source of the page in a file named
>    SOMETHING.html, where SOMETHING is the file name that the URL ends
>    with.
>
> Of course, you will also have to specify the credentials to the page,
> and Tony explained how to do that.
Web page "source" using wget?
Hello Everyone,

I am new to this wget utility, so pardon my ignorance... Here is a brief explanation of what I am currently doing:

1). I go to our customer's website every day & log in using a User Name & Password.
2). I click on 3 links before I get to the page I want.
3). I right-click on the page & choose "view source". It opens it up in Notepad.
4). I save the "source" to a file & subsequently perform various tasks on that file.

As you can see, it is a manual process. What I would like to do is automate this process of obtaining the "source" of a page using wget. Is this possible? Maybe you can give me some suggestions.

Thanks in advance.
Suhas