Here is a small Java program I modified from a sample.
It reads each line of a web page and displays it, so you can modify the
program to save the data to a file instead.
I had to use a proxy to reach web pages from Windows, but your z/OS system
may not require one if you are accessing an intranet web page.

To test it out from Windows, compile it and run with:

   java testurl http://url.of.website

//------------------------------------------------------------//
//  testurl.java (adapted from JavaGetUrl.java):              //
//------------------------------------------------------------//
//  A Java program that demonstrates a procedure that can be  //
//  used to download the contents of a specified URL.         //
//------------------------------------------------------------//
//  Code created by Developer's Daily                         //
//  http://www.DevDaily.com                                   //
//------------------------------------------------------------//

import java.io.*;
import java.net.*;

public class testurl {

   public static void main (String[] args) {

      //-----------------------------------------------------//
      //  Step 1:  Start creating a few objects we'll need.
      //-----------------------------------------------------//

      URL u;
      InputStream is = null;
      DataInputStream dis;
      String s;
      String uaddr =
         "http://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-metadata.xml";
      if (args.length > 0) uaddr = args[0];
      try {

         //------------------------------------------------------------//
         // Step 2:  Create the URL.                                   //
         //------------------------------------------------------------//
         // Note: Put your real URL here, or better yet, read it as a  //
         // command-line arg, or read it from a file.                  //
         //------------------------------------------------------------//

         // u = new URL("http://200.210.220.1:8080/index.html");
         // u = new URL("http://cpp.bw.org/code-clinic-test/test-data-2014-02-27.txt");
         // u = new URL("http://cityshare.nycnet/portal/site/cityshare");
         // u = new URL("http://MVSZ:9980/cppclass");
         // u = new URL("http://lpo.dt.navy.mil/data/DM/Environmental_Data_Deep_Moor_2014.txt");
         u = new URL(uaddr);

         //------------------------------------------------------------//
         // Step 3:  Open an input stream from the URL, going through  //
         // an HTTP proxy.                                             //
         //------------------------------------------------------------//
         // The proxy is built from its address directly; there is no  //
         // need to open a Socket just to obtain a SocketAddress.      //
         // Without a proxy, the stream can be opened directly with:   //
         //    is = u.openStream();     // throws an IOException       //
         //------------------------------------------------------------//

         Proxy prox = new Proxy(Proxy.Type.HTTP,
                                new InetSocketAddress("bcpxy.nycnet", 8080));
         URLConnection ucon = u.openConnection(prox);
         is = ucon.getInputStream();

         //-------------------------------------------------------------//
         // Step 4:                                                     //
         //-------------------------------------------------------------//
         // Convert the InputStream to a buffered DataInputStream.      //
         // Buffering the stream makes the reading faster; the          //
         // readLine() method of the DataInputStream makes the reading  //
         // easier.                                                     //
         //-------------------------------------------------------------//

         dis = new DataInputStream(new BufferedInputStream(is));

         //------------------------------------------------------------//
         // Step 5:                                                    //
         //------------------------------------------------------------//
         // Now just read each record of the input stream, and print   //
         // it out.  Note that it's assumed that this problem is run   //
         // from a command-line, not from an application or applet.    //
         //------------------------------------------------------------//

         while ((s = dis.readLine()) != null) {
            System.out.println(s);
         }

      } catch (MalformedURLException mue) {

         System.out.println("Ouch - a MalformedURLException happened.");
         mue.printStackTrace();
         System.exit(1);

      } catch (IOException ioe) {

         System.out.println("Oops - an IOException happened.");
         ioe.printStackTrace();
         System.exit(1);

      } finally {

         //---------------------------------//
         // Step 6:  Close the InputStream  //
         //---------------------------------//

         try {
            if (is != null) is.close();
         } catch (IOException ioe) {
            // just going to ignore this one
         }

      } // end of 'finally' clause

   }  // end of main

} // end of class definition
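As noted above, the display loop is easy to rework so the data goes to a file instead of the screen. Here is a minimal sketch of that change; the class name `UrlToFile`, the `copyLines` helper, and the output file name `testurl.out` are all illustrative, not part of the original sample.

```java
import java.io.*;
import java.net.*;

// Sketch: the read-and-display loop from testurl, reworked so each line
// is copied to a Writer instead of System.out.  Pointing the Writer at a
// FileWriter saves the page to a file.
public class UrlToFile {

   // Copy every line from 'in' to 'out'.  Works the same whether 'in'
   // wraps a URL stream or any other character source.
   static void copyLines(BufferedReader in, PrintWriter out) throws IOException {
      String s;
      while ((s = in.readLine()) != null) {
         out.println(s);               // was: System.out.println(s);
      }
   }

   public static void main(String[] args) throws IOException {
      String uaddr = args.length > 0 ? args[0] : "http://example.com/";
      URL u = new URL(uaddr);
      // try-with-resources closes both streams, replacing the finally block.
      try (BufferedReader in = new BufferedReader(
               new InputStreamReader(u.openStream()));
           PrintWriter out = new PrintWriter(new FileWriter("testurl.out"))) {
         copyLines(in, out);
      }
   }
}
```

Using BufferedReader here also avoids the deprecated DataInputStream.readLine() from the original sample; a proxy could be added exactly as in testurl by opening the connection with u.openConnection(prox).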

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf 
Of Paul Gilmartin
Sent: Wednesday, April 26, 2017 12:16 PM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: How to pull webpage into batch job

On Wed, 26 Apr 2017 08:18:31 -0500, John McKown wrote:

>On Wed, Apr 26, 2017 at 7:55 AM, Bill Ashton wrote:
>>
>> I have some internal webpages (built from multiple systems) that contain
>> particular information that I want to capture in a batch job, and then I
>> will combine that with other data from other mainframe files. The easiest
>> way is to grab the webpage in my batch job, and then I can use Rexx or Sort
>> to parse through the HTML to get what I need.
>
>​Well, you could write an HTTP client in REXX using "REXX Sockets"
>https://www.ibm.com/support/knowledgecenter/en/SSLTBW_2.1.0/com.ibm.zos.v2r1.hala001/ipapirxa.htm
>
>But, being a lazy SOB (Swell Ol' Boy, that is), I'd use REXX and the
>bpxwunix() command to run the optional cURL command.
>
>/* REXX */
>STDIN.0=0
>STDOUT.0=0  /* Is this needed?  */
>STDERR.0=0  /* Is this needed?  */
>CURL='/usr/lpp/ported/bin/curl'
>cmd=curl" http://ibm.com"
>RC=BPXWUNIX(cmd,stdin.,stdout.,stderr.)
>do i=1 to stdout.0
>     ...
Since Bill mentioned Rexx first, fetching to a stem may be his preferred
technique -- he can do his processing in the same EXEC.  But I'm
surprised how few programmers choose the alternative path specifications
for BPXWUNIX (Rexx programmers seem fixated on stems).

Curl could write to a DDNAME allocated to a data set.

Curl output could be shell-redirected to a UNIX path.

After SYSCALL path P., curl output could be shell-redirected to P.2
and SORTIN could be BPXWDYN-allocated to P.1.  (BPXWUNIX should
put curl in the background ("&") for this to work.)

etc.

-- gil

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


