On Wed, Apr 6, 2016 at 7:01 AM, Andrew Rowley <[email protected]> wrote:
> On 05/04/2016 01:20 AM, Tom Marchant wrote:
>
>> On Mon, 4 Apr 2016 16:45:37 +1000, Andrew Rowley wrote:
>>
>>> A Hashmap potentially allows you to read sequentially and match
>>> records between files, without caring about the order.
>>
>> Can you please explain what you mean by this? Are you talking about
>> using the hashmap to determine which record to read next, and so to
>> read the records in an order that is logically sequential, but
>> physically random? If so, that is not at all like reading the
>> records sequentially.
>
> If one file fits in memory, you can read it sequentially into a
> Hashmap, using the data you want to match as the key.
> Then read the second one, also sequentially, retrieving matching
> records from the Hashmap by key. You can also remove them from the
> Hashmap as they are found if you need to know if any are unmatched.
>
> But this is a solution for a made up case - I don't know whether it
> is a common situation. I was interested in hearing real reasons why
> sort is so common on z/OS i.e. Why sort?

Not meaning to sound silly, but I fear the main reason may be the good
old: "We've always done it that way". And, since most of the in-house
software written for z/OS is in some version of COBOL, there is no
other real choice, because COBOL does not have anything like a content
addressable "array" built into the language.

IMO, a major deficiency in IBM's COBOL, and maybe other vendors'
COBOLs, is that it does not come with a great library of
functionality. It is simple to do things in Java, Perl, PHP, Python,
and Go because of the huge amount of support in the libraries. COBOL
has only the barest of native data types, and basically only integer
indexed arrays and structures as ways to "group" things together.
COBOL also has pretty much the barest of run time routines, and the
only way to invoke anything in a library is via the CALL verb.
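For anyone who wants to see the two-pass match Andrew describes in
concrete form, here is a minimal sketch in Python, where a dict plays
the part of the Hashmap. The function name match_records, its key
parameter, and the sample records are all made up for illustration.
It assumes the first file fits in memory and that each key occurs at
most once in each file.

```python
def match_records(first, second, key):
    """Match two unsorted record sequences on a shared key field."""
    # Pass 1: read the first file sequentially into a hash map keyed on
    # the match field. Assumes unique keys and that it all fits in memory.
    pending = {key(rec): rec for rec in first}

    matched = []
    unmatched_second = []
    # Pass 2: read the second file sequentially, looking up each record
    # by key. Matches are removed from the map so leftovers can be found.
    for rec in second:
        partner = pending.pop(key(rec), None)
        if partner is None:
            unmatched_second.append(rec)
        else:
            matched.append((partner, rec))

    # Whatever is still in the map had no partner in the second file.
    return matched, list(pending.values()), unmatched_second


# Hypothetical sample data: (account, payload) tuples in no particular order.
accounts = [("C3", "Carol"), ("A1", "Alice"), ("B2", "Bob")]
payments = [("B2", 40.0), ("Z9", 5.0), ("C3", 12.5)]

matched, no_payment, no_account = match_records(
    accounts, payments, key=lambda r: r[0]
)
print(matched)      # pairs found in both files
print(no_payment)   # accounts with no payment record
print(no_account)   # payments with no account record
```

Neither input is sorted at any point; the price is holding the whole
first file in memory, which is exactly the "if one file fits in
memory" condition in the quoted text.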
I guess that it's sad that the object oriented portion of the latest
COBOL compilers seems to be ignored.

So, why not migrate away from COBOL to a more advanced language? Many
places are doing so for new work or development (or going to a non-z
platform).

Also, do you really need to buffer up everything in a Hashmap if your
data resides in a relational database? It is generally much better to
let the RDBMS do most of the work. And it will buffer up the active
data, not only from your program but from every program which is
accessing the data. In this case, doing a SORT may well be
unnecessary. Or you may still need to do a SORT if you are writing a
report sorted by a value created in the program itself. Do you really
want to use a Hashmap to store the unsorted electricity bills for Los
Angeles, and then, at the end, read & write said bills by reading the
Hashmap by key? This sort of thing goes on a _lot_ on z/OS.

Just my take on it. I'm not against using something other than SORT if
I think it will work well. But the SORT products (DFSORT & Syncsort)
are extremely fast and efficient. So if I need something done which
they can do, then I think it is best to use them rather than code
something up myself, in any language.

> On Hashmaps etc. in general - they are the memory equivalent to
> indexed datasets (VSAM etc) versus sequential datasets. Their
> availability opens up many new ways to process data - and algorithm
> changes are often where the big savings can be made.

--
How many surrealists does it take to screw in a lightbulb? One to hold
the giraffe and one to fill the bathtub with brightly colored power
tools.

Maranatha! <><
John McKown

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [email protected] with the message: INFO IBM-MAIN
