[EMAIL PROTECTED] writes:

> Since much data from mainframes, or obtained via OCR (from TSU/SSU
>  sources, of course! ;-), is in fixed-record layout, it would be
>  interesting to have a program that would attempt to infer "field"
>  positions and lengths by examining the content of the data.

>  Wouldn't REBOL be a good tool to write such a sort-of-AI task in?

Warning -- this post may be off-topic for anyone not interested in exchanging 
data with a mainframe data center... Delete now!

Without denying the existence of Honeywell et al., when I write "mainframe" I 
mean an IBM System/360 or later.

There are probably four sensible ways to get data from a mainframe:

-- The data center people could write you a nice extract program to put all 
the data in a CSV or equivalent -- no problems with handling that. It just 
might take them years to get round to it.

-- They could simply give you, in machine-readable format, an existing report 
-- maybe say a spooled copy of last night's invoice run. You could devise a 
set of Rebol heuristics to make sense of the data. But you'd probably be 
better off buying a copy of Monarch -- it's been doing that for ten years.

-- You could find a pile of old fan-fold paper with important data on it, OCR 
it and then try to make sense of the resulting file. Lots of possible fun 
code here as you mention. Though it might be better to send the scans to one 
of those keyprep farms in the Far East where they'll rekey the stuff 
overnight with six nines accuracy. Then put it through Monarch to extract the 
data.

-- You might have been landed with the original native mainframe file. Some 
nice Rebol routines would be useful to convert this data -- but heuristics 
are likely to fail. You'd really need the data layout spec. Us old 
mainframers thought nothing of following a couple of fixed-length text fields 
with some 16-bit binary fields and a few 32-bit binary fields, and finishing 
up with some packed decimal fields. There's no way you could sensibly parse 
that without heavy hinting.
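
To make that last case concrete, here is a rough sketch of what "heavy 
hinting" looks like in practice. It is in Python rather than Rebol, and the 
layout, field names and sample record are all made up for illustration; a 
real file would need the layout from the data center's own spec. It assumes 
EBCDIC code page 037 for the text fields and big-endian signed integers for 
the binary ones.

```python
import struct
from decimal import Decimal

# Hypothetical layout: (field name, byte length, type, implied decimal places).
LAYOUT = [
    ("name",     10, "text",   0),  # EBCDIC text, space padded
    ("region",    2, "text",   0),
    ("quantity",  2, "bin16",  0),  # big-endian signed 16-bit
    ("account",   4, "bin32",  0),  # big-endian signed 32-bit
    ("amount",    4, "packed", 2),  # packed decimal, 2 implied decimals
]

def unpack_decimal(raw, scale=0):
    """Decode packed decimal: two digits per byte, sign in the last nibble."""
    nibbles = []
    for byte in raw:
        nibbles.append(byte >> 4)
        nibbles.append(byte & 0x0F)
    sign = nibbles.pop()
    if sign < 0x0A or any(n > 9 for n in nibbles):
        raise ValueError("not a valid packed decimal field")
    value = int("".join(str(n) for n in nibbles))
    if sign in (0x0B, 0x0D):        # B and D mark a negative value
        value = -value
    return Decimal(value).scaleb(-scale)

def decode_record(record, layout=LAYOUT):
    """Walk the layout and decode each field from one fixed-length record."""
    fields, offset = {}, 0
    for name, length, kind, scale in layout:
        raw = record[offset:offset + length]
        if kind == "text":
            fields[name] = raw.decode("cp037").rstrip()
        elif kind == "bin16":
            fields[name] = struct.unpack(">h", raw)[0]
        elif kind == "bin32":
            fields[name] = struct.unpack(">i", raw)[0]
        elif kind == "packed":
            fields[name] = unpack_decimal(raw, scale)
        else:
            raise ValueError("unknown field type: " + kind)
        offset += length
    return fields

# A made-up 22-byte record built to match the layout above.
SAMPLE = (
    "SMITH".ljust(10).encode("cp037")
    + "UK".encode("cp037")
    + struct.pack(">h", 42)
    + struct.pack(">i", 100200)
    + bytes.fromhex("1234567C")
)

print(decode_record(SAMPLE))
```

Nothing in the bytes themselves says where the text stops and the binary 
starts, which is why the layout table has to come from outside the file.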

A fifth possibility would be to host Rebol on a mainframe and write your own 
extract and download processes in Rebol. To be useful in this area (i.e. file 
extraction) we'd need routines that can handle fixed-length binary and packed 
decimal. That's not difficult. And an extension to handle reading and writing 
partitioned datasets (like ZIP/sea/gz: loads of files inside another) would 
be very useful.
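
As a rough illustration of why the binary and packed decimal part isn't 
difficult, here is a sketch of the writing side, in Python rather than Rebol. 
The function names are invented for this example. It assumes the usual sign 
nibbles (C for positive, D for negative) and big-endian signed integers.

```python
import struct

def pack_decimal(value, length):
    """Encode an integer as `length` bytes of packed decimal."""
    sign = 0x0D if value < 0 else 0x0C
    digits = str(abs(value))
    capacity = 2 * length - 1       # digit nibbles; the last nibble is the sign
    if len(digits) > capacity:
        raise OverflowError("value does not fit in the field")
    nibbles = [int(d) for d in digits.zfill(capacity)] + [sign]
    # Pair the nibbles up, high nibble first, into bytes.
    return bytes((hi << 4) | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2]))

def pack_halfword(value):
    """Encode a signed 16-bit integer, big-endian."""
    return struct.pack(">h", value)

def pack_fullword(value):
    """Encode a signed 32-bit integer, big-endian."""
    return struct.pack(">i", value)

print(pack_decimal(12345, 3).hex())
print(pack_decimal(-123, 2).hex())
print(pack_halfword(258).hex())
```

Any implied decimal point lives in the layout spec, not in the field, so the 
caller scales the value to an integer before packing it.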

Core ought to be an easy port as a native z/OS (nee MVS: a mainframe OS) 
program if RT wanted to. View is more problematical for the same reasons IBM 
has never claimed full POSIX compliance for its various Unix implementations: 
your terminal is not directly attached to the processor, so you can't get 
truly real-time interaction from the keyboard.

Given that mainframes now run GNU/Linux, a port of Core to that environment 
should be even easier.

There's never been -- as far as I know -- a scripting language that has 
successfully migrated from mainframe to micros or vice versa. Rexx got as far 
as OS/2 and gave up. Selcopy has a Unix and PC port, but it's never looked 
right there, and never thrived.

But, if the port works, 75-80% of all machine-readable data is in our grasp! 
The Rebolution is complete!!

Sunanda.