Happy New Year!

2018-12-31 Thread ED SHARPE via cctalk


Happy New Year to all!
Ed#



PDP-11/45 RSTS/E boot problem

2018-12-31 Thread Noel Chiappa via cctalk
> From: Paul Koning

>> On Dec 31, 2018, at 6:32 PM, Henk Gooijen via cctalk  wrote:
>> ...
>> There are one or two bits in a register of the RK11 that have a
>> different meaning/function, depending on the controller being a -C or
>> -D.

> If someone can point me to the description of the differences I should
> be able to say what RSTS will do with them.

AFAIK, the only difference (in programming terms) between the -C and -D is
that the -D has dropped the maintenance register.

I cheerfully admit, though, that I hadn't sat down with the -C and -D manuals
and done a bit-by-bit compare. I just did that (I used the "RK11-C Moving Head
Disk Drive Controller Manual", DEC-11-HRKA-D, and the 1976 "Peripherals
Handbook"), and found the following:

In the RKDS: the definition of bit 7 has changed slightly ("Drive Ready" to
"R/W/S Ready"), but it seems to be basically the same. In the RKCS, bit 9 is
"Read/Write All" in the -C, and unused in the -D; bit 12 is "Maint" in the
-C, unused in the -D.

In other words, a -D driver should work just fine with a -C, IMO.
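
The register differences listed above can be written down as bit masks. This is only a sketch in Python: the constant and function names are made up for illustration, and the bit positions are taken solely from the comparison above, not re-checked against either manual.

```python
# Bit positions as described in the comparison above.
# All names here are hypothetical, chosen only for this illustration.
RKDS_BIT7_READY = 1 << 7     # "Drive Ready" (-C) / "R/W/S Ready" (-D)
RKCS_BIT9_RW_ALL = 1 << 9    # "Read/Write All" on the -C, unused on the -D
RKCS_BIT12_MAINT = 1 << 12   # "Maint" on the -C, unused on the -D

# RKCS bits that mean something on the -C but are unused on the -D.
RKCS_C_ONLY_BITS = RKCS_BIT9_RW_ALL | RKCS_BIT12_MAINT


def sets_c_only_bits(rkcs_value: int) -> bool:
    """Return True if a value written to RKCS touches a -C-only bit."""
    return bool(rkcs_value & RKCS_C_ONLY_BITS)


if __name__ == "__main__":
    # Print the mask in octal, the usual PDP-11 notation.
    print(oct(RKCS_C_ONLY_BITS))
    print(sets_c_only_bits(0o000005))
```

A -D driver that always leaves those two bits clear never asks the -C for a function the -D lacks, which is consistent with the conclusion above.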

Noel


Re: PDP-11/45 RSTS/E boot problem

2018-12-31 Thread Fritz Mueller via cctalk


> On Dec 31, 2018, at 5:15 PM, Paul Koning  wrote:
> 
>> On Dec 31, 2018, at 6:32 PM, Henk Gooijen via cctalk  
>> wrote:
>> 
>> There are one or two bits in a register of the RK11 that have a different 
>> meaning/function, depending on the controller being a -C or -D. The RK11-C 
>> was quickly replaced by the RK11-D, but I guess RSTS would know the 
>> difference. Other guys here will be able to give a lot better light on this 
>> than me (Paul?)
> 
> I did not know that.  If someone can point me to the description of the 
> differences I should be able to say what RSTS will do with them.

Oh, I didn’t know that either!

> Did you specify overlapped seek in SYSGEN?

Yes, I did enable overlapped seek during SYSGEN, and I did just have some 
trouble with overlapped seeks on my RK11-C (though I repaired that well enough 
to pass the associated MAINDEC before trying RSTS).

I’ll try a non-overlap SYSGEN soon/next -- right now it takes me a little over 
two hours to image a pack using PDP11GUI and a DL11 at 9600 baud, which is a 
bit painful.  So I’d like to capture some logic analyzer traces off the RK11-C 
while I have this repro case before I write over my current pack.  If there is 
something still up with my overlapped seeks, I’d like to characterize and fix 
it anyway.
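
For a sense of scale on the imaging time, here is a rough lower-bound estimate in Python. The RK05 geometry (203 cylinders, 2 surfaces, 12 sectors, 512 bytes per sector) and the 10-bits-per-character serial framing are assumptions, not figures from this thread, and nothing here models how PDP11GUI actually encodes data on the wire.

```python
# Assumed RK05 geometry (not stated in the thread).
CYLINDERS, SURFACES, SECTORS, BYTES_PER_SECTOR = 203, 2, 12, 512
BAUD = 9600
BITS_PER_CHAR = 10  # assumed 8N1 framing: start bit + 8 data bits + stop bit


def pack_bytes() -> int:
    """Total bytes on one pack under the assumed geometry."""
    return CYLINDERS * SURFACES * SECTORS * BYTES_PER_SECTOR


def raw_transfer_minutes(n_bytes: int, baud: int = BAUD) -> float:
    """Lower bound: each byte sent once as raw binary, no protocol overhead."""
    chars_per_second = baud / BITS_PER_CHAR
    return n_bytes / chars_per_second / 60


if __name__ == "__main__":
    size = pack_bytes()
    print(size, "bytes")
    print(round(raw_transfer_minutes(size), 1), "minutes minimum")
```

Under those assumptions the raw floor is well under an hour, so a run of two hours or more would mostly be encoding and protocol overhead.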

cheers,
  --FritzM.




RE: OCR old software listing

2018-12-31 Thread Kevin Parker via cctalk
I've had a lot of success using Adobe's Clearscan for OCR'ing old stuff.
Admittedly it's not perfect but it can improve the quality of an old
document a lot.


Kevin Parker
0418 815 527

-Original Message-
From: cctalk  On Behalf Of Paul Koning via
cctalk
Sent: Tuesday, 1 January 2019 12:18 PM
To: dwight ; General Discussion: On-Topic and Off-Topic
Posts 
Subject: Re: OCR old software listing



> On Dec 31, 2018, at 7:13 PM, dwight via cctalk wrote:
> 
> Fred is right, OCR is only worth it if the document is in perfect
> condition. I just finished getting an old 4004 listing working. I made only
> two mistakes on the 4K of code that were not the fault of the poorness of
> the listing. Twice I put LDM instead of LD. LDM was the most commonly used.

I wouldn't put it quite so strongly.  OCR even if not perfect can help a
lot.  You can often OCR + test assembly + proofread faster than retyping,
especially since that requires fixing typos and proofreading also.  Many OCR
errors are caught by the assembler, though not all of them of course.  I've
done both in an ongoing software preservation project; my conclusion still
is to use OCR when it works "well enough".  A couple of errors per page is
definitely "well enough".

The program used matters.  I looked at Tesseract a bit but its quality was
vastly inferior to commercial products in the examples I tried.  I now use
Abbyy FineReader, which handles a lot of line printer and typewriter
material quite well.

paul





Re: OCR old software listing

2018-12-31 Thread Fred Cisin via cctalk
On the other hand, just for the FUN of it, 
can you write some software to find and fix (or simply flag) the most 
common errors?



When I had to terminate my publisher, I was s'posed to receive a copy of 
their customer database.
They deleted all delimiters (spaces, commas, periods, and other 
punctuation), and then printed it out on greenbar with a font that used 
the same character for zero and letter 'O'; and same character for 
one, lower case 'l', and upper case 'I'.  Surprisingly, they did NOT use a 
bad ribbon and printhead!


An acquaintance OCR'ed it. They were able to get what they thought was "80 
to 90%" from the originals, but not from a xerox copy that visually 
seemed to be just as good.


I spent a little time writing some simple code to parse and fix most of 
it. 
Mostly simple context, such as a zero between two letters is likely an 
'O', or an 'O' between two numerals is likely a zero.  Similarly with one, 
lower case 'L' and upper case 'I'.   Some OCR software now pays attention 
to context.
Five consecutive numerals following two capital letters is likely to be a 
zip code, and end of the record.  USUALLY. Comparison of those digits 
with the two letters in a zipcode database provided partial confirmation.

. . . and so forth . . .

Not a practical use of time, but a fun exercise in parsing.
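
The context rules described above could be sketched roughly as follows. This is not the original code, which is not shown in this thread; it is a minimal Python illustration, and the function names, the regular expressions, and the sample data in the usage note are all made up for the example. The zip-code database check is left out.

```python
import re
from typing import List


def fix_confusables(text: str) -> str:
    """Apply simple context rules for 0/O and 1/l/I confusion."""
    # A zero between two letters is likely the letter 'O'.
    text = re.sub(r"(?<=[A-Za-z])0(?=[A-Za-z])", "O", text)
    # An 'O' between two digits is likely a zero.
    text = re.sub(r"(?<=\d)O(?=\d)", "0", text)
    # A lower case 'l' or upper case 'I' between two digits is likely a one.
    text = re.sub(r"(?<=\d)[lI](?=\d)", "1", text)
    return text


# Two capital letters followed by exactly five digits: probably a state
# abbreviation plus zip code, and therefore probably the end of a record.
ZIP_END = re.compile(r"[A-Z]{2}\d{5}(?!\d)")


def split_records(blob: str) -> List[str]:
    """Split an undelimited blob into records at each likely zip code."""
    records = []
    start = 0
    for match in ZIP_END.finditer(blob):
        records.append(blob[start:match.end()])
        start = match.end()
    if start < len(blob):
        records.append(blob[start:])  # trailing text with no zip code found
    return records
```

For example, `split_records(fix_confusables("JOHNDOE12MAINSTRENONV895O1JANER0E9ELMSTTROYNY12180"))` repairs the `O` inside the first zip code and the zero inside `JANER0E`, then cuts the blob into two records. Runs of several confusable characters in a row are not handled by these rules.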


Another time, the .SRT file that I found for "Company Man" used upper case 
'I' instead of lower case 'L'!  (AND had a three minute offset for the 
start time).  Did not take very long to fix.


--
Grumpy Ol' Fred ci...@xenosoft.com


On Tue, 1 Jan 2019, dwight wrote:


Fred is right, OCR is only worth it if the document is in perfect condition. I 
just finished getting an old 4004 listing working. I made only two mistakes on 
the 4K of code that were not the fault of the poorness of the listing. Twice I 
put LDM instead of LD. LDM was the most commonly used.
There were still some 15 or so other errors due to the printing. It looked to be 
done on an ASR33 with poor registration of the print drum. Cs and 0s were often 
missing the right 1/3. Expecting an OCR to do much would have been folly, even 
though some 85% to 90% could be read properly. It took me about 3 weeks of 
evenings to make heads or tails of the code. I've finally got it running 
correctly.
If it had been done with an OCR, in many cases it would have simply put a C 
instead of a 0. I'd have had to go through the listing, checking each C to make 
sure it was right. It is easier in many cases to analyse what I can see 
and make a judgement, based on that and the general context, as I 
am typing it in.
Dwight


From: cctalk  on behalf of Fred Cisin via cctalk 

Sent: Monday, December 31, 2018 9:46 AM
To: General Discussion: On-Topic and Off-Topic Posts
Subject: Re: OCR old software listing

On Mon, 31 Dec 2018, Larry Kraemer via cctalk wrote:

I used the libtiff-tools (Debian 8.x - 32 Bit) to extract all 61 .TIF's
from the Multipage .tif file.  While the .tif's look descent, and
RasterVect shows the .tif properties to be Group 4 Fax (1bpp) with 5100
x 6600 pixels - 300 DPI, I can't get tesseract 3.x, TextBridge Classic
2.0, or Irfanview with KADMOS Plugin to OCR any of the .tif files, with
descent results.  I'd expect an OCR of 85 to 90 % correct conversion to
ASCII text.


Software listings need more accuracy than that.
How many wrong characters does it take for a program not to work?
"desCent" isn't good enough.

85 to 90 % correct is a character wrong in every 6 to 10 characters.
How many errors is that PER LINE?

"But, you can start with that, and just fix the errors, without retyping
the rest."  Doing it that way is a desCent into madness.
BTDT.  wore out the T-shirts.


A competent typist can retype the whole thing faster than fixing an error
in every six to ten characters.
Only if there is less than one error for every several hundred characters
does "patching it" save time for a competent typist.
In general, for a competent typist, the fastest way to reposition the
cursor to the next error in the line is to simply hit the keys of the
intervening letters.
It is NOT to move the cursor with the mouse, then put your hand back on
the keys to type a character.
Using cursor motion keys is no faster for a competent typist than hitting
the keys of the letters to skip over.


TIP: display the OCR'ed text that is to be corrected in a font that
exaggerates the difference between zero and the letter 'O', and between
one and lower case 'l'.  There are some programs that will attempt to
select those based on context.

--
Grumpy Ol' Fred  ci...@xenosoft.com


Re: OCR old software listing

2018-12-31 Thread Paul Koning via cctalk



> On Dec 31, 2018, at 7:13 PM, dwight via cctalk  wrote:
> 
> Fred is right, OCR is only worth it if the document is in perfect condition. 
> I just finished getting an old 4004 listing working. I made only two mistakes 
> on the 4K of code that were not the fault of the poorness of the listing. 
> Twice I put LDM instead of LD. LDM was the most commonly used.

I wouldn't put it quite so strongly.  OCR even if not perfect can help a lot.  
You can often OCR + test assembly + proofread faster than retyping, especially 
since that requires fixing typos and proofreading also.  Many OCR errors are 
caught by the assembler, though not all of them of course.  I've done both in 
an ongoing software preservation project; my conclusion still is to use OCR 
when it works "well enough".  A couple of errors per page is definitely "well 
enough".

The program used matters.  I looked at Tesseract a bit but its quality was 
vastly inferior to commercial products in the examples I tried.  I now use 
Abbyy FineReader, which handles a lot of line printer and typewriter material 
quite well.

paul




Re: PDP-11/45 RSTS/E boot problem

2018-12-31 Thread Paul Koning via cctalk



> On Dec 31, 2018, at 6:32 PM, Henk Gooijen via cctalk  
> wrote:
> 
> Fritz,
> 
> One thought crossed my mind, probably not an issue, but you never know.
> 
> You mentioned that you have an RK11-C, *not* RK11-D.
> 
> There are one or two bits in a register of the RK11 that have a different 
> meaning/function, depending on the controller being a -C or -D. The RK11-C 
> was quickly replaced by the RK11-D, but I guess RSTS would know the 
> difference. Other guys here will be able to give a lot better light on this 
> than me (Paul?)
> 
> A Healthy 2019!
> 
> Henk, PD8PDP

I did not know that.  If someone can point me to the description of the 
differences I should be able to say what RSTS will do with them.

If this matters it might be possible to patch the code. 

Did you specify overlapped seek in SYSGEN?  I don't remember if that exists for 
the RK driver, but if yes, that would control which of two somewhat different 
drivers is in the system.  Come to think of it, if the answer is "yes" it might 
be worth trying the experiment with "no" -- since INIT seems to work and that 
uses the non-overlapped drivers.

paul



Re: OCR old software listing

2018-12-31 Thread dwight via cctalk
Fred is right, OCR is only worth it if the document is in perfect condition. I 
just finished getting an old 4004 listing working. I made only two mistakes on 
the 4K of code that were not the fault of the poorness of the listing. Twice I 
put LDM instead of LD. LDM was the most commonly used.
There were still some 15 or so other errors due to the printing. It looked to be 
done on an ASR33 with poor registration of the print drum. Cs and 0s were often 
missing the right 1/3. Expecting an OCR to do much would have been folly, even 
though some 85% to 90% could be read properly. It took me about 3 weeks of 
evenings to make heads or tails of the code. I've finally got it running 
correctly.
If it had been done with an OCR, in many cases it would have simply put a C 
instead of a 0. I'd have had to go through the listing, checking each C to make 
sure it was right. It is easier in many cases to analyse what I can see 
and make a judgement, based on that and the general context, as I 
am typing it in.
Dwight


From: cctalk  on behalf of Fred Cisin via cctalk 

Sent: Monday, December 31, 2018 9:46 AM
To: General Discussion: On-Topic and Off-Topic Posts
Subject: Re: OCR old software listing

On Mon, 31 Dec 2018, Larry Kraemer via cctalk wrote:
> I used the libtiff-tools (Debian 8.x - 32 Bit) to extract all 61 .TIF's
> from the Multipage .tif file.  While the .tif's look descent, and
> RasterVect shows the .tif properties to be Group 4 Fax (1bpp) with 5100
> x 6600 pixels - 300 DPI, I can't get tesseract 3.x, TextBridge Classic
> 2.0, or Irfanview with KADMOS Plugin to OCR any of the .tif files, with
> descent results.  I'd expect an OCR of 85 to 90 % correct conversion to
> ASCII text.

Software listings need more accuracy than that.
How many wrong characters does it take for a program not to work?
"desCent" isn't good enough.

85 to 90 % correct is a character wrong in every 6 to 10 characters.
How many errors is that PER LINE?

"But, you can start with that, and just fix the errors, without retyping
the rest."  Doing it that way is a desCent into madness.
BTDT.  wore out the T-shirts.


A competent typist can retype the whole thing faster than fixing an error
in every six to ten characters.
Only if there is less than one error for every several hundred characters
does "patching it" save time for a competent typist.
In general, for a competent typist, the fastest way to reposition the
cursor to the next error in the line is to simply hit the keys of the
intervening letters.
It is NOT to move the cursor with the mouse, then put your hand back on
the keys to type a character.
Using cursor motion keys is no faster for a competent typist than hitting
the keys of the letters to skip over.


TIP: display the OCR'ed text that is to be corrected in a font that
exaggerates the difference between zero and the letter 'O', and between
one and lower case 'l'.  There are some programs that will attempt to
select those based on context.

--
Grumpy Ol' Fred  ci...@xenosoft.com


Re: Original AGC restoration / was Re: Apollo 8 Mission Control printers, or not?

2018-12-31 Thread Daniel Seagraves via cctalk



> On Dec 30, 2018, at 7:16 PM, Nemo  wrote:
> 
> On 30/12/2018, Daniel Seagraves via cctalk  wrote:
>> 
>> New-era-internet term for illegally gaining access to someone's real world
>> “documents" (place of employment, home address, phone numbers, medical
>> records, family members’ info, etc) for harassment, stalking, or worse.
>> 
> 
> Interesting,  as the OED describes dox (n.) as an abbreviation for doxy (2.),
> which is an abbreviation for orthodoxy.  First reference to 1756: T. Amory J.
> Buncle (1825) III. 19 Orthodox and other dox.

Have those guys start a hashtag about it and maybe something will happen.




Re: PDP-11/45 RSTS/E boot problem

2018-12-31 Thread Ethan Dicks via cctalk
On Mon, Dec 31, 2018 at 6:32 PM Henk Gooijen via cctalk
 wrote:
> One thought crossed my mind, probably not an issue, but you never know.
>
> You mentioned that you have an RK11-C, *not* RK11-D.
>
> There are one or two bits in a register of the RK11 that have a different 
> meaning/function, depending on the controller being a -C or -D. The RK11-C 
> was quickly replaced by the RK11-D,

That is good to know.  I have only used the RK11-D, but I have an
RK11-C in need of cleaning and testing that I've never used.  My goal
is to eventually reunite it with the PDP-11/20 it came with (sadly,
the RK05-F that was part of the set was tossed 35 years ago, but I do
have more than one RK05-J, and an abundance of 12-sector packs).

-ethan


Re: More old stuff incoming

2018-12-31 Thread Jim Manley via cctalk
Hi Grant,

It can be different strokes for different folks.  For many, it's the layout,
feel, and sound of the keyboard, joystick, buttons, etc.  There is a huge
market for early "clicky" keyboards with non-linear actions (keys somewhat
resist pressure until a threshold is passed, then they allow full travel) -
some are going for thousands of dollars ... each.  For others, it's the
graphics on a real, honest-to-goodness glass tube TV or monitor, smeary
color blocks, bleepy-bloopy sounds, and all.  It may make no sense to some
people, but crystal-clear digital graphics on an LCD display look nothing
like the originals on glass tubes, and yes, we have a couple of 27-inch
analog TVs with both VHF and NTSC inputs.

A big problem with emulators that attempt to be all things to all games, as
was pointed out, is that there are timing issues when running on a
multitasking host rather than on native hardware: a million things are being
spawned and generating interrupts that the emulator has no way to predict
and account for accurately.  Many games depended on the predictability of
the hardware to perform certain things behind the scenes, and those fail
when running in emulators.  Then, there's the problem of the timing being different from
platform to platform with wholly different hardware, OS, and other
application interactions.

I was the first in the U.S. to receive a Raspberry Pi (March 22, 2012, from
the first batch of 10,000) and established one of the first Raspberry Jam
enthusiast gatherings in the world, at the Computer History Museum.  We've
been running the emulators for the Pi, and while they're fine for showing
what the games were _like_, they aren't the _same_, and we have all of the
original game software and hardware right there to compare.  I've gotten
hundreds of Pii (as in the plural of octopus is octopi) since then, as they
really fulfill the educational mission of the Raspberry Pi Foundation, and
they've been given to students where I teach, as well as kids participating
in after-school activities.

All the Best,
Jim

On Fri, Dec 21, 2018 at 9:51 AM Grant Taylor via cctalk <
cctalk@classiccmp.org> wrote:

> On 12/21/18 1:07 AM, Jim Manley via cctalk wrote:
> > no, emulators will not cut it
>
> Would you please expand upon that?
>
> Are you saying that things like a Raspberry Pi running RetroPi (I think
> that's the name) don't suffice / satisfy as the real thing that they are
> emulating?
>
> Or are you including things like the new retro consoles that original
> vendors are coming out with?  (The palm sized SNES from Nintendo comes
> to mind.)
>
> Do you have any idea why these newer things are not cutting it?
>
> I've also had great success with running '90s era games in DOSBox on
> what ever computer happens to be handy.  Does that not work at all for
> you / your crew?
>
>
>
> --
> Grant. . . .
> unix || die
>


RE: PDP-11/45 RSTS/E boot problem

2018-12-31 Thread Henk Gooijen via cctalk
Fritz,

One thought crossed my mind, probably not an issue, but you never know.

You mentioned that you have an RK11-C, *not* RK11-D.

There are one or two bits in a register of the RK11 that have a different 
meaning/function, depending on the controller being a -C or -D. The RK11-C was 
quickly replaced by the RK11-D, but I guess RSTS would know the difference. 
Other guys here will be able to shed a lot more light on this than me (Paul?)



A Healthy 2019!

Henk, PD8PDP




From: cctalk  on behalf of Fritz Mueller via cctalk 

Sent: Monday, December 31, 2018 11:47:23 PM
To: General Discussion: On-Topic and Off-Topic Posts
Subject: Re: PDP-11/45 RSTS/E boot problem

> On Dec 31, 2018, at 1:54 PM, Paul Koning  wrote:
>
> The standard idle pattern is in the data lights.  I don't remember if the 
> "fancy" pattern appeared in V7.0 or earlier, but in any case it's an 
> undocumented SYSGEN option.
>
> In RSTS/E, the display register shows the system error count.  That's from 
> I/O errors reported by the various drivers.
>
> Do you have a second disk pack?  If so, you could use the DSKINT option in 
> INIT to initialize a pack, with pattern tests.  That would show what the RSTS 
> disk driver thinks of your RK05.
>
> Something else you might try: when you start the system, don't enter 
> line-feed for the quick start, but the START command.  That is a more verbose 
> version which will display some additional messages.  If anything is getting 
> disabled, it would show there.

Thanks, Paul — that’s a bunch of helpful info!

I have done some long-form starts, but no complaints are printed to the console.

I do have an additional as-yet-untried pack that I got in a recent eBay auction. 
 I’ll give it an inspection and if it’s good to go I’ll give the pattern DSKINT 
a try.  I only have one RK05 drive working at the moment, but I suppose I can 
swap after issuing “DS” to the “Option:” prompt?

I’m also gearing up to throw the logic analyzer on the RK11 and see what 
sector(s) it is trying to read and what error/interrupt signaling may actually 
be happening.


Last, any more info on that fancy light sysgen option, for future reference in 
case I ever get on to a later version?

cheers,
  --FritzM.



Re: PDP-11/45 RSTS/E boot problem

2018-12-31 Thread Fritz Mueller via cctalk
> On Dec 31, 2018, at 1:54 PM, Paul Koning  wrote:
> 
> The standard idle pattern is in the data lights.  I don't remember if the 
> "fancy" pattern appeared in V7.0 or earlier, but in any case it's an 
> undocumented SYSGEN option.  
> 
> In RSTS/E, the display register shows the system error count.  That's from 
> I/O errors reported by the various drivers.
> 
> Do you have a second disk pack?  If so, you could use the DSKINT option in 
> INIT to initialize a pack, with pattern tests.  That would show what the RSTS 
> disk driver thinks of your RK05.
> 
> Something else you might try: when you start the system, don't enter 
> line-feed for the quick start, but the START command.  That is a more verbose 
> version which will display some additional messages.  If anything is getting 
> disabled, it would show there.

Thanks, Paul — that’s a bunch of helpful info!

I have done some long-form starts, but no complaints are printed to the console.

I do have an additional as-yet-untried pack that I got in a recent eBay auction. 
 I’ll give it an inspection and if it’s good to go I’ll give the pattern DSKINT 
a try.  I only have one RK05 drive working at the moment, but I suppose I can 
swap after issuing “DS” to the “Option:” prompt?

I’m also gearing up to throw the logic analyzer on the RK11 and see what 
sector(s) it is trying to read and what error/interrupt signaling may actually 
be happening.


Last, any more info on that fancy light sysgen option, for future reference in 
case I ever get on to a later version?

cheers,
  --FritzM.



Re: PDP-11/45 RSTS/E boot problem

2018-12-31 Thread Paul Koning via cctalk



> On Dec 30, 2018, at 8:55 PM, Fritz Mueller via cctalk  
> wrote:
> 
> Hi all,
> 
> Some here may know I’ve been working on an 11/45 restoration off and on for 
> some time now. My ’45 currently has floating point, KT11-C mem mgmt, 124 
> kword MS11-L, and an RK11-C with one restored RK05 drive.
> 
> Last week I decided to see if I could bring up RSTS/E on the machine. I 
> managed to sysgen a minimal V06C system that can run off a single RK05 pack 
> under simh, but when I transfer that image to the real hardware using 
> pdp11gui it does not seem to completely/successfully boot.
> 
> The “Option:” boot loader comes up and sub-commands there seem to be working 
> (in particular, the “HARDWARE” sub command shows correctly detected hardware 
> and options). When booting RSTS/E, after supplying date and time, the idle 
> pattern starts on the front panel (but just the bottom part, on the data 
> lights). When console is in display register mode, it shows an increasing 
> count. Console input is echo’d, but the INIT banner and subsequent prompts 
> are never printed and the read light on the RK05 flickers continuously as if 
> the system is trying to read the same sector over and over.
> 
> Figured I’d ping here in case this is a known failure mode to folks more 
> familiar with RSTS/E? Also posted over on the vcfed DEC forum.  FWIW, the 
> machine is passing all MAINDEC CPU, MMU, FP, KW11, and RK11 diagnostics.
> 
>   Cheers,
> —-FritzM.

The standard idle pattern is in the data lights.  I don't remember if the 
"fancy" pattern appeared in V7.0 or earlier, but in any case it's an 
undocumented SYSGEN option.  

In RSTS/E, the display register shows the system error count.  That's from I/O 
errors reported by the various drivers.

Do you have a second disk pack?  If so, you could use the DSKINT option in INIT 
to initialize a pack, with pattern tests.  That would show what the RSTS disk 
driver thinks of your RK05.

Something else you might try: when you start the system, don't enter line-feed 
for the quick start, but the START command.  That is a more verbose version 
which will display some additional messages.  If anything is getting disabled, 
it would show there.

paul



Re: wanted back issues IEEE ANNALS OF THE HISTORY OF COMPUTING bound or unbound... drop us a line off list please.

2018-12-31 Thread ED SHARPE via cctalk
Resolution of photos seems low, however.  Was this a set made at an
earlier time?

A useful reference though, as most would never be able to collect up a
set of these.
We are thankful for what we have.
Ed# SMECC
In a message dated 12/31/2018 10:12:47 AM US Mountain Standard Time, 
cctalk@classiccmp.org writes:


On 12/31/18 5:20 AM, ED SHARPE via cctalk wrote:
> Are these currently online?

They are on bitsavers under afips for now

The intent/agreement when I gave IEEE my scans was they were to be hosted by 
CHM,
but that hasn't happened yet.

They are also the entire volume, IEEE distributes them by paper and left off the
front matter.






Re: OCR old software listing

2018-12-31 Thread Fred Cisin via cctalk

On Mon, 31 Dec 2018, Larry Kraemer via cctalk wrote:
I used the libtiff-tools (Debian 8.x - 32 Bit) to extract all 61 .TIF's 
from the Multipage .tif file.  While the .tif's look descent, and 
RasterVect shows the .tif properties to be Group 4 Fax (1bpp) with 5100 
x 6600 pixels - 300 DPI, I can't get tesseract 3.x, TextBridge Classic 
2.0, or Irfanview with KADMOS Plugin to OCR any of the .tif files, with 
descent results.  I'd expect an OCR of 85 to 90 % correct conversion to 
ASCII text.


Software listings need more accuracy than that.
How many wrong characters does it take for a program not to work?
"desCent" isn't good enough.

85 to 90 % correct is a character wrong in every 6 to 10 characters.
How many errors is that PER LINE?
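
To put a number on the per-line question, here is a small Python sketch of the arithmetic. The 80-character line length is an assumption picked for illustration, and the function names are made up.

```python
def chars_per_error(accuracy: float) -> float:
    """Average number of characters read for each wrong character."""
    return 1.0 / (1.0 - accuracy)


def expected_errors_per_line(accuracy: float, line_length: int = 80) -> float:
    """Expected wrong characters in one line (line_length is an assumption)."""
    return line_length * (1.0 - accuracy)


if __name__ == "__main__":
    for accuracy in (0.85, 0.90):
        print(
            accuracy,
            round(chars_per_error(accuracy), 1),
            round(expected_errors_per_line(accuracy), 1),
        )
```

At 85% that works out to roughly one bad character in every 6.7, or about 12 on a full 80-character line; at 90% it is one in 10, or about 8 per line.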

"But, you can start with that, and just fix the errors, without retyping 
the rest."  Doing it that way is a desCent into madness.

BTDT.  wore out the T-shirts.


A competent typist can retype the whole thing faster than fixing an error 
in every six to ten characters.
Only if there is less than one error for every several hundred characters 
does "patching it" save time for a competent typist.
In general, for a competent typist, the fastest way to reposition the 
cursor to the next error in the line is to simply hit the keys of the 
intervening letters.
It is NOT to move the cursor with the mouse, then put your hand back on 
the keys to type a character.
Using cursor motion keys is no faster for a competent typist than hitting 
the keys of the letters to skip over.



TIP: display the OCR'ed text that is to be corrected in a font that 
exaggerates the difference between zero and the letter 'O', and between 
one and lower case 'l'.  There are some programs that will attempt to 
select those based on context.


--
Grumpy Ol' Fred ci...@xenosoft.com


Re: wanted back issues IEEE ANNALS OF THE HISTORY OF COMPUTING bound or unbound... drop us a line off list please.

2018-12-31 Thread Al Kossow via cctalk



On 12/30/18 5:04 PM, Paul Koning wrote:

> It might be helpful to state the policy (or choice, if any) explicitly so 
> people know what to expect.  
> 

I will return documents if requested.

Originals may or may not be donated to CHM for archiving, depending on
if they are duplicative or are of duplicative scope.

I do not archive any paper myself.

Currently, I am being asked to reduce my backlog inside of Shustek
and am making some hard choices.




Re: wanted back issues IEEE ANNALS OF THE HISTORY OF COMPUTING bound or unbound... drop us a line off list please.

2018-12-31 Thread Al Kossow via cctalk



On 12/31/18 5:20 AM, ED SHARPE via cctalk wrote:
> Are these currently online?

They are on bitsavers under afips for now

The intent/agreement when I gave IEEE my scans was they were to be hosted by 
CHM,
but that hasn't happened yet.

They are also the entire volume; IEEE distributes them paper by paper and
leaves off the front matter.






Re: OCR old software listing

2018-12-31 Thread Toby Thain via cctalk
On 2018-12-31 7:20 AM, Larry Kraemer via cctalk wrote:
> I used the libtiff-tools (Debian 8.x - 32 Bit) to extract all 61 .TIF's
> from the
> Multipage .tif file.  While the .tif's look descent, and RasterVect shows
> the
> .tif properties to be Group 4 Fax (1bpp) with 5100 x 6600 pixels - 300 DPI,
> I can't get tesseract 3.x, TextBridge Classic 2.0, or Irfanview with KADMOS
> Plugin to OCR any of the .tif files, with descent results.  I'd expect an
> OCR
> of 85 to 90 % correct conversion to ASCII text.
> 
> Typically, one of the three above Software packages will do a descent job
> of OCRing .tif's of such scans.  (Most PDF's end up at 72 x 72 DPI, and
> converting them to 300 DPI, allows them to be properly OCR'd.)
> 
> If anyone else has had better luck, I'd like to know what your process is.

I don't know if OCR software is sensitive to having correct resolution
(I've practically zero experience with it), but 300 dpi seems wrong for
Mattis' scans.

Seems they should be 600 dpi (21.7 cm x 28 cm).
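
The resolution argument can be checked with a few lines of Python. The pixel dimensions come from the quoted message; the function name is made up, and the only other input is the standard 2.54 cm per inch.

```python
CM_PER_INCH = 2.54


def page_size_cm(width_px: int, height_px: int, dpi: int):
    """Physical page size implied by pixel dimensions at a given resolution."""
    width_cm = width_px / dpi * CM_PER_INCH
    height_cm = height_px / dpi * CM_PER_INCH
    return width_cm, height_cm


if __name__ == "__main__":
    for dpi in (300, 600):
        width_cm, height_cm = page_size_cm(5100, 6600, dpi)
        print(dpi, round(width_cm, 2), round(height_cm, 2))
```

At 300 dpi the 5100 x 6600 pixel image would be 17 x 22 inches (about 43 x 56 cm), which is too big for a listing page; at 600 dpi it is 8.5 x 11 inches (about 21.6 x 27.9 cm), i.e. US Letter.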

--Toby

> 
> Thanks.
> 
> Larry
> 



Re: OCR old software listing

2018-12-31 Thread Larry Kraemer via cctalk
I used the libtiff-tools (Debian 8.x - 32 Bit) to extract all 61 .TIF's
from the
Multipage .tif file.  While the .tif's look descent, and RasterVect shows
the
.tif properties to be Group 4 Fax (1bpp) with 5100 x 6600 pixels - 300 DPI,
I can't get tesseract 3.x, TextBridge Classic 2.0, or Irfanview with KADMOS
Plugin to OCR any of the .tif files, with descent results.  I'd expect an
OCR
of 85 to 90 % correct conversion to ASCII text.

Typically, one of the three above Software packages will do a descent job
of OCRing .tif's of such scans.  (Most PDF's end up at 72 x 72 DPI, and
converting them to 300 DPI, allows them to be properly OCR'd.)

If anyone else has had better luck, I'd like to know what your process is.

Thanks.

Larry


Re: wanted back issues IEEE ANNALS OF THE HISTORY OF COMPUTING bound or unbound... drop us a line off list please.

2018-12-31 Thread ED SHARPE via cctalk
Are these currently online?
Ed#

> Thanks go to Al Kossow for supplying ACM with scans of AFIPS and perhaps
> more.)


Re: KIP 2050 image scanner

2018-12-31 Thread Guy Dunphy via cctalk
At 03:10 AM 31/12/2018 -0600, you wrote:
>I know of one outside of Chicago that is as is. I might be able to move it
>a state or two or help out with the arrangements.  I know nothing about
>it,  but I can text or email 2 pics.
>
>Paul


User manual: https://www.manualslib.com/download/647954/Kip-2050.html


KIP 2050 image scanner

2018-12-31 Thread Paul Anderson via cctalk
I know of one outside of Chicago that is as is. I might be able to move it
a state or two or help out with the arrangements.  I know nothing about
it,  but I can text or email 2 pics.

Paul