date:20190204

WTB : Card extender for HP 21xx ( Fit from 2116 to 21MX )

2019-02-04 Thread GerardCJAT via cctalk

It's all in the title.
Another very long shot, isn't it? ;-)

WTB : CTL, CTuL 9956 ??

2019-02-04 Thread GerardCJAT via cctalk

This is a (very) long shot!
I am looking for (Fairchild?) CTL / CTuL 9956 chips.
Anyone? Thanks.

HP 1000 L series, parts available, anyone ??

2019-02-04 Thread GerardCJAT via cctalk

I own an HP 1000 L series 200.
The cards are badly corroded at the connectors, BUT some other parts are in very
good shape.

I can offer:

HP "special SoS" processors: 1AA6-60004, 1AC5, 1AB5, 1AF5 (-60001)

The set of (3) EPROMs.
I cannot image them, but I am willing to send them to someone who will.
Al, maybe?

The power supply seems to be in pristine condition. I just have to check it more
closely if needed.

Other parts? Just ask.

"Fee":
 EPROMs: free
 Processors: shipping cost
 Power supply: shipping cost + a little more for packing material and my time.

I am in France, close to Paris.

Re: Mounting HP7970e 9-Trk 1/2" Tape Drive

2019-02-04 Thread Brent Hilpert via cctalk

On 2019-Feb-04, at 3:40 PM, Jack Harper via cctalk wrote:
> 
> I am mounting a couple of heavy (130-pounds each) HP7970e tape drives to a 
> 19" rack.
> 
> The screw holes that mate to the standard spaced holes on the right side of 
> the drive after you open the case are visible and obvious.
> 
> However, the holes on the left are hidden under the heavy die-cast(?) frame 
> of the drive.
> 
> Anyone know how to get to those three screw holes without a horrible disassembly?
> 
> There has to be a trick to this that I don't see (so what else is new? :)

If, or presuming, it's the same as the 7970A:

The left-side mount is actually a vertical right-angle bracket screwed to the 
drive proper.

1. Open the drive and unscrew that bracket from the inside.
2. Mount the bracket to the left side of the rack.
3. Lift the drive into place. The mounted bracket has a short right-angle
   bend at the bottom (and top) that provides a lip on which to rest some
   of the weight of the drive and guide it in.
4. Screw the right side of the drive to the rack.
5. Replace the screws removed in step 1, reattaching the drive to the
   bracket.

The right side of the drive should have little pieces of 1/8" thick aluminum
glued at the back of the rack-mount holes to match the thickness of the
left-side bracket, so that the drive seats parallel to the face of the rack.
In my (limited) experience the glue dries out, so those pieces tend to fall off
and may be lost when the drive is unmounted.

Re: Mounting HP7970e 9-Trk 1/2" Tape Drive

2019-02-04 Thread Chuck Guzis via cctalk

On 2/4/19 3:40 PM, Jack Harper via cctalk wrote:
> 
> 
> Greetings to the List -
> 
> I am mounting a couple of heavy (130-pounds each) HP7970e tape drives to
> a 19" rack.
> 
> The screw holes that mate to the standard spaced holes on the right side
> of the drive after you open the case are visible and obvious.
> 
> However, the holes on the left are hidden under the heavy die-cast(?)
> frame of the drive.
> 
> Anyone know how to get to those three screw holes without a horrible
> disassembly?
> 
> There has to be a trick to this that I don't see (so what else is new? :)
>

There's a mounting kit for this, as described here:

http://bitsavers.informatik.uni-stuttgart.de/pdf/hp/tape/7970/07970-90383_7970B_7970C_Operating_and_Service_Manual_upd_Feb76/07970-90383_7970B_7970C_Operating_and_Service_Manual_Section_1.pdf

Page 2-1, section 2-8, installation.

Whatever you do, be sure that your rack has an "anti-tip" foot (most
full-size 19" HP racks do).  When you open the drive, the entire assembly
but for the card cages and the power supply swings out.  It's more than
enough to tip a rack over.

--Chuck

Re: Looking for Limited Function Board

2019-02-04 Thread Paul Anderson via cctalk

I have a few of the limited function and programmer's panels. I do not
recall either of them being dependent on a particular backplane. I'll try to
pull a few later and check the part numbers on them.

Remember, a 54-X number can become a 70-X with the addition of a
cable or something simple.

Paul

On Mon, Feb 4, 2019 at 12:51 PM Anders Sandahl via cctech <
cct...@classiccmp.org> wrote:

> > Date: Sun, 3 Feb 2019 22:22:42 +0100
> > From: Pontus Pihlgren 
> > To: "General Discussion: On-Topic and Off-Topic Posts"
> >   
> > Subject: Looking for Limited Function Board
> > Message-ID: <20190203212242.gf24...@update.uu.se>
> > Content-Type: text/plain; charset=us-ascii
> >
> > Hi
> >
> > I'm restoring a PDP-8/a with the help of some
> > friends. The CPU is now passing the MAINDECs I've
> > thrown at it. The memory is a modern semiconductor
> > board my friend Anders Sandahl made.
> >
> > This machine is pieced together from several others
> > and the limited function panel I got does not match
> > the backplane I have.
> >
> > My theory is that DEC simplified the design of the
> > board to cut costs and the simpler design is not
> > compatible. Mine is labeled (on the PCB):
> >
> > "LIMITED FUNCTION BD.
> > 5411507
> > 5011506C-P2"
> >
> > And the one I need is:
> >
> > "LIMITED FUNCTION
> > 5411165
> > 5011167"
> >
> > However, the picture I have of the other is not so
> > good. I may have read the numbers wrong.
> >
> > I would very much like to buy one to finish this
> > project.
> >
> > /P
>
> If you don't get any bites, I'll draw up a board for you; it should be
> possible to move the switches over from the one you have. It's a bit of a
> shame to scrap an original board, but if you are careful not to break it,
> it can always be restored...
>
> /A
>
>

RE: PDP-11/45 RSTS/E boot problem

2019-02-04 Thread Wayne S via cctalk

Yep, I noticed that, but thought it was an idea you might want to explore, and
it’s simple enough to do.

Without the full output from the ls command and how it was executed, I was just
throwing it out there.

For instance, was the default dir where ls was run the same dir as when the
backgrounded one was run?

That would make a difference if the filesystem was corrupt. In previous
threads, there was an issue getting the proper image onto the disk, so there is
the potential for corruption.

There is the assumption, since boards were being worked on, that a software
problem is probably due to said hardware, even though diags pass. With
that assumption, shouldn’t you try to eliminate different hardware pieces? I
would try running something that uses memory and doesn’t use disk to narrow the
problem down.



Anyway,

Take care and good luck,



Wayne







From: Noel Chiappa 
Sent: Monday, February 4, 2019 12:43:09 PM
To: cctalk@classiccmp.org
Cc: j...@mercury.lcs.mit.edu
Subject: RE: PDP-11/45 RSTS/E boot problem

> From: Wayne S

> it might be a wonky filesystem. ...
> The corruption probably came because the entire disk was going bad.

This theory is contradicted by the fact (mentioned several times, including in
the message you were replying to) that doing a plain 'ls' bombs, but 'sleep
300 &; ls' works fine.

Noel

Mounting HP7970e 9-Trk 1/2" Tape Drive

2019-02-04 Thread Jack Harper via cctalk





Greetings to the List -

I am mounting a couple of heavy (130-pounds each) HP7970e tape drives 
to a 19" rack.


The screw holes that mate to the standard spaced holes on the right 
side of the drive after you open the case are visible and obvious.


However, the holes on the left are hidden under the heavy die-cast(?) 
frame of the drive.


Anyone know how to get to those three screw holes without a horrible disassembly?

There has to be a trick to this that I don't see (so what else is new? :)


Best,

Jack



--
Jack Harper, President
Secure Outcomes Inc
2942 Evergreen Parkway, Suite 300
Evergreen, Colorado 80439 USA

303.670.8375
303.670.3750 (fax)

http://www.secureoutcomes.net for Product Info.

RE: PDP-11/45 RSTS/E boot problem

2019-02-04 Thread Wayne S via cctalk

Noel,  it might be a wonky filesystem.

I’ve had ls -l seg fault because of bad attribute data on a file in a directory 
on Solaris.

Interestingly, ls (without the -l) worked okay.

Maybe fsck or the equivalent command may show something.

It was a Solaris system with many concurrent users, so I couldn’t take it down
to run fsck. I ended up writing a quick Perl program to just list file names
and then modified it to get the attributes. It seg faulted when it came to the
bad file name. I used Perl unlink to kill it and everything was okay.
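
A rough sketch of that two-step approach, in Python rather than the Perl used at the time. The original program is not shown, so the function name find_bad_entries and the assumption that a damaged entry surfaces as an OSError (rather than a crash) are illustrative only. It lists names first, then fetches attributes one entry at a time so the failing entry can be identified.

```python
import os


def find_bad_entries(directory):
    """Return (name, error) pairs for entries whose attributes cannot be read."""
    bad = []
    # Step 1: names only, which is roughly what a plain `ls` needs.
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        try:
            # Step 2: attributes, which is roughly what `ls -l` needs.
            os.lstat(path)
        except OSError as exc:
            # Assumption: the damage is reported as an error, not a crash.
            bad.append((name, str(exc)))
    return bad
```

Calling find_bad_entries(".") on a healthy directory should return an empty list; any name it does report is a candidate for closer inspection before anything is unlinked.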

The corruption probably came because the entire disk was going bad.

Just a thought.







From: cctalk  on behalf of Noel Chiappa via 
cctalk 
Sent: Monday, February 4, 2019 11:24:19 AM
To: cctalk@classiccmp.org
Cc: j...@mercury.lcs.mit.edu
Subject: Re: PDP-11/45 RSTS/E boot problem

> From: Jay Jaeger

> This sort of situation, where DEC diagnostics run OK but UNIX has issues
> was reported to be not all that uncommon - to the point where the urban
> legend was that some DEC FE's would fire up Unix V6 as a sort of system
> exerciser.

Amusing! Never heard that; our -11's were never under maintenance, so DEC FE's
never worked on them.

> Make a copy of ls, and see if the copy also fails

It acts just like the original; fails when run by itself, runs OK when 'sleep'
is also running (in the background).


> From: Bob Smith

> We finally had the cpu backplane replaced

Ow. Not an option for Fritz, I expect. (I dunno - anyone have a spare /45
backplane?)


> From: Paul Koning

> Is there any way to attach a logic analyzer to various data paths on
> this machine?

I had suggested to Fritz that the symptoms led me to believe that it was time
to deploy a LA, especially since the MM trap only occurs once between him
typing 'ls' and the process failing - i.e. easy to trigger on.

He offered me the option of looking at the IR or at the UNIBUS - I opted for
the IR so we can see _exactly_ what the machine _thinks_ it is doing! No
report back yet, though.

   Noel

Re: PDP-11/45 RSTS/E boot problem

2019-02-04 Thread Jon Elson via cctalk


On 02/04/2019 11:34 AM, Fritz Mueller via cctalk wrote:



>> 2.  Make a copy of ls, and see if the copy also fails
>> (different location on disk would mess with timing just a bit).
>
> Also done; the copy appears to behave identically to the original.


OK, here's a really complicated thing to try.  If you know 
the physical memory address of ls when it has the problem, 
write a machine language program that loads a copy of ls 
into that location and then tries to read it back.  You 
might be able to do this in Unix, having it start with the 
exact code of ls, but then has the tester above that and the 
entry point is for the test program.
This would detect a pattern sensitivity in the memory.  If 
ls, when actually running reads an instruction wrong, it 
could then try to read a bad address, and cause the MMU trap.
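
The write-then-read-back idea can be tried out on a host machine before writing any PDP-11 code. The Python below is only a simulation: the names readback_mismatches and FlakyMemory are made up, and the fault model (one cell dropping one bit on read) is an assumption, much simpler than real pattern sensitivity. A real test would be machine code running against the physical address in question.

```python
def readback_mismatches(memory, base, image):
    """Write image at base, read it back, list (address, wrote, read) mismatches."""
    for offset, word in enumerate(image):
        memory[base + offset] = word
    mismatches = []
    for offset, word in enumerate(image):
        got = memory[base + offset]
        if got != word:
            mismatches.append((base + offset, word, got))
    return mismatches


class FlakyMemory(dict):
    """Hypothetical fault model: bit 3 of one cell always reads back as zero."""

    def __init__(self, bad_address):
        super().__init__()
        self.bad_address = bad_address

    def __getitem__(self, address):
        value = super().__getitem__(address)
        if address == self.bad_address:
            value &= ~0o10
        return value


# Stand-in for the first few words of the program under test.
image = [0o010210, 0o170010, 0o000444]
good_result = readback_mismatches({}, 0o1000, image)
bad_result = readback_mismatches(FlakyMemory(0o1001), 0o1000, image)
```

Here good_result is empty, while bad_result names the one cell whose contents came back changed, which is the kind of report that would point at a specific chip.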


Jon

Re: PDP-11/45 RSTS/E boot problem

2019-02-04 Thread Jon Elson via cctalk


On 02/04/2019 11:20 AM, Fritz Mueller via cctalk wrote:


> The MMU classifies the error in register SR0; this decodes to a segment length
> error (access within the segment beyond configured bound).  As Noel notes,
> however, this is not consistent with the instructions we see at the point of
> fault.
OK, so the CPU presents an address that is within the 
segment bound, but the MMU declares it to be OUTSIDE the 
bounds of the segment. That could be a CPU problem, but 
likely would be the same with the MMU on or off, so the 
diags SHOULD catch that.  But, if the CPU is sending a good 
address, then it has to be the MMU is failing on the 
addition/comparison with the segment size.
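
To make the "addition/comparison" concrete, here is a small Python model of the two things the MMU does per access: add the page base and compare the block number against the page length. It is a sketch from memory of the KT11 register layout (page length field in PDR bits 14-8, expansion direction in bit 3, PAR counted in 32-word blocks), not a verified description of this particular board set, and the register values are made up for illustration.

```python
def kt11_translate(vaddr, par, pdr):
    """Return (physical_address, length_error) for one 16-bit virtual address."""
    page = (vaddr >> 13) & 0o7              # which of the 8 pages (segments)
    block = (vaddr >> 6) & 0o177            # 32-word block number within the page
    plf = (pdr[page] >> 8) & 0o177          # page length field
    expands_down = bool(pdr[page] & 0o10)   # ED bit: stack-style pages grow down
    if expands_down:
        length_error = block < plf
    else:
        length_error = block > plf
    # The "addition": page base (in 64-byte units) plus offset within the page.
    physical = ((par[page] << 6) + (vaddr & 0o17777)) & 0o777777
    return physical, length_error


# Illustrative values only: page 0 based at 0164000, 16 blocks long, read/write.
par = [0] * 8
pdr = [0] * 8
par[0] = 0o1640
pdr[0] = (0o17 << 8) | 0o6

inside = kt11_translate(0o1000, par, pdr)
outside = kt11_translate(0o2000, par, pdr)
```

With these values, inside maps to 0165000 with no error, and outside raises the length error because block 020 is past the configured length of 017. A fault in either the adder or the comparator would break one of these two results while leaving the CPU-side address perfectly good.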

>> Anyway, is it possible to borrow an MMU from somebody else?
>
> Potentially...  It is a two board option; I do have a spare for both of the
> boards, but these spares each are in need of other repairs at the moment.
>
> One slightly complicating factor is that I have a *very* early 11/45.  Most of
> my boards (including the MMU boards), as well as my backplane, pre-date the
> currently available schematics on bitsavers, etc., and there are no records
> regarding which ECOs have been applied on my hardware.  Thus my interest in
> tracking down ECOs/FCOs...  I've been picking my way through the list that Jay
> recently posted, verifying by looking at the greenwires which FCOs I have
> applied and which not.  It's a bit painstaking.


This could be messy, but DEC was FAIRLY good at making 
updates backwards compatible where possible.  So, it MAY be 
true that a later MMU will still work in this CPU.


Jon

Re: E01 (Was: Raspberry Pi floppy interface.

2019-02-04 Thread Fred Cisin via cctalk


>> eg: What is the structure of the "Header Case Information" block?
>> The E01 would be adequate (barely), if accompanied by an additional
>> "metadata" file that describes the physical format.  (In much more
>> detail than just "IBM PC 360K", etc.)  For MOST situations, OS,
>> encoding, bytes per sector, sectors per track, interleave, side pattern,
>> size of index and inter-sector gaps, etc. might do.  That would still be
>> far from PERFECT, because it would fail to catch . . .


On Mon, 4 Feb 2019, Chuck Guzis via cctalk wrote:

> Somewhere on the LOC website, there is a bit more detail--and source for
> Linux tools under "ewf-tools" is also available.
>
> The header information for E01 files is fairly rigid in structure.  But
> a text description of, say, a Victor 9000 floppy is kind of hard to put
> into 50 words or less.


not even 64 bit ones.
With a lot more space, you could write a reasonably usable description. 
But it would have to be an extensible variable length field to permit 
identifying various exceptions and oddities.  Therefore, the media 
physical format description would have to be a separate file from the E01 
"data from the disk".




> So, when an archivist talks about forensic data image, I scratch my head
> in bewilderment.  I try to put things in terms that they might
> understand; to wit, "If you had temporary custody of an extremely rare
> book, would you be content with just the text of the book, or would you
> want photographic images of every page?"


An excellent analogy.  I guess that they would not be interested in a 
separate file of a photographic image of every page to accompany the file 
of the text of the book.



> But I'm sure that Fred is well acquainted with the "This is what they
> told us was the Right Thing to Do, so that's what we want."  phenomenon.


BTDT.
In addition to Xenosoft (concurrently!), I was a community college 
professor, dealing with college administrators for 35 years.  Sometimes 
you can not get past the Misplaced Authority Syndrome to explain reality.


--
Grumpy Ol' Fred ci...@xenosoft.com

Re: E01 (Was: Raspberry Pi floppy interface.

2019-02-04 Thread Fred Cisin via cctalk

>> And, of course, a lossy compression, such as MP4 leaves room for an
>> enormous amount of steganographic data, with documents and data hidden
>> in porn.  (MANY different MP4 files will still play the same movie)


On Mon, 4 Feb 2019, John Foust via cctalk wrote:

> That would be a very sneaky criminal if they were still using floppies.


"Hey, Captain!  Does anybody know what this 8 inch square thing is, and 
whether it goes with a computer?"



That paragraph was not specifically about floppies, but about the entire 
concept of hidden data.

Re: PDP-11/45 RSTS/E boot problem

2019-02-04 Thread Noel Chiappa via cctalk

> From: Fritz Mueller

> I've had a bit of time in front of the machine to repro this and take a
> look. What I actually see is:

> R0 10
> R1 0
> R2 0
> R3 0
> R4 0
> R5 34
> R6 141774
> PC 000254

Argh. (Very red face!)

I worked out the trap stack layout by looking at m40.s and trap.c, and
totally forgot about the return PC (that's the 0444) from the call to
trap():

  0001740 13 141756 022050 13 00 00 00 34
  0001760 000444 31 177760 00 030351 10 010210 170010

I clearly should have looked at core(V) in the V6 manual!

The R6 you have recorded is correct for just after the trap; that's
the kernel mode SP, which points to the top of the kernel stack,
in segment 6 (in the swappable per-process kernel area, which runs
from 14-1776).
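
The segment claim is easy to check with a couple of lines of Python. This sketch assumes the usual PDP-11 split of a 16-bit virtual address into a 3-bit segment number (top bits) and a 13-bit offset; the function name is made up for illustration.

```python
def split_virtual_address(vaddr):
    """Split a 16-bit PDP-11 virtual address into (segment, offset_in_segment)."""
    return (vaddr >> 13) & 0o7, vaddr & 0o17777


# The R6 value recorded just after the trap.
segment, offset = split_virtual_address(0o141774)
print(segment, oct(offset))  # prints: 6 0o1774
```

So 141774 does fall in segment 6, a little under 02000 bytes above the segment base.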

So there is no R5 mystery, I was just confused. Back to the other two!

Noel

Re: E01 (Was: Raspberry Pi floppy interface.

2019-02-04 Thread Chuck Guzis via cctalk

On 2/4/19 3:40 PM, Jim Manley via cctalk wrote:
> Did someone say "punched cards ... with steganographic bits in chads that
> are only attached along a couple of edges"?

NCR CRAM?

Re: E01 (Was: Raspberry Pi floppy interface.

2019-02-04 Thread Jim Manley via cctalk

Did someone say "punched cards ... with steganographic bits in chads that
are only attached along a couple of edges"?

On Mon, Feb 4, 2019 at 4:36 PM Chuck Guzis via cctalk 
wrote:

> On 2/4/19 3:22 PM, John Foust via cctalk wrote:
> > At 04:49 PM 2/4/2019, Fred Cisin via cctalk wrote:
> >> And, of course, a lossy compression, such as MP4 leaves room for an
> >> enormous amount of steganographic data, with documents and data hidden in
> >> porn.  (MANY different MP4 files will still play the same movie)
> >
> > That would be a very sneaky criminal if they were still using floppies.
>
> As opposed to, say DECtape?
>
>

Re: E01 (Was: Raspberry Pi floppy interface.

2019-02-04 Thread Chuck Guzis via cctalk

On 2/4/19 3:22 PM, John Foust via cctalk wrote:
> At 04:49 PM 2/4/2019, Fred Cisin via cctalk wrote:
>> And, of course, a lossy compression, such as MP4 leaves room for an enormous 
>> amount of steganographic data, with documents and data hidden in porn.  
>> (MANY different MP4 files will still play the same movie)
> 
> That would be a very sneaky criminal if they were still using floppies.

As opposed to, say DECtape?

Re: E01 (Was: Raspberry Pi floppy interface.

2019-02-04 Thread Chuck Guzis via cctalk

On 2/4/19 2:49 PM, Fred Cisin via cctalk wrote:

> Well, conversion between E01 and IMD or teledisk formats looks
> straightforward.
> 
> http://www.forensicsware.com/blog/e01-file-format.html
> Is there a better description handy?
> 
> eg: What is the structure of the "Header Case Information" block?
> 
> The E01 would be adequate (barely), if accompanied by an additional
> "metadata" file that describes the physical format.  (In much more
> detail than just "IBM PC 360K", etc.)  For MOST situations, OS,
> encoding, bytes per sector, sectors per track, interleave, side pattern,
> size of index and inter-sector gaps, etc. might do.  That would still be
> far from PERFECT, because it would fail to catch several obvious ways to
> hide additional data on a disk;  eg. different physical interleaves that
> would still read the same on "normal" reading, or RSA encrypted data
> with the key stored in intersector gaps. Or, a small amount of data
> stored as locations of deliberate disk errors.  Think about ProLock.

Somewhere on the LOC website, there is a bit more detail--and source for
Linux tools under "ewf-tools" is also available.

The header information for E01 files is fairly rigid in structure.  But
a text description of, say, a Victor 9000 floppy is kind of hard to put
into 50 words or less.

There seems to be a notion that "an image used for forensic purposes" is
job-guarantee.  In fact, when the term "forensic" is used, it has to do
with crime detection and use as evidence in a legal proceeding.

That is, the point of forensic examination is to prove or disprove
something--completeness isn't necessary in all cases.  For example, if
examining DNA evidence is used to tie or eliminate a suspect, it isn't
necessary that the whole genome be sequenced; presence or absence of a
certain number of "markers" will do the job.

(I spent some years (1987-2000) providing products and training for law
enforcement forensics and am a life member of IACIS.)

So, when an archivist talks about forensic data image, I scratch my head
in bewilderment.  I try to put things in terms that they might
understand; to wit, "If you had temporary custody of an extremely rare
book, would you be content with just the text of the book, or would you
want photographic images of every page?"

But I'm sure that Fred is well acquainted with the "This is what they
told us was the Right Thing to Do, so that's what we want."  phenomenon.

--Chuck

Re: PDP-11/45 RSTS/E boot problem

2019-02-04 Thread Fritz Mueller via cctalk



>>> The obvious answer is bad memory.
>> 
>> At the board level, yes.  Deeper, it could be bad memory bits or bad
>> memory decode.
> 
> Yes, one of the standard early PDP-11 memory tests is the "no duplicate 
> address test".

I should say that the memory board is not _completely_ whack -- it is passing 
the rather thorough MAINDEC ZQMC, a 0-124k exerciser with multiple 
pattern/sequence tests which also kicks around the KT11.

That doesn't rule out the possibility that there is a lurker in there not 
covered by the DEC diags.  But if there is, it's something subtle...

Re: PDP-11/45 RSTS/E boot problem

2019-02-04 Thread Fritz Mueller via cctalk

> On Feb 4, 2019, at 2:28 AM, Noel Chiappa via cctalk  
> wrote:
> 
> I'm pretty sure the command only gets a few instructions in before it blows
> up.  Here are the process' registers, and the _entire_ contents of the user
> mode stack:
> 
> R0 10
> R1 0
> R2 0
> R3 0
> R4 34
> R5 444
> SP 177760
> PC 010210
> 
> 060: 00 20 01 10 14 17 071554 00

Okay, I've had a bit of time in front of the machine to repro this and take a 
look.  What I actually see is:

R0 10
R1 0
R2 0
R3 0
R4 0
R5 34
R6 141774
PC 000254

(remember, for the last, this will have been after taking a trap to 250, where 
I have the usual "BR .+2; HALT" catcher installed)

Also, memory at 060 (PA:164060) is all zeros as far as the eye can see...

I have a bit of water on the basement floor right now after the recent rains 
here, which is complicating setup of the LA.  There's a big puddle where I 
normally place it...

Re: E01 (Was: Raspberry Pi floppy interface.

2019-02-04 Thread John Foust via cctalk

At 04:49 PM 2/4/2019, Fred Cisin via cctalk wrote:
>And, of course, a lossy compression, such as MP4 leaves room for an enormous 
>amount of steganographic data, with documents and data hidden in porn.  (MANY 
>different MP4 files will still play the same movie)

That would be a very sneaky criminal if they were still using floppies.

- John

Re: PDP-11/45 RSTS/E boot problem

2019-02-04 Thread Paul Koning via cctalk




> On Feb 4, 2019, at 5:47 PM, Ethan Dicks  wrote:
> 
> On Mon, Feb 4, 2019 at 3:15 PM Paul Koning via cctalk
>  wrote:
>>> On Feb 4, 2019, at 3:43 PM, Noel Chiappa via cctalk  
>>> wrote:
>> That translates into "the problem depends on the physical address of the 
>> code being executed".
>> 
>> The obvious answer is bad memory.
> 
> At the board level, yes.  Deeper, it could be bad memory bits or bad
> memory decode.
> 
> A simple ones-and-zeros test can identify bad DRAMs.  It's not as
> likely to find bad decoding, which could result in the same chips
> tested more than once and other chips not tested at all.  I've found
> both problems in real MS11-L boards I have for my stack of 11/04 and
> 11/34s I'm testing.
> 
> ISTR in the DEC world, they were good about that.  I have multiple
> papertapes for the PDP-8, that I think were literally called "ones and
> zeros" and "memory address" tests.  I would think XXDP has something
> similar in terms of progressive tests that expect the previous stage
> passed.

Yes, one of the standard early PDP-11 memory tests is the "no duplicate address 
test".

paul

Re: E01 (Was: Raspberry Pi floppy interface.

2019-02-04 Thread Fred Cisin via cctalk


On Mon, 4 Feb 2019, Chuck Guzis via cctalk wrote:

Based on my conversations with clients, the problem is not the
equipment, but rather the lack of an open, vetted and documented file
format.

As an example, customers of mine insist on a "forensic" image file of
type E01 (Encase format), which has been endorsed by the Library of
Congress and several law enforcement agencies as a valid "forensic" format.

As insane as it sounds, I've had to provide floppy images as E01 files.
The insanity stems from the loss of information that would enable one to
recreate the original (e.g. sector headers, modulation, data rate, track
spacing, etc.).

But one does what one does to keep customers happy.


Well, conversion between E01 and IMD or teledisk formats looks 
straightforward.


http://www.forensicsware.com/blog/e01-file-format.html
Is there a better description handy?

eg: What is the structure of the "Header Case Information" block?

The E01 would be adequate (barely), if accompanied by an additional 
"metadata" file that describes the physical format.  (In much more detail 
than just "IBM PC 360K", etc.)  For MOST situations, OS, encoding, bytes 
per sector, sectors per track, interleave, side pattern, size of 
index and inter-sector gaps, etc. might do.  That would still be 
far from PERFECT, because it would fail to catch several obvious ways to 
hide additional data on a disk;  eg. different physical interleaves 
that would still read the same on "normal" reading, or RSA encrypted data 
with the key stored in intersector gaps. Or, a small amount of data 
stored as locations of deliberate disk errors.  Think about ProLock.
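
As a sketch of what such a companion "metadata" file might hold, here is one possible layout in Python, written out as JSON. Nothing here is part of E01 or of any existing tool: the class names, field names and the file name disk001.E01 are all hypothetical, and the fields simply follow the list above (OS, encoding, bytes per sector, interleave, plus room for oddities).

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class TrackFormat:
    cylinder: int
    side: int
    encoding: str             # e.g. "FM", "MFM", "GCR"
    bytes_per_sector: int
    sector_order: list        # physical order of sector IDs (the interleave)
    notes: list = field(default_factory=list)   # free-form exceptions/oddities


@dataclass
class DiskDescription:
    image_file: str           # the E01 this description accompanies
    operating_system: str
    tracks: list = field(default_factory=list)

    def to_json(self):
        return json.dumps(asdict(self), indent=2)


desc = DiskDescription("disk001.E01", "MS-DOS")
desc.tracks.append(
    TrackFormat(0, 0, "MFM", 512, [1, 2, 3, 4, 5, 6, 7, 8, 9],
                notes=["deliberate CRC error in sector 7"])
)
print(desc.to_json())
```

Describing the format per track, with an open-ended notes list, is one way to leave room for exceptions without trying to squeeze them into a fixed-size header field.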


And, of course, a lossy compression, such as MP4 leaves room for an 
enormous amount of steganographic data, with documents and data hidden in 
porn.  (MANY different MP4 files will still play the same movie)


--
Grumpy Ol' Fred ci...@xenosoft.com

Re: PDP-11/45 RSTS/E boot problem

2019-02-04 Thread Ethan Dicks via cctalk

On Mon, Feb 4, 2019 at 3:15 PM Paul Koning via cctalk
 wrote:
> > On Feb 4, 2019, at 3:43 PM, Noel Chiappa via cctalk  
> > wrote:
> That translates into "the problem depends on the physical address of the code 
> being executed".
>
> The obvious answer is bad memory.

At the board level, yes.  Deeper, it could be bad memory bits or bad
memory decode.

A simple ones-and-zeros test can identify bad DRAMs.  It's not as
likely to find bad decoding, which could result in the same chips
tested more than once and other chips not tested at all.  I've found
both problems in real MS11-L boards I have for my stack of 11/04 and
11/34s I'm testing.

ISTR in the DEC world, they were good about that.  I have multiple
papertapes for the PDP-8, that I think were literally called "ones and
zeros" and "memory address" tests.  I would think XXDP has something
similar in terms of progressive tests that expect the previous stage
passed.
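
The difference between the two kinds of test shows up clearly in a small simulation. The Python below is a sketch with made-up names; the fault model (one address line ignored by the decoder, so two addresses share a cell) is an assumption chosen to illustrate the point, not a description of any particular MS11-L failure.

```python
class AliasedMemory:
    """Simulated RAM in which one address line may be ignored (hypothetical fault)."""

    def __init__(self, size, dead_address_bit=None):
        self.cells = [0] * size
        if dead_address_bit is None:
            self.mask = ~0
        else:
            self.mask = ~(1 << dead_address_bit)

    def write(self, address, value):
        self.cells[address & self.mask] = value

    def read(self, address):
        return self.cells[address & self.mask]


def ones_and_zeros_test(mem, size):
    """Write and immediately read back all-ones, then all-zeros, at each address."""
    for pattern in (0o177777, 0):
        for address in range(size):
            mem.write(address, pattern)
            if mem.read(address) != pattern:
                return False
    return True


def address_test(mem, size):
    """Store each address in its own cell, then verify in a separate pass."""
    for address in range(size):
        mem.write(address, address)
    return all(mem.read(address) == address for address in range(size))


good = AliasedMemory(64)
bad = AliasedMemory(64, dead_address_bit=4)
results = {
    "good": (ones_and_zeros_test(good, 64), address_test(good, 64)),
    "bad": (ones_and_zeros_test(bad, 64), address_test(bad, 64)),
}
```

The healthy memory passes both tests. The one with the dead address line still passes ones-and-zeros, because every cell that is reached stores data correctly, but fails the address test as soon as two addresses turn out to share a cell.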

-ethan

Re: PDP-11/45 RSTS/E boot problem

2019-02-04 Thread Jay Jaeger via cctalk

On 2/4/2019 11:34 AM, Fritz Mueller via cctech wrote:
> 
>> On Feb 4, 2019, at 9:13 AM, Jay Jaeger  wrote:
>>
>> If he hasn't already, if Fritz has more than one memory board, he might
>> try swapping them to see if that changes anything.
> 
> I only have a 128kw MS11-L here to work with, unfortunately.  It's been 
> through a bunch of recent troubleshooting (tracking down and replacing failed 
> DRAMs).  I *think* it's pretty solid at this point (also passing some of the 
> hairier DEC diagnostics) but...
> 
> I'd be happy to try out a different memory board if anybody was interested in 
> sending out a loaner?  (I'm in the SF Bay area).
> 

Well it turns out I have a couple of spares, but maybe someone closer
would be easier (Madison, WI  53711)

I have an MS11-LB, 64Kw, M7891-BB and two MS11-LD, 128Kw, M7891-DB and
an M7891-D?

So, two of these are newer revisions (rather than M7891-xA) - I have no
idea what the difference is.  On that last one I probably didn't record
whether it was D, DB or DA.

I also have quite a few RK05 packs and would be willing to sell one (and
I have boxes to ship boards and packs in).  The ones I am most willing
to part with would need their open/close springs removed, as they are
broken and dangerous to the platter in their current condition, but are
otherwise fine.  I would just remove the spring.

$20 for a pack is what I usually price them at, plus shipping.  (PayPal,
preferably)

The board would be a loan (with compensation for time spent if it is bad
*and* gets fixed) ;).

Let me know - might take me a couple of days to hunt the board down and
remove the spring and re-test the pack and pack everything up and ship
it.  (in my 11/34 which runs @rkunix V6 just fine.  ;))

JRJ

Re: Raspberry Pi floppy interface.

2019-02-04 Thread Chuck Guzis via cctalk

On 2/4/19 1:17 PM, Paul Koning wrote:

> Yes, but if the current format is wrong for the job, and people who should 
> know better do not realize this, it would be a good idea (as a separate 
> activity) to educate them and propose a better answer.
Yes, I know, and I've tried and written volumes about the subject when
E01 for floppies is requested.  To no avail.

I'll make E01 files and then toss in the IMD file as a freebie.  The
simple fact is that archivists in general seem to have a "don't rock the
boat" mentality.

--Chuck

Re: Raspberry Pi floppy interface.

2019-02-04 Thread Paul Koning via cctalk




> On Feb 4, 2019, at 4:05 PM, Chuck Guzis via cctalk  
> wrote:
> 
> ...
> Based on my conversations with clients, the problem is not the
> equipment, but rather the lack of an open, vetted and documented file
> format.
> 
> As an example, customers of mine insist on a "forensic" image file of
> type E01 (Encase format), which has been endorsed by the Library of
> Congress and several law enforcement agencies as a valid "forensic" format.
> 
> As insane as it sounds, I've had to provide floppy images as E01 files.
> The insanity stems from the loss of information that would enable one to
> recreate the original (e.g. sector headers, modulation, data rate, track
> spacing, etc.).
> 
> But one does what one does to keep customers happy.
> 
> --Chuck

Yes, but if the current format is wrong for the job, and people who should know 
better do not realize this, it would be a good idea (as a separate activity) to 
educate them and propose a better answer.

paul

Re: PDP-11/45 RSTS/E boot problem

2019-02-04 Thread Paul Koning via cctalk

> On Feb 4, 2019, at 3:43 PM, Noel Chiappa via cctalk  
> wrote:
> 
>> From: Wayne S
> 
>> it might be a wonky filesystem. ...
>> The corruption probably came because the entire disk was going bad.
> 
> This theory is contradicted by the fact (mentioned several times, including in
> the message you were replying to) that doing a plain 'ls' bombs, but 'sleep
> 300 &; ls' works fine.

That translates into "the problem depends on the physical address of the code 
being executed".

The obvious answer is bad memory.  Another possibility occurs to me: bad bits 
in the MMU (UISAR0 register if I remember correctly).  Bad memory is likely to 
show up with a few bits wrong; if UISAR0 has a stuck bit so the "plain" case 
maps incorrectly you'd expect to come up with execution that looks nothing at 
all like what was intended.
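
A small Python model shows why a stuck bit in a page address register would make the failure depend on where the process happens to be loaded. This is a sketch under stated assumptions: the 12-bit PAR and 18-bit physical address match the 11/45 as I understand it, but the choice of stuck bit, the PAR values and the function name are invented for illustration.

```python
def physical_address(par, offset, stuck_high_bits=0):
    """Map an offset within a page to an 18-bit physical address via a PAR value."""
    effective_par = (par | stuck_high_bits) & 0o7777
    return ((effective_par << 6) + offset) & 0o777777


STUCK = 0o0100  # hypothetical: this PAR bit always reads as 1

for par in (0o1640, 0o1740):
    wanted = physical_address(par, 0o254)
    actual = physical_address(par, 0o254, STUCK)
    verdict = "ok" if wanted == actual else "WRONG"
    print(oct(par), oct(wanted), oct(actual), verdict)
```

When the process is loaded where the stuck bit is supposed to be 0, every fetch lands in the wrong block and execution goes off into the weeds; loaded somewhere the bit is already 1 (for example, because a background 'sleep' pushed it elsewhere), the same program runs normally.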

paul

Re: Raspberry Pi floppy interface.

2019-02-04 Thread Chuck Guzis via cctalk

On 1/18/19 7:40 AM, geneb via cctalk wrote:
> This looks like a project with a ton of potential for archiving media
> without having to deal with the asshattery of the kryoflux people.
> 
> https://github.com/picosonic/bbc-fdc

Yes, you can do this, as I've said many times before, with just about
any semi-modern MCU that has sufficient memory.  I've got a couple of
prototypes from years back, that, for example, use AVR (Mega162 and
Mega256) to do this.  The 162 version had write buffers also, so it
could emulate a floppy quite nicely.

An Orange Pi Zero with appropriate buffers could also do this and would
essentially be a $10 sledgehammer.
-
Based on my conversations with clients, the problem is not the
equipment, but rather the lack of an open, vetted and documented file
format.

As an example, customers of mine insist on a "forensic" image file of
type E01 (Encase format), which has been endorsed by the Library of
Congress and several law enforcement agencies as a valid "forensic" format.

As insane as it sounds, I've had to provide floppy images as E01 files.
The insanity stems from the loss of information that would enable one to
recreate the original (e.g. sector headers, modulation, data rate, track
spacing, etc.).

But one does what one does to keep customers happy.

--Chuck

RE: PDP-11/45 RSTS/E boot problem

2019-02-04 Thread Noel Chiappa via cctalk

> From: Wayne S

> it might be a wonky filesystem. ...
> The corruption probably came because the entire disk was going bad.

This theory is contradicted by the fact (mentioned several times, including in
the message you were replying to) that doing a plain 'ls' bombs, but 'sleep
300 &; ls' works fine.

Noel

Re: Looking for Limited Function Board

2019-02-04 Thread Anders Sandahl via cctalk

> Date: Sun, 3 Feb 2019 22:22:42 +0100
> From: Pontus Pihlgren 
> To: "General Discussion: On-Topic and Off-Topic Posts"
>   
> Subject: Looking for Limited Function Board
> Message-ID: <20190203212242.gf24...@update.uu.se>
> Content-Type: text/plain; charset=us-ascii
>
> Hi
>
> I'm restoring a PDP-8/a with the help of some
> friends. The CPU is now passing the MAINDECs I've
> thrown at it. The memory is a modern semiconductor
> board my friend Anders Sandahl made.
>
> This machine is pieced together from several others
> and the limited function panel I got does not match
> the backplane I have.
>
> My theory is that DEC simplified the design of the
> board to cut costs and the simpler design is not
> compatible. Mine is labeled (on the PCB):
>
> "LIMITED FUNCTION BD.
> 5411507
> 5011506C-P2"
>
> And the one I need is:
>
> "LIMITED FUNCTION
> 5411165
> 5011167"
>
> However, the picture I have of the other is not so
> good. I may have read the numbers wrong.
>
> I would very much like to buy one to finish this
> project.
>
> /P

If you don't get any bites, I'll draw up a board for you; it should be
possible to move the switches over from the one you have. It's a bit of a
shame to scrap an original board, but if you're careful not to break it,
it can always be restored...

/A

Re: PDP-11/45 RSTS/E boot problem

2019-02-04 Thread Noel Chiappa via cctalk

> From: Jay Jaeger

> This sort of situation, where DEC diagnostics run OK but UNIX has issues
> was reported to be not all that uncommon - to the point where the urban
> legend was that some DEC FE's would fire up Unix V6 as a sort of system
> exerciser.

Amusing! Never heard that; our -11's were never under maintenance, so DEC FE's
never worked on them.

> Make a copy of ls, and see if the copy also fails

It acts just like the original; fails when run by itself, runs OK when 'sleep'
is also running (in the background).


> From: Bob Smith

> We finally had the cpu backplane replaced

Ow. Not an option for Fritz, I expect. (I dunno - anyone have a spare /45
backplane?)


> From: Paul Koning

> Is there any way to attach a logic analyzer to various data paths on
> this machine?

I had suggested to Fritz that the symptoms led me to believe that it was time
to deploy a LA, especially since the MM trap only occurs once between him
typing 'ls' and the process failing - i.e. easy to trigger on.

He offered me the option of looking at the IR or at the UNIBUS - I opted for
the IR so we can see _exactly_ what the machine _thinks_ it is doing! No
report back yet, though.

   Noel

Re: PDP-11/45 RSTS/E boot problem

2019-02-04 Thread Warner Losh via cctalk

On Mon, Feb 4, 2019 at 11:35 AM Paul Koning via cctalk <
cctalk@classiccmp.org> wrote:

>  The spec says allowed tolerances are +/- 5%.  He knew the reality for
> correct operation was -0%, +5%, so he tweaked all the supplies to read a
> hair above nominal.
>

Ah, the good old days...  I recall our PDP-11 tech tweaking +5V from 5.05V
to 4.95V and back again to demonstrate that tiny differences matter a lot
on one of the cranky 11/23+'s we had after I made a particularly unhelpful
teenage smart ass remark... The 11/23+ wouldn't boot at the slightly lower
than full voltage. It was cranky for a couple of years. Before that unit was
retired, the 5V and 12V rails had been tweaked up to 5.2V and 12.5V in an
effort to keep the system alive long enough to transition customers from it
to a new Vax installed to deal with the growth in demand...  In the end, we
put that 11/23+ back in service for developers with a different disk
controller and it was happy back at +5.05V / +12.1V...
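
As a quick arithmetic check of the point above, here is a small sketch comparing the readings mentioned in this message against the nominal +/- 5% spec and against the "-0%, +5%" rule of thumb from Paul's field service story. The function name and the list of readings are just for illustration; the voltages themselves are the ones quoted above.

```python
def in_band(volts, nominal, low_pct, high_pct):
    """True if volts lies within nominal -low_pct% .. +high_pct%."""
    low = nominal * (1 - low_pct / 100)
    high = nominal * (1 + high_pct / 100)
    return low <= volts <= high


# (nominal, measured) pairs taken from the message above
readings = [(5.0, 4.95), (5.0, 5.05), (5.0, 5.2), (12.0, 12.5), (12.0, 12.1)]

for nominal, volts in readings:
    spec_ok = in_band(volts, nominal, 5, 5)       # published tolerance
    practical_ok = in_band(volts, nominal, 0, 5)  # "-0%, +5%" rule of thumb
    print(f"{volts:5.2f}V on {nominal:4.1f}V rail: "
          f"spec={spec_ok} practical={practical_ok}")
```

Every reading is inside the published +/- 5% band, including the 4.95V that would not boot; only the -0% floor separates the failing case from the working ones.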

Warner

Re: PDP-11/45 RSTS/E boot problem

2019-02-04 Thread Fritz Mueller via cctalk

> On Feb 4, 2019, at 10:34 AM, Paul Koning via cctalk  
> wrote:
> 
>> On Feb 4, 2019, at 12:18 PM, Bob Smith via cctalk  
>> wrote:
>> 
>> I keep wondering about the psu.
> 
> Good theory.

I'll give these a double-check...

> Is there any way to attach a logic analyzer to various data paths on this 
> machine?

Yes; the caveat is that it's really only practical to have one card out on 
extenders at a time (and I have only one hex extender in any case).  So you 
have to be a little choosy about what you want to capture/see.

Some things can be plucked from the backplane, but I like to avoid sliding too 
many probe connectors onto the backplane pins unless I really have to, in order 
to avoid perturbing the 45-year-old wraps.

  --FritzM.

Re: PDP-11/45 RSTS/E boot problem

2019-02-04 Thread Paul Koning via cctalk

> On Feb 4, 2019, at 12:18 PM, Bob Smith via cctalk  
> wrote:
> 
> I keep wondering about the psu. I just recall the /45 in my lab was
> always a little flakey.
> We suspected everything in the machine, and it was akin to chasing a sea
> bat in the dark.

Good theory.

In RSTS development we once ran into DMC-11s not working reliably.  The field 
service tech knew exactly what to look for, and started checking all the supply 
voltages.  The spec says allowed tolerances are +/- 5%.  He knew the reality 
for correct operation was -0%, +5%, so he tweaked all the supplies to read a 
hair above nominal.

Is there any way to attach a logic analyzer to various data paths on this 
machine?

paul

Re: PDP-11/45 RSTS/E boot problem

2019-02-04 Thread Noel Chiappa via cctalk

> From: Jon Elson

> Does the MMU classify what the error condition was

Yes, there are a series of bits in SSR0 to indicate the particular error:
'non-resident', 'length', 'read-only', etc (and also the segment number the
error's from).  As my message mentioned, we're seeing the 'length' error bit
on, and for segment 1 (which the instruction isn't using).

> is it possible to borrow an MMU from somebody else?  

Fritz does have a spare board, but it has known errors. We haven't thought about
borrowing one yet, it may come to that.

> If this fault could be caused by memory

Well, _some_ of it could. E.g. if the 'jsr r5, csv' is read as 'jsr r4, csv',
dropping the '1' bit in the register number, that would explain the wonky R5
content - R4 does contain the 034 that should be in R5, I have just noticed.
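
To make the single-bit claim concrete, here is a small sketch of the JSR encoding (004RDD octal, with the linkage register in bits 8-6). It assumes the destination is assembled as PC-relative (mode 67); the helper name is made up for illustration.

```python
JSR_OPCODE = 0o004000
PC_RELATIVE = 0o67   # destination mode 6, register 7 (assumed addressing mode)


def encode_jsr(link_reg, dst=PC_RELATIVE):
    """Return the first word of 'jsr rN, dst' as a 16-bit integer."""
    return JSR_OPCODE | (link_reg << 6) | dst


good = encode_jsr(5)   # jsr r5, csv
bad = encode_jsr(4)    # jsr r4, csv
diff = good ^ bad

print(f"jsr r5: {good:06o}  jsr r4: {bad:06o}  differing bits: {diff:06o}")
```

The two words differ only in bit 6 (0100 octal), so one dropped bit in the instruction word is enough to turn the R5 linkage into R4.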

But that doesn't explain the bogus MM trap. Although I suppose there could be
several different problems, all at the same time.

> does this machine have a cache? 

Nope.

Noel

Re: PDP-11/45 RSTS/E boot problem

2019-02-04 Thread Fritz Mueller via cctalk

> On Feb 4, 2019, at 9:13 AM, Jay Jaeger  wrote:
> 
> If he hasn't already, if Fritz has more than one memory board, he might
> try swapping them to see if that changes anything.

I only have a 128kw MS11-L here to work with, unfortunately.  It's been through
a bunch of recent troubleshooting (tracking down and replacing failed DRAMs).
I *think* it's pretty solid at this point (also passing some of the hairier DEC
diagnostics) but...

I'd be happy to try out a different memory board if anybody was interested in 
sending out a loaner?  (I'm in the SF Bay area).

> The issue I'd see with the MMU swapout idea would be finding one that
> would be ECO-compatible with the rest of the processor.

Yes; per previous email.

> Other things I might be tempted to try in order to coax more information
> out of the situation:
> 
> 1.  Run the DEC system exerciser, if that has not already been done.

Done; passes consistently.

> 2.  Make a copy of ls, and see if the copy also fails
>     (different location on disk would mess with timing just a bit).

Also done; the copy appears to behave identically to the original.

> 3.  Use SimH to build a pack image with an instance of ls that is not
>     pure text (no -n or -i flag)
> 4.  Use SimH to build an ls that does/does not start the data segment
>     for ls at 0 (has / does not have the -i flag) [I have not looked to
>     see how ls would normally be built.]

These may yet be interesting experiments.

> 5.  Use SimH to gen a pack image with a kernel that is not split I/D

The image I'm using is a /40-compatible one, built from the distro tape, the 
one that you get before going on to rebuild for /45. So, as Noel has reminded 
me, no split I/D.  But we could try the opposite experiment (build a pack 
customized for /45) and see if that's different?

The pain with that is that it takes me almost three hours to image a pack with 
PDP11GUI over my DL11, and I have a very limited set of packs with which to 
work (others on this list have graciously offered to help there; I need to send 
y'all some shipping boxes!)

cheers,
  --FritzM.

Re: PDP-11/45 RSTS/E boot problem

2019-02-04 Thread Fritz Mueller via cctalk

Hi all; thanks for the write-up on the issue, Noel! 

> On Feb 4, 2019, at 8:24 AM, Jon Elson via cctalk  
> wrote:
> Is this truly a fault given by the memory management system, or some other 
> kind of fault (Unibus timeout or memory parity error)?

Trap 250, which is explicitly memory management.

> Does the MMU classify what the error condition was, or just assume the trap 
> handler can figure it out by looking at the registers?

The MMU classifies the error in register SR0; this decodes to a segment length 
error (access within the segment beyond configured bound).  As Noel notes, 
however, this is not consistent with the instructions we see at the point of 
fault.

> Anyway, is it possible to borrow an MMU from somebody else?

Potentially...  It is a two board option; I do have a spare for both of the 
boards, but these spares each are in need of other repairs at the moment.

One slightly complicating factor is that I have a *very* early 11/45.  Most of 
my boards (including the MMU boards), as well as my backplane, pre-date the 
currently available schematics on bitsavers, etc., and there are no records 
regarding which ECOs have been applied on my hardware.  Thus my interest in 
tracking down ECOs/FCOs...  I've been picking my way through the list that Jay 
recently posted, verifying by looking at the greenwires which FCO's I have 
applied and which not.  It's a bit painstaking.

> I can easily imagine that the diags can't test every possible bit combination 
> while the diags are ALSO running in memory.

The "KT11-C Exerciser" diag is a pretty mean beast.  It relocates itself around 
memory, and is pretty thorough.  It takes about 45 mins to work its way through 
testing access to all memory from all segments in all processor modes.  It uses 
the line clock and terminal interface as a source of interrupts to check fault 
behavior wrt. interrupts, etc. etc.  This, and all the other KT11 diagnostics, 
pass cleanly on the machine.

> OH, does this machine have a cache?

Nope, no cache.

One other thing of note, per the genesis of this thread, RSTS/E also has a 
consistent failure at boot.  As far as we got looking into that before jumping 
tracks to V6 Unix, the symptoms there looked similar.  Paul thought the 
wreckage indicated a bad binary, but an image of the disk we were trying to 
boot works fine under SIMH.

RT-11, being much more of a honey badger, just works :-)

--FritzM.

Re: PDP-11/45 RSTS/E boot problem

2019-02-04 Thread Bob Smith via cctalk

I keep wondering about the psu. I just recall the /45 in my lab was
always a little flakey.
We suspected everything in the machine, and it was akin to chasing a sea
bat in the dark.
Our environment in ML 1-2 was not the best, the floor actually moved,
we were right at the mid building elevator.
We finally had the cpu backplane replaced and the problem went away.

On Mon, Feb 4, 2019 at 12:13 PM Jay Jaeger via cctalk
 wrote:
>
> On 2/4/2019 10:20 AM, Jon Elson via cctalk wrote:
> > On 02/04/2019 04:28 AM, Noel Chiappa via cctalk wrote:
> >>
> >> First oddity - the problem is dependent on the location of the command
> >> in main
> >> memory! If Fritz says "sleep 360 &", to run a trivial command in the
> >> background, and _then_ says 'ls' - it works (so we know the binary of
> >> 'ls' on
> >> disk is OK)! We _think_ this is because the process executing the 'sleep'
> >> takes up a chunk of main memory, and thus changes the location of the
> >> process
> >> executing the 'ls'.
> >>
> >>
> > OK, the classic Heisenbug.  Is this truly a fault given by the memory
> > management system, or some other kind of fault (Unibus timeout or memory
> > parity error)?  If really related to MMU, then maybe there is a bad bit
> > in the MMU that is causing it to reference the wrong segment entry or
> > somehow thinking the setting of that segment entry is invalid.  Does the
> > MMU classify what the error condition was, or just assume the trap
> > handler can figure it out by looking at the registers?
> >
> > Anyway, is it possible to borrow an MMU from somebody else?  I can
> > easily imagine that the diags can't test every possible bit combination
> > while the diags are ALSO running in memory.
> > So, a somewhat cryptic bug could go undetected.
> >
> > If this fault could be caused by memory, then it may be a
> > pattern-sensitive error, and ls is just the perfect pattern to trip it up.
> >
> > Jon
> >
>
> If he hasn't already, if Fritz has more than one memory board, he might
> try swapping them to see if that changes anything.
>
> The issue I'd see with the MMU swapout idea would be finding one that
> would be ECO-compatible with the rest of the processor.
>
> This sort of situation, where DEC diagnostics run OK but UNIX has issues
> was reported to be not all that uncommon - to the point where the urban
> legend was that some DEC FE's would fire up Unix V6 as a sort of system
> exerciser.
>
> Other things I might be tempted to try in order to coax more information
> out of the situation:
>
> 1.  Run the DEC system exerciser, if that has not already been done.
> 2.  Make a copy of ls, and see if the copy also fails
> (different location on disk would mess with timing just a bit).
> 3.  Use SimH to build a pack image with an instance of ls that is not
> pure text (no -n or -i flag)
> 4.  Use SimH to build an ls that does/does not start the data segment
> for ls at 0 (has / does not have the -i flag) [I have not looked to
> see how ls would normally be built.]
> 5.  Use SimH to gen a pack image with a kernel that is not split I/D
>
> JRJ

Re: PDP-11/45 RSTS/E boot problem

2019-02-04 Thread Jay Jaeger via cctalk

On 2/4/2019 10:20 AM, Jon Elson via cctalk wrote:
> On 02/04/2019 04:28 AM, Noel Chiappa via cctalk wrote:
>>
>> First oddity - the problem is dependent on the location of the command
>> in main
>> memory! If Fritz says "sleep 360 &", to run a trivial command in the
>> background, and _then_ says 'ls' - it works (so we know the binary of
>> 'ls' on
>> disk is OK)! We _think_ this is because the process executing the 'sleep'
>> takes up a chunk of main memory, and thus changes the location of the
>> process
>> executing the 'ls'.
>>
>>
> OK, the classic Heisenbug.  Is this truly a fault given by the memory
> management system, or some other kind of fault (Unibus timeout or memory
> parity error)?  If really related to MMU, then maybe there is a bad bit
> in the MMU that is causing it to reference the wrong segment entry or
> somehow thinking the setting of that segment entry is invalid.  Does the
> MMU classify what the error condition was, or just assume the trap
> handler can figure it out by looking at the registers?
> 
> Anyway, is it possible to borrow an MMU from somebody else?  I can
> easily imagine that the diags can't test every possible bit combination
> while the diags are ALSO running in memory.
> So, a somewhat cryptic bug could go undetected.
> 
> If this fault could be caused by memory, then it may be a
> pattern-sensitive error, and ls is just the perfect pattern to trip it up.
> 
> Jon
> 

If he hasn't already, if Fritz has more than one memory board, he might
try swapping them to see if that changes anything.

The issue I'd see with the MMU swapout idea would be finding one that
would be ECO-compatible with the rest of the processor.

This sort of situation, where DEC diagnostics run OK but UNIX has issues
was reported to be not all that uncommon - to the point where the urban
legend was that some DEC FE's would fire up Unix V6 as a sort of system
exerciser.

Other things I might be tempted to try in order to coax more information
out of the situation:

1.  Run the DEC system exerciser, if that has not already been done.
2.  Make a copy of ls, and see if the copy also fails
(different location on disk would mess with timing just a bit).
3.  Use SimH to build a pack image with an instance of ls that is not
pure text (no -n or -i flag)
4.  Use SimH to build an ls that does/does not start the data segment
for ls at 0 (has / does not have the -i flag) [I have not looked to
see how ls would normally be built.]
5.  Use SimH to gen a pack image with a kernel that is not split I/D

JRJ

Re: PDP-11/45 RSTS/E boot problem

2019-02-04 Thread Jon Elson via cctalk


On 02/04/2019 04:28 AM, Noel Chiappa via cctalk wrote:

So I've been helping Fritz look into his -11/45 problem, and things have
gotten to a point where I'd like to reach out for help, more eyes, etc.


OH, does this machine have a cache?  We had a /45, and got a 
cache module for it.  It DRAMATICALLY speeded up the system, 
BUT, it caused extremely weird and irreproducible results, 
so we had to take it out.  I never understood the problem, 
but I think it might have been that the cache did not snoop 
DMA transfers.  It ran a single user fine all day, when a 
second user started doing things, both users got crashes and 
garbled data.  This was on RSX-11M.


Jon

Re: PDP-11/45 RSTS/E boot problem

2019-02-04 Thread Jon Elson via cctalk


On 02/04/2019 04:28 AM, Noel Chiappa via cctalk wrote:


First oddity - the problem is dependent on the location of the command in main
memory! If Fritz says "sleep 360 &", to run a trivial command in the
background, and _then_ says 'ls' - it works (so we know the binary of 'ls' on
disk is OK)! We _think_ this is because the process executing the 'sleep'
takes up a chunk of main memory, and thus changes the location of the process
executing the 'ls'.


OK, the classic Heisenbug.  Is this truly a fault given by 
the memory management system, or some other kind of fault 
(Unibus timeout or memory parity error)?  If really related 
to MMU, then maybe there is a bad bit in the MMU that is 
causing it to reference the wrong segment entry or somehow 
thinking the setting of that segment entry is invalid.  Does 
the MMU classify what the error condition was, or just 
assume the trap handler can figure it out by looking at the
registers?


Anyway, is it possible to borrow an MMU from somebody else?  
I can easily imagine that the diags can't test every 
possible bit combination while the diags are ALSO running in 
memory.

So, a somewhat cryptic bug could go undetected.

If this fault could be caused by memory, then it may be a 
pattern-sensitive error, and ls is just the perfect pattern 
to trip it up.


Jon

Re: PDP-11/45 RSTS/E boot problem

2019-02-04 Thread Noel Chiappa via cctalk

So I've been helping Fritz look into his -11/45 problem, and things have
gotten to a point where I'd like to reach out for help, more eyes, etc.

I have to say, I spent almost a decade at the start of my career working on
PDP-11 hardware ('new build' DMA devices, as well as fixing broken stuff), and
software, and this is, I think, the most confusing and difficult problem I
have _ever_ seen on one. Hence the above...

What's _particularly_ confusing and difficult is that it seems like _three_
separate, un-related things all go wrong at exactly (2 of 3) or close to (the
other) the same time. And the machine now passes all the diagnostics that have
been thrown at it, particularly the KT11 and RK11 diagnostics (why this is
important will become clear). So here's what we've found to date.


The failure we're looking at is that an attempt to execute the 'ls' command
under Unix V6 fails; it gets a memory management fault, and dumps core.

AFAICT, the shell successfully forks, and its attempt to do an exec() of 'ls'
sort of works (more below), but a few instructions in, we get the MM fault - but
there's even more wrong when that happens (details toward the end below).

I've been looking at the core dump produced by the process, which gives me the
registers at the time of the trap, the user's stack, etc - but not a copy of
the binary code - the 'ls' command is a so-called 'pure text', i.e. the binary
is segregated into separate, potentially shared, read-only 'segment(s)' (only
1 in this case) of the PDP-11's User mode address space, and is not included
in the process dump.

(I use the term 'segment', which is actually what DEC called them in the first
version of the PDP-11/45 processor handbook, because that's what they are, not
pages, as pages are on most systems. I assume they changed to 'page' for
marketing reasons. And please, can we hold debate about this and focus on the
problem? Thanks! :-)

I do have the ability to look at the binary that it _should_ be executing, by
examining the command in its file. Also, Fritz has worked out that he can
patch the MM trap vector (before trying to do the 'ls') to halt the machine
when it happens, so he can read out all the KT11 registers, look at the actual
program in main memory, etc.

First oddity - the problem is dependent on the location of the command in main
memory! If Fritz says "sleep 360 &", to run a trivial command in the
background, and _then_ says 'ls' - it works (so we know the binary of 'ls' on
disk is OK)! We _think_ this is because the process executing the 'sleep'
takes up a chunk of main memory, and thus changes the location of the process
executing the 'ls'.

The problem is that I'm reluctant to try and change anything (e.g. to have the
OS print out anything) because that will change the location of things, and we
may (likely?) not see the problem any more. With nothing changed, it _reliably_
fails - I've looked at two different core dumps, and all the essential data
(registers, user stack etc) are identical. The KT11 registers all seem to be
the same, too.

So, on to details.


I'm pretty sure the command only gets a few instructions in before it blows
up.  Here are the process' registers, and the _entire_ contents of the user
mode stack:

R0 10
R1 0
R2 0
R3 0
R4 34
R5 444
SP 177760
PC 010210

060: 00 20 01 10 14 17 071554 00

010210 turns out to be the first word in 'csv', which is an internal routine
which PDP-11 C uses to build a stack frame - _every_ C routine starts with
a "JSR R5, CSV" instruction as the first thing it does.

So looking at the stack (which looks good; it contains a valid 'argc' and 'argv'
that the process would be started with), and the registers, I'm pretty sure
it does these starting instructions OK:

  start:
        setd
        mov sp,r0
        mov (r0),-(sp)
        tst (r0)+
        mov r0,2(sp)
        jsr pc,_main

  _main:
        jsr r5,csv

and then blows up on:

  csv:
        mov r5,r0

So it's the 8th instruction in that blows up (*): but not only is what's in
memory at that location _not_ 'mov r5,r0', it also gets an MM trap that
makes no sense.

(*: In user mode: if you don't have an FPP, the first one will trap, which
UNIX ignores.)

Fritz has looked at the KT11 registers when the trap happens, and the PARs and
PDRs all look good. The SSRs contain:

> SSR's: 040143 00 010210 00

SSR2 gives the PC at the time of the fault (again 010210); SSR0 shows:

 Abort - segment (page) length error
 User mode
 Segment (Page) 1

which is the first thing that's wrong - neither the instruction that's
_supposed_ to be there (next), nor the one that's _actually_ there, contains
any reference to segment 1!
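
For anyone who wants to check the decode of 040143 by hand, here is a minimal sketch. The bit positions are the SSR0 layout as I recall it from the KT11-C documentation (abort flags in bits 15-13, mode in bits 6-5, I/D space in bit 4, page number in bits 3-1, enable in bit 0), so treat them as an assumption and verify against the handbook; the function and field names are invented for this example.

```python
MODES = {0b00: "kernel", 0b01: "supervisor", 0b11: "user"}


def decode_sr0(sr0):
    """Split an SSR0 value into its fields (bit layout assumed, see above)."""
    return {
        "abort_non_resident": bool(sr0 & (1 << 15)),
        "abort_length": bool(sr0 & (1 << 14)),
        "abort_read_only": bool(sr0 & (1 << 13)),
        "mode": MODES.get((sr0 >> 5) & 0b11, "illegal"),
        "d_space": bool(sr0 & (1 << 4)),
        "page": (sr0 >> 1) & 0b111,
        "mm_enabled": bool(sr0 & 1),
    }


fields = decode_sr0(0o040143)
for name, value in fields.items():
    print(f"{name:20s} {value}")
```

Under that layout the value decodes to a length abort, User mode, page 1, with relocation enabled, which agrees with the reading given above.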

The _actual_ code it's trying to execute is:

> 171600: 016162 004767 000224 000414 006700 006152 006702 006144

(Per UISA0, text base is 0161400, plus a PC of 010210, gives us 0171610, which
is right in the middle there.) That does not, alas, look anything
