Lars Brinkhoff <[email protected]> wrote:
> Johnny Billquist wrote:
> > But if we're getting back into that you need a tool to convert the
> > text for local usability, then you lose that whole point about the
> > suggested format as transparently giving you the text again, and you
> > might as well go with another format that is a bit more compact?
>
> Yes, if there were a significant amount of files that would be garbled
> by the simple transformation. But to me it seems like a win if 99% of
> all files are readable as is, and the possible 1% that aren't are still
> encoded losslessly so they can be put through further transformations
> if necessary.
>
> Core dump isn't more compact, it's exactly the same size. And 0% of all
> text files are readable.

I have made some changes to back10. But first, a quick run-through on data formats and other fun things.

Let's start with a file on a TOPS-10 machine. That file is a sequence of 36-bit words, with some metadata like file name, creation date etc. When we run backup.exe on that machine and save that specific file, we generate a saveset, which is another sequence of 36-bit words. That set of words is written to tape in core-dump format, i.e. every word maps to five tape characters, with 8+8+8+8+4 bits from the word going to the tape in succession.

We then make an image of that tape to a file on another machine, one that uses 8-bit bytes for storage. That image might or might not have some extra framing; let's ignore that for the moment. At this point we can run back10 to extract files from the saveset on the (virtual) tape.

The question is how to represent the 36-bit words in the 8-bit file system. There are many ways, but normally we want to preserve all the bits so that we can reverse the whole process, given for instance a simulated TOPS-10 machine, or even the real hardware if we have it.

One way, one that makes text files on the TOPS-10 side readable, is the so-called ANSI-ASCII format, which maps each word onto five seven-bit characters stored in five eight-bit octets, with the fifth octet getting the final bit (bit 35) in its high-order bit as a bonus. This is reversible.

Another way is to map the 36-bit word to five octets directly, with the final four bits stored in the low-order four bits of the fifth octet. This is the same way that the original tape was written, the so-called CORE-DUMP format. This is also reversible.

A third way is to map the high-order 32 bits of the 36-bit word onto four octets and throw away the last four bits. This format is called INDUSTRY-COMPATIBLE. This obviously loses data.

I have updated back10 to handle all three formats in case anyone wants to use them.
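
To make the three mappings concrete, here is a rough Python sketch of how one 36-bit word could be packed into octets and unpacked again. It is not taken from back10; the function names (word_from_text, pack_ansi_ascii and so on) are made up for illustration, and it assumes the usual PDP-10 numbering where bit 0 is the most significant bit and bit 35 the least significant.

```python
MASK36 = (1 << 36) - 1  # 36-bit word; bit 0 is the MSB, bit 35 the LSB


def word_from_text(text, bit35=0):
    """Build one 36-bit word from exactly five 7-bit ASCII characters."""
    if len(text) != 5:
        raise ValueError("a word holds exactly five 7-bit characters")
    word = 0
    for ch in text:
        word = (word << 7) | (ord(ch) & 0x7F)
    return (word << 1) | (bit35 & 1)


def pack_ansi_ascii(word):
    """Five 7-bit characters in five octets; bit 35 rides in the high bit of octet 5."""
    word &= MASK36
    octets = [(word >> shift) & 0x7F for shift in (29, 22, 15, 8, 1)]
    octets[4] |= (word & 1) << 7
    return bytes(octets)


def unpack_ansi_ascii(data):
    word = 0
    for octet in data[:5]:
        word = (word << 7) | (octet & 0x7F)
    return (word << 1) | (data[4] >> 7)


def pack_core_dump(word):
    """8+8+8+8+4 bits; the last four bits sit in the low nibble of octet 5."""
    word &= MASK36
    return bytes([(word >> 28) & 0xFF, (word >> 20) & 0xFF,
                  (word >> 12) & 0xFF, (word >> 4) & 0xFF, word & 0x0F])


def unpack_core_dump(data):
    return (int.from_bytes(data[:4], "big") << 4) | (data[4] & 0x0F)


def pack_industry_compatible(word):
    """High-order 32 bits in four octets; the low four bits are dropped."""
    return ((word & MASK36) >> 4).to_bytes(4, "big")


def unpack_industry_compatible(data):
    # The dropped bits cannot be recovered, so they come back as zero.
    return int.from_bytes(data[:4], "big") << 4


if __name__ == "__main__":
    word = word_from_text("HELLO")
    print(pack_ansi_ascii(word))            # b'HELLO', readable as is
    print(pack_core_dump(word).hex())       # same five octets of storage, not readable
    print(pack_industry_compatible(word).hex())
```

The first two formats both use five octets per word and round-trip every bit; only the ANSI-ASCII one leaves plain text readable without further conversion. The third saves one octet per word at the price of the low four bits.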

The following options exist to select the format:

    -A    Select ANSI-ASCII
    -C    Select CORE-DUMP
    -I    Select INDUSTRY-COMPATIBLE

The default is still ANSI-ASCII.

--Johnny

_______________________________________________
Simh mailing list
[email protected]
http://mailman.trailing-edge.com/mailman/listinfo/simh
