Re: Unix History: Why does hexdump default to word alignment?
On Thu, Dec 01, 2011, Elazar Leibovich wrote about "Unix History: Why does hexdump default to word alignment?":
> The default behaviour of hexdump is to align data word-wide. For instance:

Just as a comment, if I remember correctly, hexdump isn't actually part of ancient Unix history - the original was od, which as its name suggests dumps in *octal*, but had an option od -x to see hexadecimal. In any case, od and hexdump are very similar, and apparently have the same idiosyncrasies, as od -x also defaults to two-byte words.

>     printf '\xFF\xFF\x01' | hexdump
>     0000000 ffff 0001
>     0000003
>
> This makes little sense to me. In C, structs are not necessarily aligned to words, and it doesn't seem useful for viewing any data format unless you're sure everything is word-aligned. The hexdump -C behaviour makes much more sense in the general case.

When you say "words" and "word aligned" here, you mean historic 2-byte words. This is indeed *NOT* a very useful default on any modern computer. In some old computers, like the PDP-11, 2-byte words were common and useful. In other old computers, this was never a useful default.

I guess nobody cares because, since the 1970s when these tools were written, nobody uses them any more ;-) I don't think I've used od in at least two decades... At least since less was invented and usually does the right thing (show ASCII when possible, or hex for non-visible characters).

Amazingly, I don't believe that the original od even had an option to see hex for each byte: od -c didn't show hex, od -x showed hex for each two bytes, and od -b (for "bytes") showed each byte, but in octal (which evidently was more popular than hex in the old days).

GNU's od can do what you want with od -t x1. As you saw, so can hexdump with the -C flag.

-- 
Nadav Har'El | Thursday, Dec 1 2011
n...@math.technion.ac.il | Phone +972-523-790466, ICQ 13349191
http://nadav.harel.org.il | "Make it idiot proof and someone will make a better idiot."
___
Linux-il mailing list
Linux-il@cs.huji.ac.il
http://mailman.cs.huji.ac.il/mailman/listinfo/linux-il
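For comparison, here is what the defaults discussed in the message above look like in practice - a sketch assuming GNU od on a little-endian machine (octal escapes are used because \x is not portable printf syntax):

```shell
# Three bytes: 0xFF 0xFF 0x01 (octal escapes \377 \377 \001).
printf '\377\377\001' | od             # default: 2-byte words, in octal
printf '\377\377\001' | od -x          # the same 2-byte words, in hex
printf '\377\377\001' | od -b          # one octal value per byte
printf '\377\377\001' | od -An -t x1   # GNU od: one hex value per byte
```

On a little-endian machine the last command prints "ff ff 01", while od -x shows the byte-swapped two-byte words "ffff 0001".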
Re: Unix History: Why does hexdump default to word alignment?
On Thu, Dec 1, 2011 at 10:10 AM, Nadav Har'El <n...@math.technion.ac.il> wrote:
> When you say "words" and "word aligned" here, you mean historic 2-byte words.

Indeed. Is there any other meaning for "word" other than two bytes?

> This is indeed *NOT* a very useful default on any modern computers. In some old computers, like the PDP-11, 2-byte words were common and useful.

I'm still not convinced that it was a useful default even then, since C, the lingua franca of Unix, is clearly byte-based.

> I guess nobody cares because since the 1970s when these tools were written, nobody uses them any more ;-) I don't think I used od in at least two decades...

Well, I use them if I need to quickly inspect a file in binary format when I'm already using the command line. Say, I have a unit test that implements a binary protocol, and I want to verify with my own eyes that I'm getting the right results. "./generate_msg | hexdump -C" is quicker than "./generate_msg > tmp; sane_hex_editor tmp". How do you do that without hexdump, if you actually have this need at all?

But maybe it's just a bad old habit of mine. I guess that if you get used to a more modern workflow, you can inspect the same data just as quickly with modern tools. As you can understand, less will not help me with that.
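The pipeline habit described above can be sketched with a stand-in for the hypothetical ./generate_msg (the function name and payload here are made up for illustration):

```shell
# Stand-in for a hypothetical ./generate_msg emitting a binary message.
generate_msg() { printf 'MSG\000\001\002'; }

# Inspect the raw bytes inline - no temp file, no hex editor:
generate_msg | hexdump -C
```

hexdump -C prints an offset, the bytes in hex, and an ASCII column, so the "MSG" header is visible at a glance.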
Re: Unix History: Why does hexdump default to word alignment?
On 12/01/2011 10:10 AM, Nadav Har'El wrote:
> On Thu, Dec 01, 2011, Elazar Leibovich wrote about "Unix History: Why does hexdump default to word alignment?":
>> The default behaviour of hexdump is to align data word-wide. For instance:
>
> Just as a comment, if I remember correctly, hexdump isn't actually part of ancient Unix history - the original was od, which as its name suggests dumps in *octal*, but had an option od -x to see hexadecimal. In any case, od and hexdump are very similar, and apparently have the same idiosyncrasies, as od -x also defaults to two-byte words.
>
>>     printf '\xFF\xFF\x01' | hexdump
>>     0000000 ffff 0001
>>     0000003
>>
>> This makes little sense to me. In C, structs are not necessarily aligned to words, and it doesn't seem useful for viewing any data format unless you're sure everything is word-aligned. The hexdump -C behaviour makes much more sense in the general case.
>
> When you say "words" and "word aligned" here, you mean historic 2-byte words. This is indeed *NOT* a very useful default on any modern computer. In some old computers, like the PDP-11, 2-byte words were common and useful. In other old computers, this was never a useful default.
>
> I guess nobody cares because, since the 1970s when these tools were written, nobody uses them any more ;-) I don't think I've used od in at least two decades... At least since less was invented and usually does the right thing (show ASCII when possible, or hex for non-visible characters).
>
> Amazingly, I don't believe that the original od even had an option to see hex for each byte: od -c didn't show hex, od -x showed hex for each two bytes, and od -b (for "bytes") showed each byte, but in octal (which evidently was more popular than hex in the old days).
>
> GNU's od can do what you want with od -t x1. As you saw, so can hexdump with the -C flag.

apparently, you did not use binary data serialization in the past two decades.
when you serialize data and store it into a file (or send it over the network), it is very useful to be able to see the data as 2-byte or 4-byte or whatever-byte numbers when debugging.

in the last few years, i have been using od more than i did in the decade before that ;)

--guy
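As a concrete sketch of that debugging use (assuming GNU od and a little-endian host):

```shell
# Two serialized 32-bit little-endian integers, 1 and 255:
printf '\001\000\000\000\377\000\000\000' |
    od -An -t u4    # view as 4-byte unsigned decimals: 1, 255

printf '\001\000\000\000\377\000\000\000' |
    od -An -t u2    # the same bytes as 2-byte units: 1, 0, 255, 0
```

The -t flag makes od regroup the same bytes at whatever width the serialized format actually uses, which is exactly the "2-byte or 4-byte or whatever-byte" view described above.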
Re: Unix History: Why does hexdump default to word alignment?
On Dec 1, 2011, at 10:28 AM, Elazar Leibovich wrote:
> Indeed. Is there any other meaning for "word" other than two bytes?
>
>> This is indeed *NOT* a very useful default on any modern computers. In some old computers, like the PDP-11, 2-byte words were common and useful.

Well, let's see, going back to the 1960's:

- IBM 1401: word size set by a bit in memory, a "word mark" on a digit.
- IBM 360: 32-bit words.
- IBM 1130: 16-bit words.
- HP 2000/2100: 16-bit words.
- CDC 6400/6600 (basis for Cray): 60-bit words.
- Burroughs 5500 series: 48-bit words.
- Philco 2000: 48-bit words (I actually used a 1000 or 1100 - I can't remember which - that was a smaller version).
- SDS 940: 24-bit words.

I was too late to use Quicktran (it was used up until June of the year I started 10th grade, but I did not start until September), which ran on an IBM 7044 with a 36-bit word. Another high school nearby, which I did not go to, had a PDP-8 with a 12-bit word. Those were the ones I can remember having used in the time frame 1969 to 1972.

The original Unics (later UNIX) machine was a PDP-7 with an 18-bit word. PDP-11's were relative latecomers to the game, first released in 1970. The PDP-11 offered Unix from the start, but most PDP-11's ran a much less demanding operating system.

Geoff.
-- 
Geoffrey S. Mendelson, N3OWJ/4X1GM
My high blood pressure medicine reduces my midichlorian count. :-(
Re: Unix History: Why does hexdump default to word alignment?
On Thu, Dec 1, 2011 at 11:32 AM, geoffrey mendelson <geoffreymendel...@gmail.com> wrote:
> Well, let's see, going back to the 1960's, IBM 1401, word size set by a bit in memory, a "word mark" on a digit.

Thanks for educating me - you should get a job in CS archaeology. But what did the word mark mean? In my ignorance I thought that "word" was meant to imply a number of bits.
Fwd: Unix History: Why does hexdump default to word alignment?
On Dec 1, 2011, at 12:00 PM, Elazar Leibovich wrote:
> But what did the word mark mean? In my ignorance I thought that "word" was meant to imply a number of bits.

The IBM 1401 and similar series of computers used DECIMAL, not binary, numbers, and the word mark was the extra bit turned on to indicate an end of word. Actually the word mark was at the beginning of a word, so the "end" was really the word mark of the next word after it.

So if you had the number 123456789 in memory and you wanted to address it, the one would be at the low memory address with the word mark bit turned on, and the nine at the high end. You would point the instruction to the high address (that of the nine). If I remember correctly, instructions addressed the ones digit in a number, so you could specify as many digits in a word as you needed.

This was common in 1950's vintage computers, as business computers had decimal instructions and scientific ones binary instructions (integer, with the option of floating point on some computers). The IBM 360 was the first AFAIK to have both. It used instructions with the decimal length in them, so although binary words were 32 bits, decimal ones, if you want to call them words at all, were up to 31 digits plus a sign (1-16 bytes).

Turbo Pascal for the IBM PC had a decimal mode where it would store numbers as decimal digits and do decimal arithmetic on them. I never used TP, so I don't know much more about it. Any Pascal programmers out there? Do Linux Pascal compilers have it?

Geoff.
-- 
Geoffrey S. Mendelson, N3OWJ/4X1GM
My high blood pressure medicine reduces my midichlorian count. :-(
Re: Unix History: Why does hexdump default to word alignment?
On Thu, Dec 1, 2011 at 12:25 PM, geoffrey mendelson <geoffreymendel...@gmail.com> wrote:
> Turbo Pascal for the IBM PC had a decimal mode where it would store numbers as decimal digits and do decimal arithmetic on them. I never used TP, so I don't know much more about it. Any Pascal programmers out there?

My first paid software job was programming in Turbo Pascal on IBM PCs, 20-something years ago. Late '80s... before Linux even existed as a concept... Oh, my.

I *think* TP had BCD (binary-coded decimal - google or check Wikipedia if you are interested). Basically, in BCD every decimal digit is coded separately as a 4-bit binary number. IIRC, all x86 processors provided BCD-related instructions (conversions to and from), but I think even then it was slower than straightforward binary arithmetic. It was slow because the machine instructions were for single bytes only, not for wider objects. I do not remember if it was in any way related to the 8087 math co-processors.

I don't recall ever using BCD explicitly, but I may have been too inexperienced to notice. Never programmed in Pascal since.

-- 
Oleg Goldshmidt | p...@goldshmidt.org
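A small tangent on why packed BCD and hex dumps go well together: since each decimal digit occupies one 4-bit nibble, the hex rendering of packed-BCD bytes reads back as the original decimal digits. A minimal sketch (octal escapes used because \x is not portable printf syntax):

```shell
# Decimal 1234 packed as BCD is the two bytes 0x12 0x34
# (written here as octal escapes \022 \064).
printf '\022\064' | od -An -t x1
# The per-byte hex view reads "12 34" - the original decimal digits.
```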
Re: Unix History: Why does hexdump default to word alignment?
On Dec 1, 2011, at 12:51 PM, Oleg Goldshmidt wrote:
> I don't recall ever using BCD explicitly, but I may have been too inexperienced to notice. Never programmed in Pascal since.

Oleg, one used BCD for money. I once worked at a place where one of the programmers wrote the pension reporting programs for the IBM 370 in PL/I using floating point arithmetic. When people saw the reports and noticed that they had strangely rounded balances in their accounts, the whole thing was scrapped and re-written using decimal numbers.

Geoff.
-- 
Geoffrey S. Mendelson, N3OWJ/4X1GM
My high blood pressure medicine reduces my midichlorian count. :-(
Re: Unix History: Why does hexdump default to word alignment?
On Thu, Dec 1, 2011 at 12:58 PM, geoffrey mendelson <geoffreymendel...@gmail.com> wrote:
> One used BCD for money. I once worked at a place where one of the programmers wrote the pension reporting programs for the IBM 370 in PL/I using floating point arithmetic. When people saw the reports and noticed that they had strangely rounded balances in their accounts, the whole thing was scrapped and re-written using decimal numbers.

Oh, I know that. I also know a bit about the dangers of floating point arithmetic. That job of mine had nothing to do with money, though (unlike some of the subsequent ones).

By the way, eliminating rounding errors is the primary reason why beasts like Java's BigDecimal exist today. True to its (ugly) form, Java does not allow operator overloading, so BigDecimal arithmetic is implemented via method calls, and even simple formulas look unparseable to the naked eye in code. [Just venting workplace-related frustration here, sorry... ;-)]

-- 
Oleg Goldshmidt | o...@goldshmidt.org
Re: Unix History: Why does hexdump default to word alignment?
On Thu, 2011-12-01 at 12:51 +0200, Oleg Goldshmidt wrote:
> IIRC, all x86 processors provided BCD-related instructions (conversions to and from), but I think even then it was slower than straightforward binary arithmetic. It was slow because the machine instructions were for single bytes only, but not for wider objects. I do not remember if it was in any way related to 8087 math co-processors.

The 8086 had several instructions for adjusting the results of arithmetic operations to conform to BCD values. The 8087 had FBLD and FBSTP for loading and storing BCD values.

--- Omer
-- 
My Commodore 64 is suffering from slowness and insufficiency of memory; and its display device is grievously short of pixels. Can anyone help?
My own blog is at http://www.zak.co.il/tddpirate/
My opinions, as expressed in this E-mail message, are mine alone. They do not represent the official policy of any organization with which I may be affiliated in any way.
WARNING TO SPAMMERS: at http://www.zak.co.il/spamwarning.html
Re: Unix History: Why does hexdump default to word alignment?
On Thu, Dec 01, 2011, guy keren wrote about "Re: Unix History: Why does hexdump default to word alignment?":
> apparently, you did not use binary data serialization in the past two decades.
>
> when you serialize data and store it into a file (also on the network), it is very useful to be able to see the data as 2-byte or 4-byte or whatever-byte numbers, when debugging.

Well, for debugging you typically use tools like a debugger (gdb, ddd, etc.) or a network sniffer or something - and those have their own methods of displaying data, and do not use od. So using the actual od command in a shell or shell script is not something I ended up doing in recent years. I don't think I even noticed when the new hexdump sibling of od cropped up in Linux ;-)

-- 
Nadav Har'El | Thursday, Dec 1 2011
n...@math.technion.ac.il | Phone +972-523-790466, ICQ 13349191
http://nadav.harel.org.il | "He who dies with the most toys is still dead" -- Citibank billboard, Manhattan 2001
Re: Unix History: Why does hexdump default to word alignment?
On Thu, Dec 01, 2011 at 01:55:24PM +0200, Nadav Har'El wrote:
> Well, for debugging you typically use tools like a debugger (gdb, ddd, etc.) or a network sniffer or something - and those have their own methods of displaying data, and do not use od. So using the actual od command in a shell or shell script is not something I ended up doing in recent years. I don't think I even noticed when the new hexdump sibling of od cropped up in Linux ;-)

Regarding new siblings of od - and just in case someone expects a useful piece of information in this thread - I have happened to use xxd several times. It can also do the reverse - convert its output back to binary - so you can use it together with your $EDITOR as a poor man's binary editor. I guess people used uuencode/uudecode for this in the past, and perhaps I did too, but xxd is much more comfortable.

-- Didi
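A sketch of the xxd round trip described above (xxd usually ships with vim; the snippet skips quietly if it is absent, and the file names are arbitrary):

```shell
# Skip gracefully on systems without xxd.
command -v xxd >/dev/null || exit 0

# Dump a binary file to an editable hex listing...
printf 'hello\000world' > /tmp/orig.bin
xxd /tmp/orig.bin > /tmp/orig.hex
# ...(edit /tmp/orig.hex in $EDITOR here, if desired)...
# ...then convert the listing back to binary with -r:
xxd -r /tmp/orig.hex > /tmp/restored.bin
cmp /tmp/orig.bin /tmp/restored.bin && echo 'round trip OK'
```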
Re: Unix History: Why does hexdump default to word alignment?
On 12/01/2011 01:55 PM, Nadav Har'El wrote:
> Well, for debugging you typically use tools like a debugger (gdb, ddd, etc.) or a network sniffer or something - and those have their own methods of displaying data, and do not use od. So using the actual od command in a shell or shell script is not something I ended up doing in recent years. I don't think I even noticed when the new hexdump sibling of od cropped up in Linux ;-)

you can use a debugger only for the basic code. you cannot use a debugger when you're dealing with multiple threads that access the same shared data and could have race conditions. in those cases you need to run a test, find that the eventual data is incorrect, and track back, using logs and friends, to find the culprit(s). this is the common case in storage systems - but also in other types of systems.

--guy
Re: Unix History: Why does hexdump default to word alignment?
On Fri, Dec 2, 2011 at 9:28 AM, guy keren <c...@actcom.co.il> wrote:
> you can use a debugger only for the basic code. you cannot use a debugger when you're dealing with multiple threads that access the same shared data and could have race conditions. in those cases you need to run a test, find that the eventual data is incorrect, and track back using logs and friends, to find the culprit(s).

I think that what Nadav meant is that instead of adding log_raw_data_to_file(file); you can set a breakpoint there and watch the data with gdb's x command. Like you, I find the printf-debugging approach more appealing, but it might be that I'm just stuck in the past and reluctant to try new tools.

> this is the common case in storage systems - but also in other types of systems.

Another type I ran into is passing binary data from process A to process B using pipes: it's much quicker to test the actual data with "./proc | hexdump -C" than by redirecting the output to a file, or tapping the network/pipe.