bug#16578: Wish: Support for non-native endianness in od

2014-02-09 Thread Niels Möller
Pádraig Brady p...@draigbrady.com writes:

 Attached is the patch I intend to push in your name.

Nice.

 I also added docs to usage() and the texinfo file, and added a test.

I don't quite understand how the test works, but as far as I see, it
doesn't test floats? So that's inconsistent with the commit message.

 BTW I checked if there was any speed difference with the new code.
 I wasn't expecting this to be a bottleneck, and true enough
 there is only a marginal change. The new code is consistently
 a little _faster_ though on my i3-2310M which is a bit surprising.

Odd. But performance of x86 is usually pretty hard to predict by just
looking at the source or assembly code. I was hoping that in the
non-swapped case, the false conditional

   if (input_swap && sizeof(T) > 1)

should be very friendly to the branch predictor, and hence almost free.
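
As a rough illustration of the conditional discussed above, here is a minimal standalone sketch in C. It is not the od.c macro itself: the function name load_u32 and the file-scope input_swap flag are assumptions made for this example, and it uses memcpy where the patch uses a union.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for od's global flag; not the actual od.c code.  */
static bool input_swap;

/* Load a 32-bit value from P, reversing its bytes when input_swap is set.
   memcpy avoids alignment and aliasing concerns.  */
static uint32_t
load_u32 (const unsigned char *p)
{
  unsigned char b[sizeof (uint32_t)];
  uint32_t x;

  assert (p != NULL);
  if (input_swap && sizeof x > 1)
    {
      /* Swapped case: copy the bytes in reverse order.  */
      for (size_t j = 0; j < sizeof x; j++)
        b[j] = p[sizeof x - 1 - j];
    }
  else
    memcpy (b, p, sizeof x);    /* Native case: plain copy.  */

  memcpy (&x, b, sizeof x);
  return x;
}
```

When input_swap is false, the first operand of && short-circuits the whole test, which is the cheap, predictable branch described above.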

Jim Meyering j...@meyering.net writes:

 One nit: please change the type of j here (identical in attached)
 to be unsigned, to match that of the upper bound.

Makes sense. In my own projects, I tend to use unsigned int for loop
counts wherever I don't need to iterate over any negative values. But
my impression is that most others prefer to use signed int for
everything which doesn't rely on mod 2^n arithmetic, so that's why I
made j signed here.
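
To illustrate why the counter type matters when the bound is a sizeof expression, here is a small sketch (both function names are hypothetical). On typical platforms, comparing an int against a size_t converts the int to the unsigned type first, so a negative value turns into a very large one.

```c
#include <assert.h>
#include <stddef.h>

/* Count loop iterations using a counter of the same unsigned type as
   the sizeof-style bound, so no conversion happens in the comparison.  */
static size_t
count_iterations (size_t bound)
{
  size_t n = 0;
  for (size_t j = 0; j < bound; j++)
    n++;
  assert (n == bound);
  return n;
}

/* Return nonzero if J compares below BOUND once J has been converted
   to size_t.  The cast spells out the conversion that a mixed
   signed/unsigned comparison would otherwise apply implicitly.  */
static int
below_after_conversion (int j, size_t bound)
{
  return (size_t) j < bound;
}
```

With j = -1 the converted value is SIZE_MAX, so the comparison against a small bound is false. A loop that only counts upward from 0 never hits this case, which is why the change is a nit rather than a bug fix.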

 That would be our first use of rev. Is it ubiquitous enough to depend on?

It appears *not* to be available on my closest Solaris box, while on my
GNU/Linux system it's provided by util-linux. For the test, I guess rev
could be implemented something like

while read line; do
  printf '%s' "$line" | tr -d '\n' | sed 's/./&\n/g' | tac | tr -d '\n'
  echo
done

Maybe rev should be provided by coreutils, similarly to tac? I'd prefer
not to think about the unicode issues for rev, though...

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.





bug#16578: Wish: Support for non-native endianness in od

2014-02-09 Thread Pádraig Brady
On 02/09/2014 08:42 AM, Niels Möller wrote:
 Pádraig Brady p...@draigbrady.com writes:
 
 Attached is the patch I intend to push in your name.
 
 Nice.
 
 I also added docs to usage() and the texinfo file, and added a test.
 
 I don't quite understand how the test works, but as far as I see, it
 doesn't test floats? So that's inconsistent with the commit message.

Oops, I removed an 'f' while developing. Added that back now
which also gets sizes up to 16 tested.

 BTW I checked if there was any speed difference with the new code.
 I wasn't expecting this to be a bottleneck, and true enough
 there is only a marginal change. The new code is consistently
 a little _faster_ though on my i3-2310M which is a bit surprising.
 
 Odd. But performance of x86 is usually pretty hard to predict by just
 looking at the source or assembly code. I was hoping that in the
 non-swapped case, the false conditional
 
    if (input_swap && sizeof(T) > 1)
 
 should be very friendly to the branch predictor, and hence almost free.
 
 Jim Meyering j...@meyering.net writes:
 
 One nit: please change the type of j here (identical in attached)
 to be unsigned, to match that of the upper bound.
 
 Makes sense. In my own projects, I tend to use unsigned int for loop
 counts wherever I don't need to iterate over any negative values. But
 my impression is that most others prefer to use signed int for
 everything which doesn't rely on mod 2^n arithmetic, so that's why I
 made j signed here.

done

 That would be our first use of rev. Is it ubiquitous enough to depend on?

Ugh good point.

 It appears *not* to be available on my closest Solaris box, while on my
 GNU/Linux system it's provided by util-linux. For the test, I guess rev
 could be implemented something like
 
 while read line; do
   printf '%s' "$line" | tr -d '\n' | sed 's/./&\n/g' | tac | tr -d '\n'
   echo
 done

I went with:

rev() {
  while read line; do
    printf '%s' "$line" | sed 's/./&\n/g' | tac | paste -s -d ''
  done
}
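
For comparison, the per-line reversal that rev performs can be sketched in C as below. This is only an illustration (reverse_line is a hypothetical name) and it works byte by byte, so it has exactly the unicode limitation raised in this thread: multibyte characters would be scrambled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Reverse the bytes of the NUL-terminated string LINE in place.
   A single trailing newline, if present, stays at the end.
   Byte-wise only: multibyte characters are not kept intact.  */
static void
reverse_line (char *line)
{
  size_t len;

  assert (line != NULL);
  len = strlen (line);
  if (len > 0 && line[len - 1] == '\n')
    len--;                      /* Leave the newline where it is.  */

  for (size_t i = 0; i < len / 2; i++)
    {
      char tmp = line[i];
      line[i] = line[len - 1 - i];
      line[len - 1 - i] = tmp;
    }
}
```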

 Maybe rev should be provided by coreutils, similarly to tac? I'd prefer
 not to think about the unicode issues for rev, though...

I think so too. It's not Linux specific and we've previously
mentioned rev as an alternative to adding various functionality to coreutils.

Thanks to both of you for the review!

I've now pushed:
http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commit;h=b370924c

Pádraig.





bug#16578: Wish: Support for non-native endianness in od

2014-02-09 Thread Pádraig Brady
On 02/10/2014 01:59 AM, Paul Eggert wrote:
 Pádraig Brady wrote:
   $ time od.new -tx8 --endian=bug od.in
   4.97 elapsed
 
 If you really used --endian=bug and there was no diagnostic, then there 
 must have been a bug.  :-)

Ha!
I retyped incorrectly rather than copy/pasted.
I can confirm the params are checked correctly:

$ od -tx8 --endian=bug od.in
od: invalid argument ‘bug’ for ‘--endian’
Valid arguments are:
  - ‘little’
  - ‘big’
Try 'src/od --help' for more information.






bug#16578: Wish: Support for non-native endianness in od

2014-02-08 Thread Pádraig Brady
On 02/02/2014 01:20 AM, Pádraig Brady wrote:
 On 01/31/2014 09:44 AM, Niels Möller wrote:
 ni...@lysator.liu.se (Niels Möller) writes:

 Pádraig Brady p...@draigbrady.com writes:
 I agree this would be useful and easy enough to add.
 I suppose the interface would be --endian=little|big

 Maybe I can have a look at what it takes.

 Below is a crude patch (missing: usage message, test cases, docs,
 translation). I think it should work fine for floats too. I see no
 obvious and more beautiful way to do it.

 (And I think I have copyright assignment papers for coreutils in place,
 since my work on factor some years ago).

 Regards,
 /Niels

 diff --git a/src/od.c b/src/od.c
 index 514fe50..a71e302 100644
 --- a/src/od.c
 +++ b/src/od.c
 @@ -259,13 +259,16 @@ static enum size_spec integral_type_size[MAX_INTEGRAL_TYPE_SIZE + 1];
  #define MAX_FP_TYPE_SIZE sizeof (long double)
  static enum size_spec fp_type_size[MAX_FP_TYPE_SIZE + 1];
  
 +bool input_swap;
 +
  static char const short_options[] = "A:aBbcDdeFfHhIij:LlN:OoS:st:vw::Xx";
  
  /* For long options that have no equivalent short option, use a
 non-character as a pseudo short option, starting with CHAR_MAX + 1.  */
  enum
  {
 -  TRADITIONAL_OPTION = CHAR_MAX + 1
 +  TRADITIONAL_OPTION = CHAR_MAX + 1,
 +  ENDIAN_OPTION,
  };
  
  static struct option const long_options[] =
 @@ -278,6 +281,7 @@ static struct option const long_options[] =
    {"strings", optional_argument, NULL, 'S'},
    {"traditional", no_argument, NULL, TRADITIONAL_OPTION},
    {"width", optional_argument, NULL, 'w'},
 +  {"endian", required_argument, NULL, ENDIAN_OPTION },
  
    {GETOPT_HELP_OPTION_DECL},
    {GETOPT_VERSION_OPTION_DECL},
 @@ -406,7 +410,21 @@ N (size_t fields, size_t blank, void const *block,  \
  {   \
int next_pad = pad * (i - 1) / fields;\
int adjusted_width = pad_remaining - next_pad + width;\
 -  T x = *p++;   \
 +  T x;  \
 +  if (input_swap && sizeof(T) > 1)  \
 +{   \
 +  int j;\
 +  union {   \
 +T x;\
 +char b[sizeof(T)];  \
 +  } u;  \
 +  for (j = 0; j < sizeof(T); j++)   \
 +u.b[j] = ((const char *) p)[sizeof(T) - 1 - j]; \
 +  x = u.x;  \
 +}   \
 +  else  \
 +x = *p; \
 +  p++;  \
ACTION;   \
pad_remaining = next_pad; \
  }   \
 @@ -1664,6 +1682,24 @@ main (int argc, char **argv)
traditional = true;
break;
  
 +case ENDIAN_OPTION:
 +  if (!strcmp (optarg, "big"))
 +{
 +#if !WORDS_BIGENDIAN
 +  input_swap = true;
 +#endif
 +}
 +  else if (!strcmp (optarg, "little"))
 +{
 +#if WORDS_BIGENDIAN
 +input_swap = true;
 +#endif
 +}
 +  else
 +error (EXIT_FAILURE, 0,
 +   _("bad argument '%s' for --endian option"), optarg);
 +  break;
 +
/* The next several cases map the traditional format
   specification options to the corresponding modern format
   specs.  GNU od accepts any combination of old- and
 
 That looks good.
 I'll adjust slightly to use XARGMATCH and add some docs/tests.
 I'm travelling at the moment but will merge this soon.

Attached is the patch I intend to push in your name.

I changed the option handling to reuse the XARGMATCH functionality.
Also I changed things slightly so that the last --endian option
specified wins. Previously we only set the input_swap variable
to true, never to false. On a related point I set the input_swap
global to be static.
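
The "last option wins" behaviour can be sketched as follows. This is not the pushed code, which uses XARGMATCH and the configure-time WORDS_BIGENDIAN macro; the helper names and the run-time endianness probe here are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum endian_type { endian_little, endian_big };

/* Detect the host byte order at run time by looking at the first
   byte of a 16-bit integer holding the value 1.  */
static enum endian_type
native_endian (void)
{
  const uint16_t probe = 1;
  unsigned char first;

  memcpy (&first, &probe, 1);
  return first == 1 ? endian_little : endian_big;
}

/* Map an --endian argument to a swap flag.  Returns false for an
   unknown argument and leaves *SWAP untouched.  Because *SWAP is
   assigned both true and false, a later option overrides an earlier
   one instead of only ever turning swapping on.  */
static bool
set_input_swap (const char *arg, bool *swap)
{
  enum endian_type requested;

  assert (arg != NULL && swap != NULL);
  if (strcmp (arg, "little") == 0)
    requested = endian_little;
  else if (strcmp (arg, "big") == 0)
    requested = endian_big;
  else
    return false;

  *swap = requested != native_endian ();
  return true;
}
```

Calling set_input_swap once per --endian occurrence, in command-line order, leaves the flag reflecting the last valid option.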

I also added docs to usage() and the texinfo file, and added a test.

BTW I checked if there was any speed difference with the new code.
I wasn't expecting this to be a bottleneck, and true enough
there is only a marginal change. The new code is consistently
a little _faster_ though on my i3-2310M which is a bit surprising.

bug#16578: Wish: Support for non-native endianness in od

2014-02-08 Thread Jim Meyering
On Sat, Feb 8, 2014 at 2:01 PM, Pádraig Brady p...@draigbrady.com wrote:
 +  if (input_swap && sizeof(T) > 1)  \
 +{   \
 +  int j;\

The new patch looks complete.  Thanks to both of you.
One nit: please change the type of j here (identical in attached)
to be unsigned, to match that of the upper bound.

 +  union {   \
 +T x;\
 +char b[sizeof(T)];  \
 +  } u;  \
 +  for (j = 0; j < sizeof(T); j++)   \
 +u.b[j] = ((const char *) p)[sizeof(T) - 1 - j]; \

Re this function in the new test,

 +in_swapped() { printf '%s' "$in" | sed "s/.\{$1\}/&\\n/g" | rev | tr -d '\n'; }

That would be our first use of rev. Is it ubiquitous enough to depend on?





bug#16578: Wish: Support for non-native endianness in od

2014-02-01 Thread Pádraig Brady
On 01/31/2014 09:44 AM, Niels Möller wrote:
 ni...@lysator.liu.se (Niels Möller) writes:
 
 Pádraig Brady p...@draigbrady.com writes:
 I agree this would be useful and easy enough to add.
 I suppose the interface would be --endian=little|big

 Maybe I can have a look at what it takes.
 
 Below is a crude patch (missing: usage message, test cases, docs,
 translation). I think it should work fine for floats too. I see no
 obvious and more beautiful way to do it.
 
 (And I think I have copyright assignment papers for coreutils in place,
 since my work on factor some years ago).
 
 Regards,
 /Niels
 
 diff --git a/src/od.c b/src/od.c
 index 514fe50..a71e302 100644
 --- a/src/od.c
 +++ b/src/od.c
 @@ -259,13 +259,16 @@ static enum size_spec integral_type_size[MAX_INTEGRAL_TYPE_SIZE + 1];
  #define MAX_FP_TYPE_SIZE sizeof (long double)
  static enum size_spec fp_type_size[MAX_FP_TYPE_SIZE + 1];
  
 +bool input_swap;
 +
  static char const short_options[] = "A:aBbcDdeFfHhIij:LlN:OoS:st:vw::Xx";
  
  /* For long options that have no equivalent short option, use a
 non-character as a pseudo short option, starting with CHAR_MAX + 1.  */
  enum
  {
 -  TRADITIONAL_OPTION = CHAR_MAX + 1
 +  TRADITIONAL_OPTION = CHAR_MAX + 1,
 +  ENDIAN_OPTION,
  };
  
  static struct option const long_options[] =
 @@ -278,6 +281,7 @@ static struct option const long_options[] =
    {"strings", optional_argument, NULL, 'S'},
    {"traditional", no_argument, NULL, TRADITIONAL_OPTION},
    {"width", optional_argument, NULL, 'w'},
 +  {"endian", required_argument, NULL, ENDIAN_OPTION },
  
    {GETOPT_HELP_OPTION_DECL},
    {GETOPT_VERSION_OPTION_DECL},
 @@ -406,7 +410,21 @@ N (size_t fields, size_t blank, void const *block,  \
  {   \
int next_pad = pad * (i - 1) / fields;\
int adjusted_width = pad_remaining - next_pad + width;\
 -  T x = *p++;   \
 +  T x;  \
 +  if (input_swap && sizeof(T) > 1)  \
 +{   \
 +  int j;\
 +  union {   \
 +T x;\
 +char b[sizeof(T)];  \
 +  } u;  \
 +  for (j = 0; j < sizeof(T); j++)   \
 +u.b[j] = ((const char *) p)[sizeof(T) - 1 - j]; \
 +  x = u.x;  \
 +}   \
 +  else  \
 +x = *p; \
 +  p++;  \
ACTION;   \
pad_remaining = next_pad; \
  }   \
 @@ -1664,6 +1682,24 @@ main (int argc, char **argv)
traditional = true;
break;
  
 +case ENDIAN_OPTION:
 +  if (!strcmp (optarg, "big"))
 +{
 +#if !WORDS_BIGENDIAN
 +  input_swap = true;
 +#endif
 +}
 +  else if (!strcmp (optarg, "little"))
 +{
 +#if WORDS_BIGENDIAN
 +input_swap = true;
 +#endif
 +}
 +  else
 +error (EXIT_FAILURE, 0,
 +   _("bad argument '%s' for --endian option"), optarg);
 +  break;
 +
/* The next several cases map the traditional format
   specification options to the corresponding modern format
   specs.  GNU od accepts any combination of old- and

That looks good.
I'll adjust slightly to use XARGMATCH and add some docs/tests.
I'm travelling at the moment but will merge this soon.

thanks!
Pádraig.






bug#16578: Wish: Support for non-native endianness in od

2014-01-31 Thread Niels Möller
ni...@lysator.liu.se (Niels Möller) writes:

 Pádraig Brady p...@draigbrady.com writes:
 I agree this would be useful and easy enough to add.
 I suppose the interface would be --endian=little|big

 Maybe I can have a look at what it takes.

Below is a crude patch (missing: usage message, test cases, docs,
translation). I think it should work fine for floats too. I see no
obvious and more beautiful way to do it.

(And I think I have copyright assignment papers for coreutils in place,
since my work on factor some years ago).

Regards,
/Niels

diff --git a/src/od.c b/src/od.c
index 514fe50..a71e302 100644
--- a/src/od.c
+++ b/src/od.c
@@ -259,13 +259,16 @@ static enum size_spec integral_type_size[MAX_INTEGRAL_TYPE_SIZE + 1];
 #define MAX_FP_TYPE_SIZE sizeof (long double)
 static enum size_spec fp_type_size[MAX_FP_TYPE_SIZE + 1];
 
+bool input_swap;
+
 static char const short_options[] = "A:aBbcDdeFfHhIij:LlN:OoS:st:vw::Xx";
 
 /* For long options that have no equivalent short option, use a
non-character as a pseudo short option, starting with CHAR_MAX + 1.  */
 enum
 {
-  TRADITIONAL_OPTION = CHAR_MAX + 1
+  TRADITIONAL_OPTION = CHAR_MAX + 1,
+  ENDIAN_OPTION,
 };
 
 static struct option const long_options[] =
@@ -278,6 +281,7 @@ static struct option const long_options[] =
   {"strings", optional_argument, NULL, 'S'},
   {"traditional", no_argument, NULL, TRADITIONAL_OPTION},
   {"width", optional_argument, NULL, 'w'},
+  {"endian", required_argument, NULL, ENDIAN_OPTION },
 
   {GETOPT_HELP_OPTION_DECL},
   {GETOPT_VERSION_OPTION_DECL},
@@ -406,7 +410,21 @@ N (size_t fields, size_t blank, void const *block,  \
 {   \
   int next_pad = pad * (i - 1) / fields;\
   int adjusted_width = pad_remaining - next_pad + width;\
-  T x = *p++;   \
+  T x;  \
+  if (input_swap && sizeof(T) > 1)  \
+{   \
+  int j;\
+  union {   \
+T x;\
+char b[sizeof(T)];  \
+  } u;  \
+  for (j = 0; j < sizeof(T); j++)   \
+u.b[j] = ((const char *) p)[sizeof(T) - 1 - j]; \
+  x = u.x;  \
+}   \
+  else  \
+x = *p; \
+  p++;  \
   ACTION;   \
   pad_remaining = next_pad; \
 }   \
@@ -1664,6 +1682,24 @@ main (int argc, char **argv)
   traditional = true;
   break;
 
+case ENDIAN_OPTION:
+  if (!strcmp (optarg, "big"))
+{
+#if !WORDS_BIGENDIAN
+  input_swap = true;
+#endif
+}
+  else if (!strcmp (optarg, "little"))
+{
+#if WORDS_BIGENDIAN
+input_swap = true;
+#endif
+}
+  else
+error (EXIT_FAILURE, 0,
+   _("bad argument '%s' for --endian option"), optarg);
+  break;
+
   /* The next several cases map the traditional format
  specification options to the corresponding modern format
  specs.  GNU od accepts any combination of old- and









bug#16578: Wish: Support for non-native endianness in od

2014-01-30 Thread Niels Möller
Pádraig Brady p...@draigbrady.com writes:

 On 01/28/2014 12:54 PM, Niels Möller wrote:
 For the od program, it would be nice to have a flag to specify the
 endianness for all types which are larger than a byte. Possible
 alternatives could be big endian, little endian, native endian.

 I agree this would be useful and easy enough to add.
 I suppose the interface would be --endian=little|big

Maybe I can have a look at what it takes.

 And for floats, besides endianness, it would be nice to be able to
 specify native format or ieee format, for systems where these are
 different.

 That's a bit less useful I think and harder to implement.

I agree that's a bit more obscure. So I understand if you don't want to do
that until there's some concrete use case.

Endianness for float types should be easier, I hope.
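
A sketch of why floats need no special handling for endianness: reversing the bytes of an object's representation does not depend on its type. The helper names below are hypothetical. For real foreign data, this additionally assumes that both sides use the same floating-point format (for example IEEE 754) and differ only in byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Reverse the N bytes starting at BUF in place.  */
static void
reverse_bytes (unsigned char *buf, size_t n)
{
  for (size_t i = 0; i < n / 2; i++)
    {
      unsigned char tmp = buf[i];
      buf[i] = buf[n - 1 - i];
      buf[n - 1 - i] = tmp;
    }
}

/* Interpret the sizeof (float) bytes at P as a float stored in the
   opposite byte order from the host's.  The swap is done on a byte
   buffer, so the float object only ever receives bytes that are
   already in host order.  */
static float
load_float_swapped (const unsigned char *p)
{
  unsigned char b[sizeof (float)];
  float x;

  assert (p != NULL);
  memcpy (b, p, sizeof b);
  reverse_bytes (b, sizeof b);
  memcpy (&x, b, sizeof x);
  return x;
}
```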

Regards,
/Niels






bug#16578: Wish: Support for non-native endianness in od

2014-01-29 Thread Pádraig Brady
On 01/28/2014 12:54 PM, Niels Möller wrote:
 For the od program, it would be nice to have a flag to specify the
 endianness for all types which are larger than a byte. Possible
 alternatives could be big endian, little endian, native endian.

I agree this would be useful and easy enough to add.
I suppose the interface would be --endian=little|big
We could augment that with specific byte order spec,
but those two are probably enough.

 And for floats, besides endianness, it would be nice to be able to
 specify native format or ieee format, for systems where these are
 different.

That's a bit less useful I think and harder to implement.
We say this in the info docs:

Almost all modern systems use IEEE-754 floating point,
and it is typically portable to assume IEEE-754 behavior these days.

thanks,
Pádraig.





bug#16578: Wish: Support for non-native endianness in od

2014-01-28 Thread Niels Möller
For the od program, it would be nice to have a flag to specify the
endianness for all types which are larger than a byte. Possible
alternatives could be big endian, little endian, native endian.

And for floats, besides endianness, it would be nice to be able to
specify native format or ieee format, for systems where these are
different.

Regards,
/Niels
