bug#17505: Pádraig: does this solve your consistency concern? (was bug#17505: dd statistics output)

2014-07-28 Thread Christian Groessler

On 07/27/14 19:11, Linda Walsh wrote:

It is more common to specify transfer sizes in SI and mean IEC if you
are in the US where the digital computer was created.

People in the US have not adopted SI units and many wouldn't know
a meter from a molehill, so SI units aren't the first thing that
they are likely to be meaning.  Computer scientists and the industry here,
grew up with using IEC prefixes where multiples of 8 are already in
use.  I.e. if you are talking *bytes*, you are using base 2.



I didn't grow up in the US, and grew up with the metric system, but when I'm
talking about memory sizes I always mean IEC (2^10) and never SI (10^3).
The only pitfall here is hard disk sizes, where I have to remember that
they mean SI.




It is inconsistent to switch to decimal prefixes when talking about
binary numbers.



Agreed.






BTW I was playing devil's advocate with my mention of the SIGUSR1 
inconsistency.

I'm still of the opinion that the dynamic switch of human units based on
current transferred amount is the lesser of two evils, since this output
is destined for human consumption.



I don't get the reason for the dynamic switch at all. Can somebody 
enlighten me?


regards,
chris







bug#17505: Pádraig: does this solve your consistency concern? (was bug#17505: dd statistics output)

2014-07-28 Thread Linda Walsh



Christian Groessler wrote:

On 07/27/14 19:11, Linda Walsh wrote:

It is more common to specify transfer sizes in SI and mean IEC if you
are in the US where the digital computer was created.

People in the US have not adopted SI units and many wouldn't know
a meter from a molehill, so SI units aren't the first thing that
they are likely to be meaning.  Computer scientists and the industry here,
grew up with using IEC prefixes where multiples of 8 are already in
use.  I.e. if you are talking *bytes*, you are using base 2.



I didn't grow up in the US, and grew up with the metric system, but when I'm
talking about memory sizes I always mean IEC (2^10) and never SI (10^3).
The only pitfall here is hard disk sizes, where I have to remember that
they mean SI.


I was trying to come up with some reason for Pádraig's belief
that people usually meant SI when using IEC prefixes for computer
sizes in units like bytes (2^3 bits) or sectors (2^12 bits)... now what
power of 10 is that?  I've never heard of anyone supporting Pádraig's
position -- so I assumed it must be some foreign country where the
metric system and metric prefixes are meant to apply to non-unary
and non-base-10 quantities.  Pádraig: where did you get your impression?

When it comes to disk space -- computers always give it in
IEC -- except where they've bought the line that mixing base-2 and power-of-10
prefixes is a good thing, then they try to get others to buy into such.

But reality is that one can't express disk space as a power of 10 as there
is no multiple of 10 that lines up with a 512-byte multiple.  I.e. the system is
designed to be inaccurate and confuse the issue to make it harder for
consumers to do comparisons.


I don't get the reason for the dynamic switch at all. Can somebody 
enlighten me?


I think it was thrown in as a red herring, as I can't think
of any useful case for it.  Having the output vary units randomly, not
at the behest of the user, doesn't seem especially useful.





bug#17505: Pádraig: does this solve your consistency concern? (was bug#17505: dd statistics output)

2014-07-27 Thread Linda Walsh



Pádraig Brady wrote:


That was the original approach but is a bit worse than the dynamic approach
since it's common to specify transfer sizes in IEC units for SI sized data.


It is more common to specify transfer sizes in SI and mean IEC if you
are in the US where the digital computer was created.

People in the US have not adopted SI units and many wouldn't know
a meter from a molehill, so SI units aren't the first thing that
they are likely to be meaning.  Computer scientists and the industry here,
grew up with using IEC prefixes where multiples of 8 are already in
use.  I.e. if you are talking *bytes*, you are using base 2.

It is inconsistent to switch to decimal prefixes when talking about
binary numbers.

OTOH, if you are talking *bits*, I would say usage meaning SI units
is more common.

A byte = 2^3 bits, not 10 bits.

Now I was willing to go so far as to not force incompatible or bad
nomenclature upon others, but to use their own nomenclature when
replying to them.

If someone came up to you and spoke a question in French, would you
answer them in English and make some comment about people using
French by accident and they really mean to use English?

If your goal was clear communication, you'd try to answer in the language
they were querying in (presuming you knew it).  Only giving responses
in English, when you accept input in French, would likely be thought
insulting.

If people are that concerned to get the output they want in SI, they
might be bothered to use it on input (or read the manpage and
find out how to make it happen).  For those that are concerned to get the
output they want in computer compatible binary, you seem to be
saying they are S-O-L, which seems a poor and selfish attitude to
be taking.


BTW I was playing devil's advocate with my mention of the SIGUSR1 inconsistency.
I'm still of the opinion that the dynamic switch of human units based on
current transferred amount is the lesser of two evils, since this output
is destined for human consumption.


If it is for human consumption, humans like consistency --
if they speak to you in 1 language, they likely appreciate being
replied to in the same.  The same goes for terminology and units.

If someone asks you how many kilometers it is to XXX and
you come back with 38 miles, you think that's a user friendly design?




cheers,
Pádraig.









bug#17505: Pádraig: does this solve your consistency concern? (was bug#17505: dd statistics output)

2014-07-26 Thread Pádraig Brady
On 07/26/2014 02:35 AM, Linda Walsh wrote:
 Pádraig: you may have missed this as it was a reply to
 an old thread, but, changing the subj and composing as new
 should prevent that (I hope)
 
 You were concerned that the user would get different outputs
 based on the previously suggested algorithm -- as well as
 possibly different output when SIGUSR1 came in.
 
 This idea seems to solve both of those -- so if the patch that was
 proposed for this was modified in line with this suggestion,
 would there be any further problems?
 
 
 Linda Walsh wrote:
 Found old bug, still open...

 Pádraig Brady wrote:
 On 07/16/2014 10:38 AM, Pádraig Brady wrote:
  
 http://bugs.gnu.org/17505#37 was proposed to do the following automatically
 (depending on the amount output):

   268435456 bytes (256 MiB) copied, 0.0248346 s, 10.8 GB/s

 However that wasn't applied due to inconsistency concerns.
 I'm still of the opinion that the change above would be a net gain,
 as the number in brackets is for human interpretation, and in the vast
 majority of cases would be the best representation for that.
 
One patch that would not be inconsistent:

If the user uses units of a single system (i.e. doesn't use SI and base-2
units in the same statement), then display the summary units using the same
notation the user used:

 dd if=xx bs=256M
 ...(256M copied)
 vs.
 dd if=xx bs=256MB
 ...(256MB copied)...

 Note another reason to _not_ apply the patch is that
 requests to print the statistics can come async through SIGUSR1,
 and thus increase the chances of inconsistent output.
 Solves this too, since the units are decided when the command is parsed,
 so SIGUSR1 would use the same units as would come out on a final summary.


 Or is using consistent units w/what the user uses not ok?

 Note, for statements w/o units (or mixed system), there would be no reason
 to change current behavior.

That was the original approach but is a bit worse than the dynamic approach
since it's common to specify transfer sizes in IEC units for SI sized data.

BTW I was playing devil's advocate with my mention of the SIGUSR1 inconsistency.
I'm still of the opinion that the dynamic switch of human units based on
current transferred amount is the lesser of two evils, since this output
is destined for human consumption.

cheers,
Pádraig.





bug#17505: Pádraig: does this solve your consistency concern? (was bug#17505: dd statistics output)

2014-07-25 Thread Linda Walsh

Pádraig: you may have missed this as it was a reply to
an old thread, but, changing the subj and composing as new
should prevent that (I hope)

You were concerned that the user would get different outputs
based on the previously suggested algorithm -- as well as
possibly different output when SIGUSR1 came in.

This idea seems to solve both of those -- so if the patch that was
proposed for this was modified in line with this suggestion,
would there be any further problems?


Linda Walsh wrote:

Found old bug, still open...

Pádraig Brady wrote:

On 07/16/2014 10:38 AM, Pádraig Brady wrote:
 
http://bugs.gnu.org/17505#37 was proposed to do the following
automatically (depending on the amount output):


  268435456 bytes (256 MiB) copied, 0.0248346 s, 10.8 GB/s

However that wasn't applied due to inconsistency concerns.
I'm still of the opinion that the change above would be a net gain,
as the number in brackets is for human interpretation, and in the vast
majority of cases would be the best representation for that.


   One patch that would not be inconsistent:

   If the user uses units of a single system (i.e. doesn't use SI and base-2
units in the same statement), then display the summary units using the same
notation the user used:

dd if=xx bs=256M
...(256M copied)
vs.
dd if=xx bs=256MB
...(256MB copied)...


Note another reason to _not_ apply the patch is that
requests to print the statistics can come async through SIGUSR1,
and thus increase the chances of inconsistent output.

Solves this too, since the units are decided when the command is parsed,
so SIGUSR1 would use the same units as would come out on a final summary.


Or is using consistent units w/what the user uses not ok?

Note, for statements w/o units (or mixed system), there would be no
reason to change current behavior.











bug#17505: dd statistics output

2014-07-21 Thread Linda Walsh

Found old bug, still open...

Pádraig Brady wrote:

On 07/16/2014 10:38 AM, Pádraig Brady wrote:
  

http://bugs.gnu.org/17505#37 was proposed to do the following automatically
(depending on the amount output):

  268435456 bytes (256 MiB) copied, 0.0248346 s, 10.8 GB/s

However that wasn't applied due to inconsistency concerns.
I'm still of the opinion that the change above would be a net gain,
as the number in brackets is for human interpretation, and in the vast
majority of cases would be the best representation for that.


   One patch that would not be inconsistent:

   If the user uses units of a single system (i.e. doesn't use SI and base-2
units in the same statement), then display the summary units using the same
notation the user used:

dd if=xx bs=256M
...(256M copied)
vs.
dd if=xx bs=256MB
...(256MB copied)...


Note another reason to _not_ apply the patch is that
requests to print the statistics can come async through SIGUSR1,
and thus increase the chances of inconsistent output.

Solves this too, since the units are decided when the command is parsed,
so SIGUSR1 would use the same units as would come out on a final summary.


Or is using consistent units w/what the user uses not ok?

Note, for statements w/o units (or mixed system), there would be no
reason to change
current behavior.
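
The suggestion above can be sketched in a few lines of Python. This is only an illustration of the idea, not code from dd or from any posted patch: `parse_size` and `summary` are hypothetical helpers, the suffix tables cover just a subset of dd's size suffixes (a bare letter is a power of 1024, a letter followed by "B" is a power of 1000), and the output uses MiB/MB labels rather than echoing the user's suffix verbatim. The point is that the unit system is fixed once, when the operand is parsed, so a SIGUSR1 report and the final summary would agree.

```python
import re

# Subset of dd's size suffixes (assumed sufficient for this sketch).
IEC = {"K": 1024, "M": 1024**2, "G": 1024**3}
SI = {"kB": 1000, "MB": 1000**2, "GB": 1000**3}


def parse_size(operand):
    """Return (byte_count, system); system is 'iec', 'si', or None."""
    m = re.fullmatch(r"(\d+)([A-Za-z]*)", operand)
    if not m:
        raise ValueError(f"invalid size: {operand!r}")
    number, suffix = int(m.group(1)), m.group(2)
    if suffix == "":
        return number, None            # no suffix: no preference expressed
    if suffix in IEC:
        return number * IEC[suffix], "iec"
    if suffix in SI:
        return number * SI[suffix], "si"
    raise ValueError(f"unknown suffix: {suffix!r}")


def summary(total_bytes, system):
    """Format the human-readable part using the system chosen at parse time."""
    if system == "iec":
        return f"{total_bytes} bytes ({total_bytes / 1024**2:.0f} MiB) copied"
    # 'si' or None: keep the current decimal behaviour.
    return f"{total_bytes} bytes ({total_bytes / 1000**2:.0f} MB) copied"


for operand in ("256M", "256MB", "65536"):
    size, system = parse_size(operand)
    print(operand, "->", summary(size, system))
```

With no suffix (or, in a fuller version, with mixed systems) the sketch falls through to the decimal output, matching the "no reason to change current behavior" case.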







bug#17505: dd statistics output

2014-07-16 Thread Pádraig Brady
On 07/16/2014 10:38 AM, Pádraig Brady wrote:
 On 07/16/2014 03:45 AM, Christian Groessler wrote:
 Hi,

 the final output of 'dd' is in SI mode (or whatever it is called). It uses 10^6
 instead of 2^20 for megabyte.

 Example:

 $ dd if=/dev/zero of=/dev/null bs=65536 count=4096
 4096+0 records in
 4096+0 records out
 268435456 bytes (268 MB) copied, 0.0248346 s, 10.8 GB/s
 $

 Is there a switch to display in traditional units? I'd like to have

 268435456 bytes (256 MB) copied, ...
 
 http://bugs.gnu.org/17505#37 was proposed to do the following automatically
 (depending on the amount output):
 
   268435456 bytes (256 MiB) copied, 0.0248346 s, 10.8 GB/s
 
 However that wasn't applied due to inconsistency concerns.
 I'm still of the opinion that the change above would be a net gain,
 as the number in brackets is for human interpretation, and in the vast
 majority of cases would be the best representation for that.

Note another reason to _not_ apply the patch is that
requests to print the statistics can come async through SIGUSR1,
and thus increase the chances of inconsistent output.

thanks,
Pádraig.
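
For reference, the arithmetic behind the two figures in the report (268 MB versus 256 MiB) can be checked with a short Python sketch. The constant names are illustrative and not taken from dd's source; the byte count comes from the bs=65536 count=4096 example.

```python
SI_MB = 1000 ** 2     # 10^6 bytes, the "MB" in dd's current summary
IEC_MIB = 1024 ** 2   # 2^20 bytes, the "MiB" in the proposed summary

total = 65536 * 4096  # bs * count from the example command

print(total)             # total bytes copied
print(total // SI_MB)    # whole decimal megabytes
print(total // IEC_MIB)  # whole binary mebibytes
print(total % IEC_MIB)   # remainder; 0 means an exact number of MiB
```

The total is exactly 2^28 bytes, so it is a round 256 in binary units but about 268.4 in decimal units.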





bug#17505: dd statistics output

2014-07-16 Thread Christian Groessler

On 07/16/14 15:42, Pádraig Brady wrote:

Note another reason to _not_ apply the patch is that
requests to print the statistics can come async through SIGUSR1,
and thus increase the chances of inconsistent output.



Sorry, I cannot follow. Which inconsistent output are you referring to?

regards,
chris
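
One way to picture the inconsistency in question is a Python sketch in which the unit is picked from the amount transferred so far. The selection rule here (use MiB only when the byte count is a whole number of MiB, otherwise MB) is an assumption made for illustration; the thread does not spell out the rule used by the patch at http://bugs.gnu.org/17505#37, and `human` is a hypothetical helper.

```python
MIB = 1024 ** 2
MB = 1000 ** 2


def human(total_bytes):
    """Pick a unit from the current amount (hypothetical rule, see above)."""
    if total_bytes and total_bytes % MIB == 0:
        return f"{total_bytes // MIB} MiB"
    return f"{total_bytes / MB:.0f} MB"


bs = 65536
mid_transfer = bs * 1000   # e.g. a SIGUSR1 report after 1000 records
final = bs * 4096          # the final summary after all 4096 records

print(human(mid_transfer))
print(human(final))
```

Under this assumed rule a report requested part-way through the same run can come out in MB while the final summary comes out in MiB, which is the kind of mismatch an asynchronous SIGUSR1 request could expose.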