On Thursday, 9 February 2017 at 19:39:49 UTC, Nestor wrote:
On Thursday, 9 February 2017 at 18:34:30 UTC, Era Scarecrow wrote:
On Thursday, 9 February 2017 at 17:36:11 UTC, Nestor wrote:
I was trying to port C code from the article in Wikiversity [1] to D, but I'm not sure this implementation is the most efficient way to do it in D, so suggestions to optimize it are welcome:

import std.stdio;

static immutable char[] QG10Matrix =
  "03175986427092154863420687135917509834266123045978" ~
  "36742095815869720134894536201794386172052581436790";

char checkDigit(string str) {
  char tmpdigit = '0';
  foreach(chr; str) tmpdigit = QG10Matrix[(chr - '0') + (tmpdigit - '0') * 10];
  return tmpdigit;
}
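
By the way, a quick sanity check with the standard Damm test vector (572 gets check digit 4, and a number with its check digit appended maps back to '0') could look like this, assuming I read the algorithm right:

unittest {
  // standard Damm example: the check digit of 572 is 4
  assert(checkDigit("572") == '4');
  // appending the check digit makes the number validate to '0'
  assert(checkDigit("5724") == '0');
}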

Well, one thing is you can probably reduce them from chars to just bytes; instead of having to subtract '0' every time, you can add it back once at the end. Although unless you're working with a VERY large input you won't see a difference.

Actually, since you're also multiplying by 10, you can incorporate that into the table too (although a mixin might be better for the conversion than doing it by hand; see the sketch after the code below):


 static immutable ubyte[] QG10Matrix = [
     0,30,10,70,50,90,80,60,40,20,
    70, 0,90,20,10,50,40,80,60,30,
    40,20, 0,60,80,70,10,30,50,90,
    10,70,50, 0,90,80,30,40,20,60,
    60,10,20,30, 0,40,50,90,70,80,
    30,60,70,40,20, 0,90,50,80,10,
    50,80,60,90,70,20, 0,10,30,40,
    80,90,40,50,30,60,20, 0,10,70,
    90,40,30,80,60,10,70,20, 0,50,
    20,50,80,10,40,30,60,70,90, 0];

 char checkDigit(string str) {
   ubyte tmpdigit = 0;
   // entries are premultiplied by 10, so tmpdigit doubles as the row offset
   foreach(chr; str) tmpdigit = QG10Matrix[(chr - '0') + tmpdigit];
   return cast(char)((tmpdigit / 10) + '0');
 }
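
And rather than converting the table by hand, you could derive it from the original digit string at compile time. A rough sketch (I'm using a CTFE-evaluated function literal here instead of a string mixin; QG10MatrixX10 is just a name I made up):

static immutable ubyte[100] QG10MatrixX10 = () {
    immutable string src =
        "03175986427092154863420687135917509834266123045978" ~
        "36742095815869720134894536201794386172052581436790";
    ubyte[100] tbl;
    // premultiply each entry by 10 so the row offset is baked into the table
    foreach (i, c; src) tbl[i] = cast(ubyte)((c - '0') * 10);
    return tbl;
}();

That way the two representations can't drift apart.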



OK, I changed the approach, using a multidimensional array for the matrix so I could ditch the arithmetic operations altogether. But curiously, after measuring a few thousand runs of both implementations with avgtime, I see no noticeable difference. Why?

import std.stdio;

static immutable ubyte[][] QG10Matrix = [
  [0,3,1,7,5,9,8,6,4,2],[7,0,9,2,1,5,4,8,6,3],
  [4,2,0,6,8,7,1,3,5,9],[1,7,5,0,9,8,3,4,2,6],
  [6,1,2,3,0,4,5,9,7,8],[3,6,7,4,2,0,9,5,8,1],
  [5,8,6,9,7,2,0,1,3,4],[8,9,4,5,3,6,2,0,1,7],
  [9,4,3,8,6,1,7,2,0,5],[2,5,8,1,4,3,6,7,9,0],
];

static int charToInt(char chr) {
  // map '0'..'9' to 0..9; flag anything else with -1 (the caller's table
  // lookup will then fail with a RangeError)
  if (chr < '0' || chr > '9') return -1;
  return chr - '0';
}

ubyte checkDigit(string str) {
  ubyte tmpdigit;
  foreach(chr; str) tmpdigit = QG10Matrix[tmpdigit][charToInt(chr)];
  return tmpdigit;
}

enum {
  EXIT_SUCCESS = 0,
  EXIT_FAILURE = 1,
}

int main(string[] args) {
  // validate the input up front (a scope(failure) guard is not allowed
  // to return from the function)
  import std.algorithm : all;
  import std.ascii : isDigit;
  if (args.length != 2 || !args[1].all!isDigit) {
    writeln("Invalid arguments. You must pass a number.");
    return EXIT_FAILURE;
  }
  ubyte digit = checkDigit(args[1]);
  if (digit == 0) writefln("%s is a valid number.", args[1]);
  else {
    writefln("%s is not a valid number (but it would be, appending digit %s).",
      args[1], digit);
  }

  return EXIT_SUCCESS;
}
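
(As a quick sanity check with the usual Damm example: passing 572 reports that appending 4 would make it valid, while 5724 is accepted as valid.)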

The question is, why do you expect it to be noticeably faster? On modern hardware, the optimizations are such that so small a change in code is very hard to link to a difference in running time. If you really want to show one, the following code does the benchmark the right way (i.e. using a very long input, tens of thousands of runs, and avoiding I/O and program load time, so as to compare the bare function implementations):

import std.conv;
import std.stdio;
import std.range;
import std.array;
import std.algorithm;

string testcase;

static immutable char[] QG10MatrixOne =
  "03175986427092154863420687135917509834266123045978" ~
  "36742095815869720134894536201794386172052581436790";

char checkDigitOne(string str) {
  char tmpdigit = '0';
  foreach(chr; str) tmpdigit = QG10MatrixOne[(chr - '0') + (tmpdigit - '0') * 10];
  return tmpdigit;
}

void testCheckDigitOne() {
    checkDigitOne(testcase);
}

static immutable ubyte[][] QG10MatrixTwo = [
  [0,3,1,7,5,9,8,6,4,2],[7,0,9,2,1,5,4,8,6,3],
  [4,2,0,6,8,7,1,3,5,9],[1,7,5,0,9,8,3,4,2,6],
  [6,1,2,3,0,4,5,9,7,8],[3,6,7,4,2,0,9,5,8,1],
  [5,8,6,9,7,2,0,1,3,4],[8,9,4,5,3,6,2,0,1,7],
  [9,4,3,8,6,1,7,2,0,5],[2,5,8,1,4,3,6,7,9,0],
];

static int charToInt(char chr) {
  // map '0'..'9' to 0..9; flag anything else with -1 (the caller's table
  // lookup will then fail with a RangeError)
  if (chr < '0' || chr > '9') return -1;
  return chr - '0';
}

ubyte checkDigitTwo(string str) {
  ubyte tmpdigit;
  foreach(chr; str) {
      tmpdigit = QG10MatrixTwo[tmpdigit][charToInt(chr)];
  }
  return tmpdigit;
}

void testCheckDigitTwo() {
    checkDigitTwo(testcase);
}

void main(string[] args) {
  testcase = iota(10).cycle.take(100000).map!(to!string).array.join;

  import std.datetime: benchmark;
  benchmark!(testCheckDigitOne, testCheckDigitTwo)(10000).each!writeln;
}


On my machine I get the following times (the absolute numbers depend on my hardware; what matters is the difference):

TickDuration(15785255852)
TickDuration(15784826803)

So it seems that the first version is slower than the second one, but by so little that it's hard to attribute the difference to the implementation. If anything, it shows the futility of such low-level tricks as meaningless micro-optimization. And by the way, note that the benchmark avoids measuring I/O, because we were interested in the functions. If we were measuring it, I/O would represent about four fifths of the running time (on my machine, with numbers as long as the one used in the benchmark). This means even a 20% improvement in checkDigit would only translate to a 4% improvement of the program.
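
To spell out that last bit of arithmetic (the four-fifths I/O share is my rough observation, not a precise measurement), a throwaway sketch:

import std.stdio;

void main() {
    // if checkDigit accounts for ~1/5 of total runtime, a 20% speedup of
    // checkDigit alone shrinks the whole program by 0.2 * 0.2 = 0.04, i.e. ~4%
    immutable computeShare = 0.2; // fraction of runtime spent in checkDigit
    immutable localGain = 0.2;    // hypothetical 20% improvement
    writefln("overall improvement: %.0f%%", computeShare * localGain * 100);
}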
