Re: code D'ish enough? - ignore previous post with same subject

ag0aep6g via Digitalmars-d-learn Sun, 26 Feb 2017 13:07:23 -0800

On Sunday, 26 February 2017 at 19:34:33 UTC, thorstein wrote:

// Reads CSV-like files with only numeric values in each column
// new_ndv replaces ndv, which is the original no-data-value
double[][]* readNumMatCsv(char[] filePath, int numHeaderLines,char[] ndv, char[] new_ndv)


* "no-data-value"?

* Returning a pointer seems dubious. I'd expect double[][] towork.

* It's not obvious to me what "Mat" means in "readNumMatCsv".
* The char[] arguments should be const if possible.

* You can probably add some attributes to the function: pure,@safe.

{ double[][]* p_numArray;


This brace style is rather uncommon.

  double[][] numArray;
  char[] line;
  string noCont = "File content not usable. Quit here.";
  string noFile = "Could not read file. Quit here.";
string nonNum = "Found a non-numeric value in data matrix.Quit here.";

Those message constants should be enums or static immutable.Personally, I'd just put the literals directly in the writelncalls, as you're not using them only once.

  Regex!char re = regex(r"(\n$)");

  if(exists(filePath))
  { File f = File(filePath, "r");
    if((line = f.readln().dup).length > 0)

As far as I see, you're never using the line you're reading here.Does this just skip the header line? You're also not usingnumHeaderLines. Maybe this should loop numHeaderLines times?There's no need to assign to `line` then. Also no need to dup.

    { while (!f.eof())
// 1st replace ndv with new_ndv, 2nd remove all \n, 3rdcheck id size of read line >0{ if((line = replaceAll(f.readln().dup.replace(ndv,new_ndv), re, "")).length > 0)

byLineCopy may be simpler to use here than readln. For this usecase, byLine (no copy) could probably work, too. But byLineCopyis less bug prone, so better start with that one.


https://dlang.org/phobos/std_stdio.html#.File.byLineCopy

It's still not clear to me what's the deal with ndv and new_ndv.

A line should contain at most one newline, at the end. There's aspecial function in std.string to remove an optional suffic:chomp. And with byLineCopy there's a parameter to omit thenewline. I'd prefer both of those over regex here.


https://dlang.org/phobos/std_string.html#.chomp

        // check if all elements of splitted line are numeric
        { foreach(i;split(line,","))


A string that's called "i" may be surprising to some.

Could probably use the range version of `split`:std.algorithm.iteration.splitter. That would avoid an allocation.But split is ok if you're more comfortable with arrays than withranges.


https://dlang.org/phobos/std_algorithm_iteration.html#.splitter

          { if(isNumeric(i) == false)


if (!isNumeric(i))

            // otherwise return pointer to empty array
            { writeln(nonNum);
              return p_numArray;
            }
          }
          // convert characters to double
          if(split(line,",").length > 0)
          { numArray ~= to!(double[])(split(line,","));

You're executing `split(line, ",")` three times. Better do itjust once and save the result in a (const) variable.

Instead of checking beforehand if all values are numeric, you canalso just let to!(double[]) fail, and catch the exception if youwant.

If I read it right, the `.length > 0` check means that blanklines are allowed at this point, right? Seems inconsistent toforbid a blank leading line but allow them later on.

          }
        }
      }
      // pass reference to pointer
      p_numArray = &numArray;

No, no, no. This is a pointer to a local variable and later youreturn it. That's invalid, undefined behavior. Don't do it. Justreturn numArray itself. As expected, you don't need to return apointer, just double[][] is fine.

      // first line empty -> return pointer to empty array
    } else { writeln(noCont); }
    // file could not be find
  } else { writeln(noFile); }

Instead of the nested `if`s style, you could reverse theconditions and return early, or maybe throw an Exception:


----
if (/* ... no file ... */)
{
    writeln(noFile);
    return [];
}
if (/* ... no content ... */)
{
    writeln(noCont);
    return [];
}
/* ... rest of the code ... */
----

That saves some indentation levels, and makes the purpose ofthose checks more obvious.

  return p_numArray;
}

Re: code D'ish enough? - ignore previous post with same subject

Reply via email to