On Friday, 5 January 2018 at 13:09:25 UTC, Vino wrote:
Sorry, I'm asking what problem you are solving, what the program should do, what its idea is. Not what code you have written.

Hi,

I am trying to implement data dictionary compression, and below is what the program does:

Function read:
This function reads a CSV file containing 3 columns, stores the values of each column in its own array, Col1: Array1 (Ucol1), Col2: Array2 (Ucol2), Col3: Array3 (Ucol3), and returns the data.

CSV file content:
Miller  America 23
John    India   42
Baker   Australia       21
Zsuwalski       Japan   45
Baker   America 45
Miller  India   23

Function Main
This function receives the data from the function read.
It creates an array based on the function's return type ( typeof(read()[i]) Data ), sorts the data, removes the duplicates and stores the result in that array. Then, using the “countUntil” function, we can accomplish the data dictionary compression.

Thank you for the explanation, this is a nice little task.
Here's my version of the solution. I've used ordinary arrays instead of std.container.array, since the data itself is in the GC'ed heap anyway. I used a CSV file separated by tabs, so I told csvReader to use '\t' as the delimiter.

import std.algorithm: countUntil, joiner, sort, uniq, map;
import std.csv: csvReader;
import std.stdio: File, writeln;
import std.typecons: Tuple, tuple;
import std.meta;
import std.array : array;

//we know types of columns, so let's state them once
alias ColumnTypes = AliasSeq!(string, string, int);
alias Arr(T) = T[];

auto readData() {
    auto file = File("data.csv", "r");
    Tuple!( staticMap!(Arr, ColumnTypes) ) res; // tuple of arrays
    foreach (record; file.byLineCopy.joiner("\n").csvReader!(Tuple!ColumnTypes)('\t'))
        foreach(i, T; ColumnTypes)
            res[i] ~= record[i]; // here res[i] can have different types
    return res;
}

//compress a single column
auto compress(T)(T[] col) {
    T[] vals = sort(col.dup[]).uniq.array;
    auto ks = col.map!(v => vals.countUntil(v)).array; // index of each value in the dictionary vals
    return tuple(vals, ks);
}

void main() {
    auto columns = readData();
    foreach(i, ColT; ColumnTypes) {
        // here the data can have different type for different i
        auto vk = compress(columns[i]);
        writeln(vk[0][]); //output data, you can write files here
        writeln(vk[1][]); //output indices
    }
}
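
The compress function above finds each index with countUntil, which is a linear scan of the dictionary for every row. Since the dictionary is sorted, a binary search is one possible alternative. Below is a minimal sketch of that variation together with the reverse step, assuming the same sorted-and-deduplicated dictionary. The names compressSorted and decompress are hypothetical and not part of the original program; assumeSorted and lowerBound come from std.range.

```d
import std.algorithm : map, sort, uniq;
import std.array : array;
import std.range : assumeSorted;
import std.typecons : tuple;

// Hypothetical variant of compress: dictionary-encode one column.
// Returns (sorted unique values, one dictionary index per row).
auto compressSorted(T)(T[] col) {
    T[] vals = sort(col.dup).uniq.array;
    auto dict = vals.assumeSorted;
    // lowerBound(v) is the slice of elements strictly less than v,
    // so its length is the position of v in the sorted dictionary.
    auto ks = col.map!(v => dict.lowerBound(v).length).array;
    return tuple(vals, ks);
}

// Hypothetical helper: rebuild the original column from dictionary and indices.
T[] decompress(T)(T[] vals, const(size_t)[] ks) {
    return ks.map!(k => vals[k]).array;
}

void main() {
    import std.stdio : writeln;
    // First column of the sample CSV from the question.
    auto names = ["Miller", "John", "Baker", "Zsuwalski", "Baker", "Miller"];
    auto vk = compressSorted(names);
    writeln(vk[0]); // ["Baker", "John", "Miller", "Zsuwalski"]
    writeln(vk[1]); // [2, 1, 0, 3, 0, 2]
    assert(decompress(vk[0], vk[1]) == names); // round trip restores the column
}
```

For a column with only a handful of distinct values the difference is negligible. The binary search only starts to matter when the dictionary gets large.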
