On Sunday, 4 September 2016 at 09:55:53 UTC, data pulverizer
wrote:
I am trying to build a data table object with unrestricted
column types. The approach I am taking is to build a generic
interface BaseVector class and then a subtype GenericVector(T)
which inherits from BaseVector. I then build a Table
class which contains a BaseVector array to represent the
columns in the table.
My main question is how to return GenericVector!(T) from the
getCol() method in the Table class instead of BaseVector.
Perhaps my Table implementation somehow needs to be linked to
GenericVector(T) or maybe I have written BaseTable instead and
I need to do something like a GenericTable(T...). However, my
previous approach created a tuple type data object but once
created, the type structure (column type configuration) could
not be changed so no addition/removal of columns.
------------------------------------------------
import std.stdio : writeln, write, writefln;
import std.format : format;
interface BaseVector{
    BaseVector get(size_t);
}
class GenericVector(T) : BaseVector{
    T[] data;
    alias data this;
    GenericVector get(size_t i){
        return new GenericVector!(T)(data[i]);
    }
    this(T[] arr){
        this.data = arr;
    }
    this(T elem){
        this.data ~= elem;
    }
    void append(T[] arr){
        this.data ~= arr;
    }
    override string toString() const {
        return format("%s", data);
    }
}
class Table{
private:
    BaseVector[] data;
public:
    // How to return GenericVector!(T) here instead of BaseVector
    BaseVector getCol(size_t i){
        return data[i];
    }
    this(BaseVector[] x ...){
        foreach(col; x)
            this.data ~= col;
    }
    this(BaseVector[] x){
        this.data ~= x;
    }
    this(Table x, BaseVector[] y ...){
        this.data = x.data;
        foreach(col; y){
            this.data ~= col;
        }
    }
    void append(BaseVector[] x ...){
        // Append each column once (appending x here would add the whole array per iteration).
        foreach(col; x)
            this.data ~= col;
    }
}
void main(){
    auto index = new GenericVector!(int)([1, 2, 3, 4, 5]);
    auto numbers = new GenericVector!(double)([1.1, 2.2, 3.3, 4.4, 5.5]);
    auto names = new GenericVector!(string)(["one", "two", "three", "four", "five"]);
    Table df = new Table(index, numbers, names);
    // I'd like this to be GenericVector!(T)
    writeln(typeid(df.getCol(0)));
}
Since BaseVector is a polymorphic type, you can't know in advance
(at compile time) the type of the object at a particular index.
The only way to get a typed result is to specify the type that
you expect, by providing a type parameter to the function.
The cast operator performs a dynamic cast at runtime, which
returns an object of the requested type, or null if the object
is of some other type. For example, as a member of Table:
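GenericVector!ExpectedType getTypedCol(ExpectedType)(size_t i){
    // Dynamic cast: yields null if the column holds another element type.
    auto col = cast(GenericVector!ExpectedType) data[i];
    // typeid on an interface reference gives the interface's static type,
    // so cast to Object first to report the dynamic class in the message.
    assert(col !is null,
        format("The vector at col %s is not of type %s, but %s",
            i, ExpectedType.stringof, typeid(cast(Object) data[i])));
    return col;
}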
void main(){
    auto index = new GenericVector!(int)([1, 2, 3, 4, 5]);
    auto numbers = new GenericVector!(double)([1.1, 2.2, 3.3, 4.4, 5.5]);
    auto names = new GenericVector!(string)(["one", "two", "three", "four", "five"]);
    Table df = new Table(index, numbers, names);
    // Cast to Object so that typeid reports the dynamic class of the column.
    auto colType = typeid(cast(Object) df.getCol(0));
    if (colType == typeid(GenericVector!string))
        writeln(df.getTypedCol!string(0).data);
    else if (colType == typeid(GenericVector!int))
        writeln(df.getTypedCol!int(0).data);
    // and so on...
}
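Since the dynamic cast already returns null on a type mismatch, you can also use the cast itself as the test and skip typeid entirely. Here is a minimal, self-contained sketch of that idea. The describe function is a hypothetical helper I made up for illustration, and BaseVector and GenericVector are cut down to the bare minimum, so treat it as one possible shape rather than the definitive way to do it:

```d
import std.format : format;
import std.stdio : writeln;

// Cut-down versions of the types from the question (illustration only).
interface BaseVector {}

class GenericVector(T) : BaseVector
{
    T[] data;
    this(T[] arr) { data = arr; }
}

// Hypothetical helper: try each expected element type in turn.
// Each cast is a runtime check that yields null on a mismatch,
// so the body of an `if` only runs for the matching column type.
string describe(BaseVector col)
{
    if (auto v = cast(GenericVector!int) col)
        return format("int column: %s", v.data);
    if (auto v = cast(GenericVector!double) col)
        return format("double column: %s", v.data);
    if (auto v = cast(GenericVector!string) col)
        return format("string column: %s", v.data);
    return "unknown column type";
}

void main()
{
    BaseVector[] cols;
    cols ~= new GenericVector!int([1, 2, 3]);
    cols ~= new GenericVector!string(["one", "two"]);
    foreach (col; cols)
        writeln(describe(col));
}
```

The downside is the same as with the typeid chain: every element type you want to support has to be listed by hand.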
Another way to approach the problem is to keep your data in an
Algebraic (https://dpaste.dzfl.pl/7a4e9bf408d1):
import std.meta : AliasSeq;
import std.variant : Algebraic, visit;
import std.stdio : writefln;
alias AllowedTypes = AliasSeq!(int[], double[], string[]);
alias Vector = Algebraic!AllowedTypes;
alias Table = Vector[];
void main()
{
    Vector indexes = [1, 2, 3, 4, 5];
    Vector numbers = [1.1, 2.2, 3.3, 4.4, 5.5];
    Vector names = ["one", "two", "three", "four", "five"];
    Table table = [indexes, numbers, names];
    foreach (idx, col; table)
        col.visit!(
            (int[] indexColumn) =>
                writefln("An index column at %s. Contents: %s",
                    idx, indexColumn),
            (double[] numberColumn) =>
                writefln("A number column at %s. Contents: %s",
                    idx, numberColumn),
            (string[] namesColumn) =>
                writefln("A string column at %s. Contents: %s",
                    idx, namesColumn)
        );
}
Application output:
An index column at 0. Contents: [1, 2, 3, 4, 5]
A number column at 1. Contents: [1.1, 2.2, 3.3, 4.4, 5.5]
A string column at 2. Contents: ["one", "two", "three", "four", "five"]
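Because Table is just a dynamic array of Algebraic values here, columns can still be added and removed at runtime, which was the limitation of your tuple-based attempt. A rough sketch of that, plus typed access to a single column through peek, might look like the following. The typedCol helper is a name I invented for illustration, and the aliases are repeated so the snippet stands on its own:

```d
import std.stdio : writeln;
import std.variant : Algebraic;

alias Vector = Algebraic!(int[], double[], string[]);
alias Table = Vector[];

// Hypothetical helper: returns the column as T[] if it currently holds a T[],
// otherwise null. peek gives a pointer to the stored value, or null on a
// type mismatch. T[] must be one of the allowed types, otherwise this does
// not compile.
T[] typedCol(T)(Vector[] table, size_t i)
{
    if (auto p = table[i].peek!(T[]))
        return *p;
    return null;
}

void main()
{
    Table table = [Vector([1, 2, 3]), Vector([1.1, 2.2, 3.3])];

    // Add a column at runtime.
    table ~= Vector(["one", "two", "three"]);

    // Remove column 1 (the doubles) by stitching the slices around it.
    table = table[0 .. 1] ~ table[2 .. $];

    writeln(table.length);              // two columns remain
    writeln(typedCol!int(table, 0));    // the int column
    writeln(typedCol!string(table, 1)); // the string column, now at index 1
    writeln(typedCol!double(table, 1) is null); // mismatch gives null
}
```

Note that this sketch cannot distinguish an empty column from a type mismatch if the stored array itself is null, so returning the pointer from peek directly may be the better choice in real code.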