Github user jingyimei commented on the issue:
https://github.com/apache/madlib/pull/325
@iyerr3 The install-check schema is per module, and a module may have multiple
SQL files.
For your proposed solution, what do you expect the single source of table
creation to contain? Do you mean we audit all the table creation statements,
combine all the data into one big table, and have every SQL file use that same
table? For example, for pagerank and hits in the `graph` module, would we
create only one vertex table and one edge table covering all the special
cases/data, and always use those two tables in all the SQL files? Besides, how
do we deal with special test cases that need only one row or a few rows in the
table instead of a bigger table? Also, in some dev-check files we assert that a
specific query returns a specific number of rows, and those assertions will
fail once the input table changes.
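To illustrate the row-count concern, here is a minimal sketch. It assumes a
hypothetical shared `edge` table and uses Python's `sqlite3` as a stand-in for
the real database; it is not MADlib's actual install-check harness or test data.

```python
import sqlite3

def count_edges(conn):
    """Return the number of rows in the (hypothetical) shared edge table."""
    return conn.execute("SELECT count(*) FROM edge").fetchone()[0]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edge (src INTEGER, dest INTEGER)")

# Data the first test file was written against.
conn.executemany("INSERT INTO edge VALUES (?, ?)", [(0, 1), (1, 2), (2, 0)])
rows_before = count_edges(conn)
assert rows_before == 3  # the existing row-count assertion passes

# Another test in the same module adds a special-case row (a self-loop)
# to the shared table.
conn.execute("INSERT INTO edge VALUES (?, ?)", (3, 3))
rows_after = count_edges(conn)

# The original expectation of 3 rows no longer holds.
still_passes = (rows_after == 3)
```

Any assertion written against the original three rows would have to be revisited
each time someone adds data for an unrelated test.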
The other concern is that this solution requires a lot of auditing and changes
in every SQL file. Developers need to understand how to add new rows in the
single table creation file and how the SQL files under the same module work
together. For example, if someone wants to add a test case for HITS under
graph, instead of adding a new case in the hits SQL file, they need to go to
another file to add the table data, and then go back to the hits SQL file to
add the test case. That sounds a bit confusing and inconvenient to me.
---