Github user jingyimei commented on the issue:

https://github.com/apache/madlib/pull/325

@iyerr3 The install-check schema is per module, and we may have multiple files for one module. For your proposed solution, what do you expect in the single source of table creation? Do you mean we audit all the table creation, combine all the data into one big table, and have every sql file use that same table? For example, for pagerank and hits in the `graph` module, would we create only one vertex table and one edge table containing all the special cases/data, and always use those two tables in all the sql files?

Besides, how do we deal with special test cases that need only one row or a few rows in the table instead of a bigger table? Also, in some DEV checks we assert that a specific sql query returns a specific number of rows, and those assertions will fail once the input table changes.

The other concern is that this solution requires a lot of auditing and changes in every sql file. Developers need to understand how to add a new row in the single table creation file and how the sql files under the same module work together. For example, if someone wants to add a test case for HITS under graph, instead of adding a new case in the hits sql file, he needs to go to another file and add the table data, then go back to the hits sql file and add the test case. It sounds a bit confusing and inconvenient to me.
---