Github user iyerr3 commented on the issue:
https://github.com/apache/madlib/pull/325
> > * Use a function-specific name (`"pagerank_vertex"`) instead of a
generic name with (`"vertex"`) no drop by in the beginning.
>
> This seems doesn't solve the potential conflict with user created table.
It does with the original suggestion of not dropping a newly created table
the first time. That suggestion led to this discussion about same table name
being reused and the need to `drop table` before `create table` (which should
not actually be a problem - see end of comment).
>Besides, in some IC/DC file we drop tables in between and create table
with the same name again. In this case the second drop/creation will fail if
user already created it.
No; the table *created* in install-check will be in the install-check
schema and the subsequent drop will look inside that schema. The original issue
is due to a `drop table` statement on a table that was *not created in the
file*.
> > * Create a single `"vertex"` table (with no drop by statement) and use
that for all graph functions
>
> This may cause some assertion based on returning number of records fail
as I mentioned in the previous comment, and also needs effort of audition and
combining test data.
If the data required in a table is different for different tests then we
would benefit by giving it different names. This is a problem mostly in graphs
where "vertex" and "edge" are re-used. These names are already vague and should
ideally be renamed.
> > * Schema-qualify the data table with the install-check schema name.
>
> (not sure if I totally understand this) install-check schema is based on
module, not algo, for example pagerank and hits and sssp use the same IC
schema, and there will be issue between running two sql files for
droping/creating same table.
Schema-qualifying was an alternative suggestion to avoid looking into the
`madlib` schema. With the schema-qualification, we can drop the table just like
we're doing now (so no conflicts between test files).
**Also note:** `madpack` is supposed to drop the file-specific schema after
executing each file (see function
[_execute_per_module_install_dev_check_algo](https://github.com/apache/madlib/blob/03e82bc50494367c6b65dced66cf0042b3060042/src/madpack/madpack.py#L768)).
Hence, common table names in independent tests within same module are not
supposed to conflict with each other (if you've seen this happen then it
requires investigation).
Finally, a guideline MADlib followed a few years ago was to not issue drop
table statements in IC since all created tables are in a unique schema and will
be dropped at end of test execution (I plead guilty of ignoring this
guideline). This requires the names of data and output tables within a file to
be unique - hopefully, we give such tables meaningful names so that we can
identify the cause of a failure from the name.
**In summary**: I suggest we don't circumvent the original issue, rather
fix it by giving meaningful names and avoiding `drop table` statements in
IC/DC.
---