Right now we have a slightly odd naming convention for schema and table
names when building metadata for e.g. a CSV file or a fixed-width value
file.

Schema name: The filename, including file extension.
Table name: The filename without extension.
Resulting in e.g. a column path like this: people.csv.people.name

I suggest we change it to this convention:

Schema name: Folder name
Table name: The filename, including file extension.
Resulting in e.g. a column path like this: documents.people.csv.name
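
To make the difference concrete, here is a minimal sketch in plain Java of
how the two conventions derive a column path from a file location. It does
not use any MetaModel API; the class and method names are hypothetical and
only there for illustration, and it assumes the file sits in a named
(non-root) folder.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical illustration only; not part of MetaModel.
final class NamingConventionSketch {

    // Current convention: schema = file name with extension,
    // table = file name without extension.
    static String currentColumnPath(Path file, String column) {
        String fileName = file.getFileName().toString();
        int dot = fileName.lastIndexOf('.');
        String tableName = dot > 0 ? fileName.substring(0, dot) : fileName;
        return fileName + "." + tableName + "." + column;
    }

    // Proposed convention: schema = parent folder name,
    // table = file name with extension.
    static String proposedColumnPath(Path file, String column) {
        Path parent = file.toAbsolutePath().getParent();
        if (parent == null || parent.getFileName() == null) {
            // Assumption: files directly under a file system root are not handled.
            throw new IllegalArgumentException("File has no named parent folder: " + file);
        }
        String schemaName = parent.getFileName().toString();
        String tableName = file.getFileName().toString();
        return schemaName + "." + tableName + "." + column;
    }

    public static void main(String[] args) {
        Path file = Paths.get("documents", "people.csv");
        System.out.println(currentColumnPath(file, "name"));  // people.csv.people.name
        System.out.println(proposedColumnPath(file, "name")); // documents.people.csv.name
    }
}
```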

Why do I think this would be an improvement?

1) Because it makes sense for the user to see the file system's hierarchy
reflected in the schema model.
2) Because it allows us to make these DataContexts operate not on a single
file, but on a directory of files. I have seen quite a number of times by
now that users of MetaModel, or users of e.g. DataCleaner (which uses
MetaModel quite heavily), want to do this.
3) Because stripping the file extension is kind of broken and a strange
convention in the first place.
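
As a rough sketch of point 2, this is how a directory could map onto one
schema with one table per file under the proposed convention. Again this is
plain Java with hypothetical names, not an existing MetaModel class, and it
assumes a flat, non-root directory (no recursion into subfolders).

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; not part of MetaModel.
final class DirectorySchemaSketch {

    // Schema name = the directory's own name (assumes a non-root directory).
    static String schemaName(Path directory) {
        return directory.toAbsolutePath().normalize().getFileName().toString();
    }

    // Table names = the names of the regular files in the directory,
    // including their extensions, sorted for a stable order.
    static List<String> tableNames(Path directory) throws IOException {
        List<String> names = new ArrayList<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(directory)) {
            for (Path file : files) {
                if (Files.isRegularFile(file)) {
                    names.add(file.getFileName().toString());
                }
            }
        }
        Collections.sort(names);
        return names;
    }
}
```

Keeping the extension in the table name means that e.g. people.csv and
people.txt in the same folder would not collide.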

While this doesn't really break backwards compatibility in terms of Java
code, it would break configuration files and anything else in applications
using MetaModel that refers to the old schema and table names. But I do
believe that can be communicated and handled by carefully explaining the
new convention on the migration page (that I recently started writing [1]).

What do you think?

[1] http://wiki.apache.org/metamodel/MigratingFromEobjectsMetaModel
