GitHub user kaspersorensen commented on the pull request:
https://github.com/apache/metamodel/pull/17#issuecomment-104928231
Thinking about this some more, I guess the main principle that I am trying
to enforce is:
_metadata about a column must be consistent with the values you get when
you query that column_
In other words, it is currently consistent that ColumnType is always STRING
because we always return a String value.
I would be very open to a way of working with CSV files where you could
specify/override the column type of certain columns. That would actually be a
pretty cool feature - very related, but a different way to see it.
If that were the case, you could first use the untouched
CsvDataContext to analyze the content of various columns. If you feel confident
enough, you might then switch the column type from STRING to something else. And
when it is then set to e.g. INTEGER, our cast method should automatically
apply an Integer cast/conversion operation. The same would go for other types.
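To make the idea concrete, here is a rough sketch of what a per-column type override could look like. It is plain Java and does not use MetaModel's actual API: the class `ColumnTypeOverrideSketch`, the enum `SketchColumnType`, and the methods `overrideType`, `typeOf` and `convert` are all hypothetical names made up for illustration. The point is only that the reported type and the returned value stay consistent, both before and after an override.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not MetaModel code: columns default to STRING and
// can be overridden individually; conversion follows the declared type.
class ColumnTypeOverrideSketch {

    // Stand-in for a column type enum (assumption, not MetaModel's ColumnType).
    public enum SketchColumnType { STRING, INTEGER, DOUBLE, BOOLEAN }

    private final Map<String, SketchColumnType> overrides = new HashMap<>();

    // Declare that a column should be reported (and converted) as the given type.
    public void overrideType(String columnName, SketchColumnType type) {
        overrides.put(columnName, type);
    }

    // Metadata view: STRING unless the user has overridden the column.
    public SketchColumnType typeOf(String columnName) {
        return overrides.getOrDefault(columnName, SketchColumnType.STRING);
    }

    // Value view: convert the raw CSV text according to the declared type,
    // so the value always matches what typeOf() reports.
    public Object convert(String columnName, String rawValue) {
        if (rawValue == null) {
            return null;
        }
        switch (typeOf(columnName)) {
            case INTEGER:
                return Integer.valueOf(rawValue.trim());
            case DOUBLE:
                return Double.valueOf(rawValue.trim());
            case BOOLEAN:
                return Boolean.valueOf(rawValue.trim());
            default:
                return rawValue; // untouched String, as today
        }
    }

    public static void main(String[] args) {
        ColumnTypeOverrideSketch columns = new ColumnTypeOverrideSketch();
        // Before the override the column is a plain String.
        System.out.println(columns.typeOf("age") + " " + columns.convert("age", "42"));
        // After the override the same raw text comes back as an Integer.
        columns.overrideType("age", SketchColumnType.INTEGER);
        System.out.println(columns.typeOf("age") + " " + columns.convert("age", "42"));
    }
}
```

In this sketch a value that does not parse (say "abc" in a column set to INTEGER) throws a NumberFormatException. Whether a real implementation should fail, return null, or fall back to the String is an open design question that the sketch does not try to settle.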
Would that maybe be a model that fits us all?