Github user drexler42 commented on the pull request:
https://github.com/apache/metamodel/pull/17#issuecomment-122790553
I'll just add my 2 cents... CSV data is not "typed" the way a database is.
It would be nice if the user could impose a data model on top of the CSV
file to express their intent. For example, the CSV file could be a database
dump file. The user would then like to declare String, Numerical and Boolean
columns.
So the first thing the user wants is to declare the column types. Detecting
column types is much less needed IMHO. A value such as "foo bar" in a column
declared as Numerical would then result in a type exception for that row.
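
To make the idea concrete, here is a minimal sketch in Python of declaring column types on top of a CSV file. It is not MetaModel's API; the names `SCHEMA`, `CONVERTERS`, `read_typed` and `ColumnTypeError` are hypothetical and only illustrate the behaviour described above, assuming the file has a header row.

```python
import csv
import io


class ColumnTypeError(Exception):
    """Raised when a cell does not match its declared column type."""


def parse_boolean(text):
    # Hypothetical rule: only "true" and "false" (any case) are booleans.
    lowered = text.strip().lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    raise ValueError(f"not a boolean: {text!r}")


# Hypothetical mapping from declared type name to a converter function.
CONVERTERS = {"string": str, "number": float, "boolean": parse_boolean}

# The user's declared data model: column name -> type name.
SCHEMA = {"name": "string", "age": "number", "active": "boolean"}


def read_typed(source, declared_types):
    """Yield one dict per row, converting each cell to its declared type."""
    reader = csv.DictReader(source)
    for row_number, row in enumerate(reader, start=1):
        typed = {}
        for column, type_name in declared_types.items():
            raw = row[column]
            try:
                typed[column] = CONVERTERS[type_name](raw)
            except ValueError as error:
                # A mismatch fails the row with a descriptive type exception.
                raise ColumnTypeError(
                    f"row {row_number}, column {column!r}: "
                    f"{raw!r} is not a {type_name}"
                ) from error
        yield typed
```

With this sketch, a row whose `age` cell contains `foo bar` raises `ColumnTypeError` instead of being silently read as a string.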
Next, we could think about handling exceptional cases like:
- Handling missing values with some sort of coalesce function;
- Converting invalid values into some special value.
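
The two cases above could look roughly like the following Python sketch. The function `convert_cell` and its parameters `missing_default` and `invalid_value` are hypothetical names chosen for illustration, and treating blank cells as missing is an assumption.

```python
def convert_cell(raw, converter, missing_default=None, invalid_value=None):
    """Convert one raw CSV cell, tolerating missing and invalid values."""
    # Coalesce: treat None and blank strings as missing (an assumption)
    # and fall back to the caller-supplied default.
    if raw is None or raw.strip() == "":
        return missing_default
    try:
        return converter(raw)
    except ValueError:
        # Invalid value: substitute the special value instead of failing.
        return invalid_value
```

For example, `convert_cell("", float, missing_default=0.0)` returns the default, while `convert_cell("foo bar", float, invalid_value=-1.0)` returns the special value rather than raising.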
This does not conflict with Kasper's notion of changing the column types, but
it is a bit less ambitious. I would be very content if it were possible to
declare column types statically.