paul-rogers commented on PR #12770:
URL: https://github.com/apache/druid/pull/12770#issuecomment-1198787027
A bit late to the party on this one. I hit this while merging code, and I'm
not sure the fixes are quite right.
The fix in `ExternalTableMacro` essentially restricts the columns that can
appear in an external file. Yet, users have no control over the existing data.
It is not the job of a random CSV or JSON file to have a `__time` column of the
form needed by Druid. That's the job of the mapping layer.
Then, in `DruidTable`, we check again. But that's also the wrong place: it is
not the job of the table metadata to conform to Druid's requirements either.
The proper place for this kind of check is in the validator. But, since
`INSERT` queries are not yet validated, the next best place is in the
`DruidPlanner` where we're about to hand the `DruidSqlInsert` node off to the
`QueryMaker`. At that point, we can check if the user has done the proper
mapping.
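To illustrate, here is a minimal sketch of what such a planner-time check might look like. This is not Druid's actual API: the class name, the string-keyed row-signature map, and the type names are all hypothetical stand-ins for the real `RowSignature` machinery; the point is only that the check inspects the *output* of the `SELECT`, after the user's mapping has been applied.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: validate, just before handing an INSERT off to the
// QueryMaker, that the SELECT's output row signature maps __time to a
// timestamp type. Column-name-to-type map stands in for RowSignature.
public class InsertTimeCheck {
    // Returns an error message, or null if the signature is acceptable.
    public static String validateTimeColumn(Map<String, String> rowSignature) {
        String type = rowSignature.get("__time");
        if (type == null) {
            return "INSERT queries must include a __time column";
        }
        if (!"TIMESTAMP".equals(type)) {
            return "__time must be a TIMESTAMP, found " + type;
        }
        return null; // valid: the user has done the mapping
    }

    public static void main(String[] args) {
        // A CSV whose raw __time is a string fails until TIME_PARSE maps it.
        Map<String, String> rawCsv = new LinkedHashMap<>();
        rawCsv.put("__time", "VARCHAR");
        System.out.println(validateTimeColumn(rawCsv));

        Map<String, String> mapped = new LinkedHashMap<>();
        mapped.put("__time", "TIMESTAMP");
        System.out.println(validateTimeColumn(mapped));
    }
}
```

With a check placed here, the external table and `DruidTable` need no restrictions at all: any input schema is fine as long as the query maps it correctly.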
The result is that I should be able to have a CSV file with a `__time`
column as a string and do:
```sql
INSERT INTO foo
SELECT TIME_PARSE("__time") AS __time, ...
FROM TABLE(...)
```
To be clear about how SQL works here: the first `__time` is in the name space
of the input table's row signature. The second one, introduced by the `AS`
clause, is in the name space of the row signature of the `SELECT` statement.
Since we match by name for `INSERT`, this is also the column we'd insert into
the target data source, `foo`.
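The match-by-name semantics can be sketched as follows. This is an illustrative toy, not Druid code: `toTargetOrder` and its list-based signature are invented for the example; it just shows that `SELECT` output columns are matched to target columns by alias, not by position.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of match-by-name INSERT semantics: each SELECT
// output column is matched to the target column with the same name,
// so TIME_PARSE("__time") AS __time lands in the target's __time column
// regardless of where it appears in the SELECT list.
public class MatchByName {
    // Reorder one SELECT output row into the target's column order by name.
    public static List<Object> toTargetOrder(
            List<String> targetColumns,
            List<String> selectAliases,
            List<Object> selectRow) {
        Map<String, Object> byName = new HashMap<>();
        for (int i = 0; i < selectAliases.size(); i++) {
            byName.put(selectAliases.get(i), selectRow.get(i));
        }
        List<Object> out = new ArrayList<>();
        for (String col : targetColumns) {
            out.add(byName.get(col)); // null if the SELECT omitted this column
        }
        return out;
    }

    public static void main(String[] args) {
        // Target "foo" expects [__time, page]; the SELECT emits [page, __time].
        List<Object> row = toTargetOrder(
                List.of("__time", "page"),
                List.of("page", "__time"),
                List.of("Main_Page", 1690000000000L));
        System.out.println(row); // [1690000000000, Main_Page]
    }
}
```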
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]