[
https://issues.apache.org/jira/browse/DRILL-7765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17151721#comment-17151721
]
Paul Rogers commented on DRILL-7765:
------------------------------------
Thanks for the bug report. Drill's implementation is based on value vectors.
Each vector (column) holds values of a single type. Drill infers the type of
the column from the first record that has a value for that column. In this
case, it appears that the Mongo connector decided that the type is a timestamp,
then later tried to write a string into that same column. That fails because
some kind of conversion is needed to transform the string into a timestamp.
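
To make the failure mode concrete, here is a toy model of a single-typed
column. It is only a sketch of the behavior described above, not Drill's value
vector or writer code: the class name, methods, and error text are all
hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a single-typed column; not a Drill class.
class SingleTypeColumn {
    private Class<?> type;                        // fixed by the first non-null value
    private final List<Object> values = new ArrayList<>();

    void write(Object value) {
        if (value == null) {                      // nulls carry no type information
            values.add(null);
            return;
        }
        if (type == null) {
            type = value.getClass();              // first value decides the column type
        } else if (!type.isInstance(value)) {
            // No implicit conversion: a later value of another type is rejected.
            throw new IllegalStateException(
                "Tried to write " + value.getClass().getSimpleName()
                + " into a column of type " + type.getSimpleName());
        }
        values.add(value);
    }

    int size() {
        return values.size();
    }
}
```

Writing a timestamp first and a string second makes the second write throw,
which mirrors the shape of the reported error.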

Recent work on a revised JSON reader revealed that Drill has special processing
for Mongo "extended" JSON types. An in-progress, not-yet-committed change
replaces this functionality with a new version that attempts to be much more
forgiving. I can't recall what was done for this case, but a logical thing to
try is to convert the string to a date, since the field is already known to be
a date.

Note, however, that this is not actually a stable solution. Suppose the records
were in the opposite order: string first, then date. In that case, we'd have to
convert the date to a string. The result would be that, depending on data
storage order, some days your query would produce a string column, other days
a date column. Not at all ideal.
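
The order dependence can be sketched as follows. This is an illustration
under assumptions, not the revised reader: the class and method are
hypothetical, and the "first value wins, later values are converted" rule is
just the idea floated above.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; illustrates first-value-wins coercion only.
class FirstValueCoercion {
    static List<Object> coerce(List<Object> raw) {
        Class<?> type = null;
        List<Object> out = new ArrayList<>();
        for (Object v : raw) {
            if (v == null) {
                out.add(null);
                continue;
            }
            if (type == null) {
                type = v.getClass();              // column type comes from the first value
            }
            if (type == Instant.class && v instanceof String) {
                out.add(Instant.parse((String) v));   // string -> timestamp (ISO-8601)
            } else if (type == String.class && v instanceof Instant) {
                out.add(v.toString());                // timestamp -> ISO-8601 string
            } else {
                out.add(v);
            }
        }
        return out;
    }
}
```

Feeding the two records from the report in one order yields a column of
`Instant` values; feeding them in the other order yields a column of strings.
The data is the same, but the result type differs.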

Drill has an (eternally) experimental feature for "union" types. However, most
operators don't support that type and it has never been clear how they could.
(How does one group a combination of dates and strings, say? How do we join
them?)

Using all-text mode is a good alternative. Then, your query can decide what to
do with the strings.

> Query failing for Mixed Datatype Date and String
> ------------------------------------------------
>
> Key: DRILL-7765
> URL: https://issues.apache.org/jira/browse/DRILL-7765
> Project: Apache Drill
> Issue Type: Bug
> Components: Storage - MongoDB
> Affects Versions: 1.17.0
> Reporter: Swatantra Agrawal
> Priority: Critical
>
> When a single field has 2 datatypes i.e. String and Date, the following
> exception is thrown:
> {noformat}
> org.apache.drill.common.exceptions.UserRemoteException: INTERNAL_ERROR ERROR:
> You tried to write a VarChar type when you are using a ValueWriter of type
> NullableTimeStampWriterImpl.
> {noformat}
>
> Steps to Reproduce:
> Create 2 records in a collection say tmp:
> {noformat}
> db.tmp.save({"_id" : "date", "reportDate" : ISODate("1970-01-01T00:00:00Z")});
> db.tmp.save({"_id" : "date", "reportDate" : "1970-01-01T00:00:00Z"});
> {noformat}
>
> Fire Drill Query to see the above Exception:
> {noformat}
> Select reportDate from tmp;
> {noformat}
>
> Additional Setting:
> *store.mongo.all_text_mode* is set to true.
>
>