[ 
https://issues.apache.org/jira/browse/DRILL-7765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17151721#comment-17151721
 ] 

Paul Rogers commented on DRILL-7765:
------------------------------------

Thanks for the bug report. Drill's implementation is based on value vectors. 
Each vector (column) is of a single type. Drill infers the type of the column 
from the first record that has a value for that column. In this case, it 
appears that the Mongo connector decided that the type is a timestamp, then 
later tried to write a string into that same column, which didn't work since 
some kind of conversion is needed to transform the string into a timestamp.
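
To make the failure mode concrete, here is a minimal sketch of a column that 
locks its type on the first non-null value. It is a toy model only: the class 
SingleTypeColumn and its write() method are hypothetical names invented for 
this illustration and are not Drill's value vector or writer API.

```python
from datetime import datetime, timezone


class SingleTypeColumn:
    """Toy column that locks its type on the first non-null value."""

    def __init__(self, name):
        self.name = name
        self.value_type = None  # unknown until a non-null value arrives
        self.values = []

    def write(self, value):
        if value is None:
            self.values.append(None)  # nulls do not decide the type
            return
        if self.value_type is None:
            self.value_type = type(value)  # first value fixes the type
        elif type(value) is not self.value_type:
            # Analogous to writing a VarChar through a timestamp writer.
            raise TypeError(
                f"column {self.name!r} holds {self.value_type.__name__}, "
                f"got {type(value).__name__}"
            )
        self.values.append(value)


column = SingleTypeColumn("reportDate")
column.write(datetime(1970, 1, 1, tzinfo=timezone.utc))  # column is now a timestamp
try:
    column.write("1970-01-01T00:00:00Z")  # string into a timestamp column
    error = None
except TypeError as exc:
    error = str(exc)
print(error)
```

The second write fails for the same reason the query does: the column type 
was already decided, and nothing converts the string on the way in.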

Recent work on a revised JSON reader revealed that Drill has special processing 
for Mongo "extended" JSON types. An in-progress, but not-yet-committed change 
replaces this functionality with a new version that attempts to be much more 
forgiving. I can't recall what was done for this case, but a logical thing to 
try is to convert the string to a date, since the column is already known to 
hold dates.
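
As a rough sketch of that idea (not the code in the pending change), the 
conversion could look like the following. The function coerce_to_timestamp 
and the ISO_FORMAT layout are assumptions made for this example; the layout 
simply matches the string in the bug report.

```python
from datetime import datetime, timezone

# Assumed layout; it matches the string value in the bug report.
ISO_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def coerce_to_timestamp(value):
    """Return value as a UTC datetime, parsing it first if it is a string."""
    if isinstance(value, datetime):
        return value
    if isinstance(value, str):
        # The trailing "Z" is matched literally, so attach UTC explicitly.
        return datetime.strptime(value, ISO_FORMAT).replace(tzinfo=timezone.utc)
    raise TypeError(f"cannot convert {type(value).__name__} to timestamp")


records = [
    datetime(1970, 1, 1, tzinfo=timezone.utc),  # stored as a date
    "1970-01-01T00:00:00Z",                     # stored as a string
]
timestamps = [coerce_to_timestamp(v) for v in records]
```

A string that does not match the layout raises ValueError here; a real 
reader would have to decide whether to fail the query or produce a null.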

Note, however, that this is not actually a stable solution. Suppose the records 
were in the opposite order: string first, then date. In that case, we'd have to 
convert the date to a string. The result would be that, depending on data 
storage order, some days your query would produce a string column, others a 
date column. Not at all ideal.
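
The order dependence can be shown with a small sketch in which the first 
value decides the column type and later values are coerced to it. The 
read_column function is a hypothetical helper for illustration, not a 
description of how any Drill reader works.

```python
from datetime import datetime, timezone

ISO_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # assumed layout for the string form


def read_column(values):
    """Coerce every value to the type of the first one (first value wins)."""
    first_type = type(values[0])
    out = []
    for v in values:
        if first_type is datetime and isinstance(v, str):
            v = datetime.strptime(v, ISO_FORMAT).replace(tzinfo=timezone.utc)
        elif first_type is str and isinstance(v, datetime):
            v = v.strftime(ISO_FORMAT)
        out.append(v)
    return first_type.__name__, out


date_value = datetime(1970, 1, 1, tzinfo=timezone.utc)
text_value = "1970-01-01T00:00:00Z"

# Same two records, opposite storage order, different column type.
date_first = read_column([date_value, text_value])
text_first = read_column([text_value, date_value])
print(date_first[0], text_first[0])
```

The data is identical in both calls; only the order changes, yet the 
resulting column type differs.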

Drill has an (eternally) experimental feature for "union" types. However, most 
operators don't support that type and it has never been clear how they could. 
(How does one group a combination of dates and strings, say? How do we join 
them?)

Using all-text mode is a good alternative. Then, your query can decide what to 
do with the strings.

 

> Query failing for Mixed Datatype Date and String
> ------------------------------------------------
>
>                 Key: DRILL-7765
>                 URL: https://issues.apache.org/jira/browse/DRILL-7765
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - MongoDB
>    Affects Versions: 1.17.0
>            Reporter: Swatantra Agrawal
>            Priority: Critical
>
> When a single field holds two data types, i.e. String and Date, the 
> following exception is thrown:
> {noformat}
> org.apache.drill.common.exceptions.UserRemoteException: INTERNAL_ERROR ERROR: 
> You tried to write a VarChar type when you are using a ValueWriter of type 
> NullableTimeStampWriterImpl.
> {noformat}
>  
> Steps to Reproduce:
> Create 2 records in a collection say tmp:
> {noformat}
> db.tmp.save({"_id" : "date", "reportDate" : ISODate("1970-01-01T00:00:00Z")});
> db.tmp.save({"_id" : "date", "reportDate" : "1970-01-01T00:00:00Z"}); 
> {noformat}
>  
> Fire Drill Query to see the above Exception:
> {noformat}
> Select reportDate from tmp;
> {noformat}
>  
> Additional Setting: 
> *store.mongo.all_text_mode* is set to true.
>  
>  



