Csaba Ringhofer created IMPALA-9575:
---------------------------------------

             Summary: Add basic BINARY support
                 Key: IMPALA-9575
                 URL: https://issues.apache.org/jira/browse/IMPALA-9575
             Project: IMPALA
          Issue Type: Sub-task
          Components: Backend, Frontend
            Reporter: Csaba Ringhofer


An initial testable implementation of BINARY would contain the following:
- DDL support for BINARY, e.g. CREATE TABLE
- read support for text files (values stored with base64 encoding)
- basic client support (HS2, Beeswax)
- casts from/to STRING
- basic operators (=, <, >), all of which should work the same way as for STRING
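
To illustrate the text-file and operator items above, here is a minimal Python
sketch, not Impala code. It assumes each BINARY field in the text file is plain
standard base64, and that comparisons are byte-wise on unsigned byte values
(how STRING comparison is understood to behave here; that is an assumption).
The function name and sample values are hypothetical.

```python
import base64


def decode_binary_field(field: str) -> bytes:
    """Decode one base64-encoded text field into raw bytes."""
    # validate=True rejects characters outside the base64 alphabet
    # instead of silently skipping them.
    return base64.b64decode(field, validate=True)


# Hypothetical column values as they might appear in a text file.
encoded_fields = ["AAEC", "/w==", "aGVsbG8="]
decoded = [decode_binary_field(f) for f in encoded_fields]

# Python compares bytes objects byte by byte as unsigned values,
# which stands in for the assumed =, <, > semantics.
smallest = min(decoded)
largest = max(decoded)
```

Note that the decoded values, not the base64 text, are what gets compared:
the ordering of the encoded strings can differ from the ordering of the bytes.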

Optional in the first step:
- write support for text files
- joins on BINARY columns
- aggregates on BINARY columns

Hive also allows partitioning by BINARY columns, but that seems buggy, and I
would prefer not to support it in Impala.

The last time a new type (DATE) was added to Impala, it was a massive change:
https://gerrit.cloudera.org/#/c/12481/

I hope that BINARY will be much simpler, as:
- The backend should handle it exactly the same way as STRING, so the backend
work may be minimal (only the file readers/writers have to differentiate
between the two). This differs from Hive, where STRING is treated as UTF-8
and BINARY is not.
- The frontend should also treat it similarly to STRING, just with far fewer
capabilities, e.g. no casts to types other than STRING, and UDFs that expect
STRING should not accept it.
- As BINARY supports very few features, tests also need to cover far fewer
cases.
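
The frontend rules proposed above can be sketched as a small type check. This
is an illustrative Python model only: type names are plain strings, both
function names are hypothetical, and nothing here reflects Impala's actual
frontend classes. It covers only the BINARY/STRING pair and ignores implicit
casts between other types.

```python
def is_explicit_cast_allowed(src: str, dst: str) -> bool:
    """Cast rule for casts involving BINARY, per the proposal."""
    if "BINARY" not in (src, dst):
        raise ValueError("this sketch only covers casts involving BINARY")
    # BINARY may only be cast from/to STRING (or to itself).
    return {src, dst} <= {"BINARY", "STRING"}


def udf_accepts(param_type: str, arg_type: str) -> bool:
    """Model of 'no implicit BINARY <-> STRING conversion' for UDF arguments.

    Simplification: requires an exact type match, which is stricter than
    a real analyzer would be for other type pairs.
    """
    return param_type == arg_type
```

Under this model a BINARY argument reaches a STRING-only UDF only through an
explicit cast to STRING.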



