Creating parquet with Drill and mode NOT NULL

2019-05-06 Thread benj.dev
Hi, I would like to know whether there is a way to force the mode to "NOT NULL" for a data type when creating a Parquet file via "CREATE TABLE AS". I know how to force the mode to "NULL" using the NULLIF function: - SELECT modeof(3) => NOT NULL - SELECT modeof(NULLIF(3,CAST(NULL AS integer))) =>

Re: How to store/export personal settings of Drill

2019-04-13 Thread benj.dev
Appreciate your detailed answer. I have tried to put the JSON part in the drill-override.conf with absolutely no result, but it may be a particular case. First, I ended up confirming, by reading https://github.com/apache/drill/blob/master/exec/java-exec/src/main/resources/drill-module.conf, that the

How to store/export personal settings of Drill

2019-04-11 Thread benj.dev
Hi, I would like to know if it's possible to configure options at Drill startup. I know that it's possible to run ALTER SESSION/SYSTEM from the command line, and that with SYSTEM the value is retained even after a reboot. I can use the web interface to change the value of the options (ip:8047/options)

regexp_replace function and MultiEncoding problem

2019-03-15 Thread benj.dev
Hi, I have a source (.csv) with multiple encodings (it's [bs]ad, but I can't change that). When I try to apply a regexp_replace on a field (like...regexp_replace(`myfield`,'...','...')...) I get an error - Error: SYSTEM ERROR: MalformedInputException: Input length = 1 For example, I have a case due to
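The error above is the kind a strict decoder raises when it meets bytes that are not valid in the expected charset. As a rough analog outside Drill, the sketch below shows the same failure in Python and one possible pre-cleaning step. It assumes the file mixes UTF-8 and Latin-1, which the message does not state; `decode_mixed` and the sample strings are hypothetical.

```python
def decode_mixed(raw: bytes) -> str:
    """Decode bytes as UTF-8, falling back to Latin-1 (an assumed second encoding)."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a character, so this never raises,
        # but it yields wrong characters if the real encoding is different.
        return raw.decode("latin-1")


# Hypothetical sample: the same word encoded two different ways.
utf8_field = "café".encode("utf-8")
latin1_field = "café".encode("latin-1")

# Strict UTF-8 decoding rejects the Latin-1 bytes.
try:
    latin1_field.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

print(strict_failed)
print(decode_mixed(utf8_field), decode_mixed(latin1_field))
```

Normalizing each line this way before handing the file to Drill is only one option, and it guesses the fallback encoding rather than detecting it.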

Re: Big varchar are ok when extractHeader=false but not when extractHeader=true

2019-01-30 Thread benj.dev
s.`/data/bar.csv`;
> +---------+
> | EXPR$0  |
> +---------+
> | 5       |
> | 72061   |
> +---------+
>
>  -- Boaz
>
> On 1/23/19 12:29 PM, benj.dev wrote:
>> Hi,
>>
>> With a CSV file test.csv
>> col1,col2
>> w,x
>>

Big varchar are ok when extractHeader=false but not when extractHeader=true

2019-01-23 Thread benj.dev
Hi, With a CSV file test.csv col1,col2 w,x ...y...,z where ...y... is a string longer than 65536 characters (let's say 66000, for example). The error occurs with this extract of the storage configuration: "csv": { "type": "text", "extensions": [ "csv" ], "extractHeader": true, "delimiter": "," }, SELECT * FROM tmp.`test.csv` Error:
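To reproduce the input described above, a file of that shape can be generated with a short script. This is a minimal sketch: the helper names, the temporary directory, and the use of the letter "y" as filler are assumptions; only the header, the `w,x` row, and the 66000-character length come from the message.

```python
import csv
import os
import tempfile


def write_test_csv(path, big_len=66000):
    """Write the header, a short row, and a row whose first field has big_len characters."""
    with open(path, "w", newline="", encoding="ascii") as f:
        writer = csv.writer(f)
        writer.writerow(["col1", "col2"])
        writer.writerow(["w", "x"])
        writer.writerow(["y" * big_len, "z"])
    return path


def longest_field(path):
    """Return the length of the longest field in the file, as a sanity check."""
    with open(path, newline="", encoding="ascii") as f:
        return max(len(field) for row in csv.reader(f) for field in row)


# Hypothetical location; point this at the directory backing the Drill workspace.
csv_path = write_test_csv(os.path.join(tempfile.mkdtemp(), "test.csv"))
print(longest_field(csv_path))
```

The generated file can then be queried once with extractHeader set to true and once with it set to false to compare the two behaviours.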

Problem when using files with differents schemas in the same SELECT

2019-01-02 Thread benj.dev
Hi, I have read that in a SELECT from multiple sources (SELECT * FROM tmp.`myfile*`), the files are processed in random order. But I don't understand why the processing of (Parquet) files that do not have the same columns is not consistent. Example (on Drill 1.14): CREATE TABLE tmp2.`mytable1` AS

Re: drill parquet - create table as ... partition by ... non present column

2018-12-07 Thread benj.dev
Hi, Thanks for the details. That's the point: I don't want to write additional metadata, but just organize the Parquet file to have more useful stats. In a simple GROUP BY it's possible to not SELECT some of the "grouped" columns. (Example: SELECT a, b FROM ... GROUP BY a, b, c;) In the same way, I think it