Re: Accessing underlying scheme of input

Boaz Ben-Zvi Thu, 01 Mar 2018 16:05:12 -0800

From the docs (https://drill.apache.org/docs/describe/): “Currently, DESCRIBE 
does not support tables created in a file system.”
Seems that it only works well for Hive and HBase tables.


The create view statement does not explore the actual schema of the query’s 
table(s); it only parses and keeps the query.
Hence it can not tell what ‘*’ (or any list of column names) means:

 0: jdbc:drill:zk=local> create view pview as select * from  
dfs.`/data/PARQUET-1M.parquet`;
+-------+-------------------------------------------------------+
|  ok   |                        summary                        |
+-------+-------------------------------------------------------+
| true  | View 'pview' created successfully in 'dfs.tmp' schema  |
+-------+-------------------------------------------------------+
1 row selected (0.274 seconds)
0: jdbc:drill:zk=local> describe pview;
+--------------+---------------+--------------+
| COLUMN_NAME  |   DATA_TYPE   | IS_NULLABLE  |
+--------------+---------------+--------------+
| **           | DYNAMIC_STAR  | YES          |
+--------------+---------------+--------------+
1 row selected (0.419 seconds)
0: jdbc:drill:zk=local> create view pview2 as select row_count,just_garbage 
from  dfs.`/data/PARQUET-1M.parquet`;
+-------+---------------------------------------------------------+
|  ok   |                         summary                         |
+-------+---------------------------------------------------------+
| true  | View 'pview2' created successfully in 'dfs.tmp' schema  |
+-------+---------------------------------------------------------+
1 row selected (0.3 seconds)
0: jdbc:drill:zk=local> describe pview2;
+---------------+------------+--------------+
|  COLUMN_NAME  | DATA_TYPE  | IS_NULLABLE  |
+---------------+------------+--------------+
| row_count     | ANY        | YES          |
| just_garbage  | ANY        | YES          |
+---------------+------------+--------------+
2 rows selected (0.533 seconds)
0: jdbc:drill:zk=local>

Thanks,

      Boaz

On 3/1/18, 3:05 PM, "Padma Penumarthy" <[email protected]> wrote:

    Not sure why it is not showing the fields. It does not work for me either. 
    Does anyone know more ? Is this broken ? 
    
    Thanks
    Padma 
    
    > On Mar 1, 2018, at 2:54 PM, Erol Akarsu <[email protected]> wrote:
    > 
    > Somehow, after "user dfs.tmp", I was able to create view. But "describe"
    > for view does not give much information. I was expecting "describe" 
command
    > would give type  definitions of fields " employee_id  |    full_name     |
    > first_name  | last_name  | position_id  |   position_title    | store_id  
|
    > depart ". But it does give a very generic field type.
    > 
    > 
    > 0: jdbc:drill:zk=local> create view mydonuts2 as SELECT * FROM
    > cp.`employee.json` LIMIT 3;
    > +-------+------------------------------------------------------------+
    > |  ok   |                          summary                           |
    > +-------+------------------------------------------------------------+
    > | true  | View 'mydonuts2' created successfully in 'dfs.tmp' schema  |
    > +-------+------------------------------------------------------------+
    > 1 row selected (0.283 seconds)
    > 0: jdbc:drill:zk=local> describe mydonuts2;
    > +--------------+------------+--------------+
    > | COLUMN_NAME  | DATA_TYPE  | IS_NULLABLE  |
    > +--------------+------------+--------------+
    > | *            | ANY        | YES          |
    > +--------------+------------+--------------+
    > 1 row selected (0.388 seconds)
    > 0: jdbc:drill:zk=local>  SELECT * FROM cp.`employee.json` LIMIT 3;
    > 
+--------------+------------------+-------------+------------+--------------+---------------------+-----------+--------+
    > | employee_id  |    full_name     | first_name  | last_name  | position_id
    > |   position_title    | store_id  | depart |
    > 
+--------------+------------------+-------------+------------+--------------+---------------------+-----------+--------+
    > | 1            | Sheri Nowmer     | Sheri       | Nowmer     | 1
    > | President           | 0         | 1      |
    > | 2            | Derrick Whelply  | Derrick     | Whelply    | 2
    > | VP Country Manager  | 0         | 1      |
    > | 4            | Michael Spence   | Michael     | Spence     | 2
    > | VP Country Manager  | 0         | 1      |
    > 
+--------------+------------------+-------------+------------+--------------+---------------------+-----------+--------+
    > 3 rows selected (0.579 seconds)
    > 
    > On Thu, Mar 1, 2018 at 3:18 PM, Erol Akarsu <[email protected]> wrote:
    > 
    >> Padma,
    >> 
    >> I have not created any user. I just installed the system and run drill
    >> with  "sqlline.bat -u "jdbc:drill:zk=local"
    >> Therefore, what is shortest procedure to achieve what you have described
    >> in previous email?
    >> 
    >> Thanks
    >> 
    >> Erol Akarsu
    >> 
    >> On Thu, Mar 1, 2018 at 3:00 PM, Padma Penumarthy <[email protected]>
    >> wrote:
    >> 
    >>> Check if you have permissions to root directory or not.
    >>> You may have to specify the complete directory path (for which you have
    >>> permissions for) in the create view command.
    >>> 
    >>> For example:
    >>> 
    >>> 0: jdbc:drill:zk=local> create view 
dfs.root.`/Users/ppenumarthy/parquet/test-view`
    >>> as select * from dfs.root.`/Users/ppenumarthy/parquet/0_0_0.parquet`;
    >>> +-------+---------------------------------------------------
    >>> -------------------------------------+
    >>> |  ok   |                                        summary
    >>>                       |
    >>> +-------+---------------------------------------------------
    >>> -------------------------------------+
    >>> | true  | View '/Users/ppenumarthy/parquet/test-view' created
    >>> successfully in 'dfs.root' schema  |
    >>> +-------+---------------------------------------------------
    >>> -------------------------------------+
    >>> 1 row selected (0.148 seconds)
    >>> 0: jdbc:drill:zk=local>
    >>> 
    >>> 
    >>> Thanks
    >>> Padma
    >>> 
    >>> On Mar 1, 2018, at 11:37 AM, Erol Akarsu <[email protected]<mailto:eaka
    >>> [email protected]>> wrote:
    >>> 
    >>> Padma,
    >>> 
    >>> I have changed dfs storage plugin through web interface as below. But I 
am
    >>> getting same error response.
    >>> 
    >>> {
    >>> "type": "file",
    >>> "enabled": true,
    >>> "connection": "file:///",
    >>> "config": null,
    >>> "workspaces": {
    >>>   "root": {
    >>>     "location": "/",
    >>>     "writable": true,
    >>>     "defaultInputFormat": null,
    >>>     "allowAccessOutsideWorkspace": true
    >>>   },
    >>>   "tmp": {
    >>>     "location": "/tmp",
    >>>     "writable": true,
    >>>     "defaultInputFormat": null,
    >>>     "allowAccessOutsideWorkspace": true
    >>>   }
    >>> },
    >>> 
    >>> On Thu, Mar 1, 2018 at 1:15 PM, Padma Penumarthy <[email protected]
    >>> <mailto:[email protected]>>
    >>> wrote:
    >>> 
    >>> Make "writable": true for the workspace (dfs.root) in the storage plugin
    >>> configuration.
    >>> 
    >>> Thanks
    >>> Padma
    >>> 
    >>> 
    >>> On Mar 1, 2018, at 10:10 AM, Erol Akarsu <[email protected]<mailto:eaka
    >>> [email protected]><mailto:eaka
    >>> [email protected]<mailto:[email protected]>>> wrote:
    >>> 
    >>> Thanks Padma.
    >>> 
    >>> I am getting problem while creating view
    >>> 
    >>> 0: jdbc:drill:zk=local> create view mydonuts as SELECT * FROM
    >>> cp.`employee.json` LIMIT 3;
    >>> Error: VALIDATION ERROR: Root schema is immutable. Creating or dropping
    >>> tables/views is not allowed in root schema.Select a schema using 'USE
    >>> schema' command.
    >>> 
    >>> 
    >>> [Error Id: 68a31047-5a4e-4768-8722-55648d9a80f6 on 
DESKTOP-8OANV3A:31010]
    >>> (state=,code=0)
    >>> 0: jdbc:drill:zk=local>
    >>> 
    >>> On Thu, Mar 1, 2018 at 12:49 PM, Padma Penumarthy <[email protected]
    >>> <mailto:[email protected]>
    >>> <mailto:[email protected]>>
    >>> wrote:
    >>> 
    >>> Try creating a view and use describe.
    >>> 
    >>> https://urldefense.proofpoint.com/v2/url?u=https-3A__drill.
    >>> apache.org_docs_describe_&d=DwIBaQ&c=cskdkSMqhcnjZxdQVpwTXg&r=
    >>> XVqW14B9eGK9QR_fKKCb5H5LxKnqNMmq1U7RdNlhq1c&m=blTmu-WQJa5RUrxqG46o20B-a-
    >>> UP0yCHKGqcBo1ETlI&s=SxKqC4w_I5bbnNAtIvzO_9qptLP-KsHi0iIbjtzyrc8&e=
    >>> 
    >>> Thanks
    >>> Padma
    >>> 
    >>> 
    >>> On Mar 1, 2018, at 9:22 AM, Erol Akarsu <[email protected]<mailto:eaka
    >>> [email protected]><mailto:eaka
    >>> [email protected]<mailto:[email protected]>><mailto:eaka
    >>> [email protected]<mailto:[email protected]><mailto:[email protected]>>> wrote:
    >>> 
    >>> When Use limit 0 query,  I am getting only field names. I am looking for
    >>> json schema for input that will describe input type
    >>> 
    >>> 0: jdbc:drill:> select * from `clicks/clicks.json` limit 0;
    >>> 
    >>> +-----------+-------+-------+------------+-------------+
    >>> | trans_id  | date  | time  | user_info  | trans_info  |
    >>> +-----------+-------+-------+------------+-------------+
    >>> +-----------+-------+-------+------------+-------------+
    >>> No
    >>> 
    >>> On Thu, Mar 1, 2018 at 11:24 AM, Erol Akarsu <[email protected]<mailto:
    >>> [email protected]><mailto:
    >>> [email protected]<mailto:[email protected]>><mailto:
    >>> [email protected]<mailto:[email protected]><mailto:[email protected]>>>
    >>> wrote:
    >>> 
    >>> I am sorry Sorabh
    >>> Can you give an example? I am still learning Drill
    >>> Thanks
    >>> 
    >>> On Thu, Mar 1, 2018 at 11:11 AM Sorabh Hamirwasia <[email protected]
    >>> <mailto:[email protected]>
    >>> <mailto:[email protected]>
    >>> <mailto:[email protected]>>
    >>> wrote:
    >>> 
    >>> Hi Erol,
    >>> 
    >>> You can run limit 0 query from client to retrieve just the schema for
    >>> your input.
    >>> 
    >>> 
    >>> Thanks,
    >>> Sorabh
    >>> 
    >>> ________________________________
    >>> From: Erol Akarsu <[email protected]<mailto:[email protected]><mailto:
    >>> [email protected]><mailto:
    >>> [email protected]<mailto:[email protected]>>>
    >>> Sent: Thursday, March 1, 2018 5:28:52 AM
    >>> To: [email protected]<mailto:[email protected]><mailto:u
    >>> [email protected]><mailto:u
    >>> [email protected]<mailto:[email protected]>>
    >>> Subject: Accessing underlying scheme of input
    >>> 
    >>> I know Apache drill is creating a json schema for input data file or 
hdfs
    >>> input before user query on it.
    >>> I like to know whether or not Apache drill has API that will help user 
to
    >>> obtain that  derived schema for say an json file or excel file or hive
    >>> input.
    >>> I appreciate your help
    >>> 
    >>> Erol Akarsu
    >>> 
    >>> Sent from Mail for Windows 10
    >>> 
    >>> --
    >>> 
    >>> Erol Akarsu
    >>> 
    >>> 
    >>> 
    >>> 
    >>> --
    >>> 
    >>> Erol Akarsu
    >>> 
    >>> 
    >>> 
    >>> 
    >>> --
    >>> 
    >>> Erol Akarsu
    >>> 
    >>> 
    >>> 
    >>> 
    >>> --
    >>> 
    >>> Erol Akarsu
    >>> 
    >>> 
    >> 
    >> 
    >> --
    >> 
    >> Erol Akarsu
    >> 
    >> 
    > 
    > 
    > -- 
    > 
    > Erol Akarsu

Re: Accessing underlying scheme of input

Reply via email to