Re: setting an administrator

2017-05-05 Thread Sudheesh Katkam
There are system options that define the list of users and list of groups that 
are considered administrators. By default, the user running the drillbit is the 
administrator.

System options can only be changed by administrators. So log in as an 
administrator through sqlline and run “ALTER SYSTEM SET …”, or log in as an 
administrator through the web UI and change system options there.

Details are here: https://issues.apache.org/jira/browse/DRILL-3622. I agree this 
should be better documented, so please open a ticket.

Also, system options are stored in ZooKeeper, which is why manually 
creating/editing that znode worked.

On May 5, 2017, at 7:47 AM, Knapp, Michael 
<michael.kn...@capitalone.com> wrote:

After a lot of source code digging, and some trial and error, I discovered I 
can set admin users from the zookeeper CLI with this command:

create /drill/sys.options/security.admin.users 
'{"kind":"STRING","type":"SYSTEM","name":"security.admin.users","num_val":"0","string_val":"bbt612","bool_val":"true","float_val":"0"}'
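For reference, the znode value above is just Drill's option record serialized as JSON. A quick sketch to build and sanity-check that payload (Python used purely for illustration; the field names are copied from the command above, and the non-string fields appear to be defaults Drill fills in for a STRING option):

```python
import json

def admin_users_option(users):
    """Build the JSON payload Drill stores under
    /drill/sys.options/security.admin.users, as shown in the
    ZooKeeper CLI command above."""
    return json.dumps({
        "kind": "STRING",
        "type": "SYSTEM",
        "name": "security.admin.users",
        "num_val": "0",
        "string_val": users,   # comma-separated list of admin user names
        "bool_val": "true",
        "float_val": "0",
    })

payload = admin_users_option("bbt612")
# Round-trip to confirm the payload is valid JSON with the expected value.
assert json.loads(payload)["string_val"] == "bbt612"
```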

Now why the heck this is not in the documentation beats me.  I think the 
developers wanted me to use sqlline to set this, but they left no documentation 
whatsoever about how to establish a connection between sqlline and my zookeeper 
persistent store.

On 5/4/17, 6:27 PM, "Knapp, Michael" 
<michael.kn...@capitalone.com> wrote:

   Hi,

   I am trying to set drill administrators but it’s just not working.  I have 
setup a custom authenticator that uses a backend database for authentication, 
and that is working.  The only problem is I am a “user” not an administrator, 
leaving me essentially powerless and drill useless.

   First, I think the instructions are not clear; it is not clear to me whether 
I should be executing the SET statement from the web console or somewhere else. 
 I have tried this:

   I updated my drill-override.conf; I have attempted setting 
“drill.exec.security.admin.users” and “security.admin.users”.  I have set them 
to single values and also attempted putting the values in brackets like a list. 
 None of these combinations has worked.

   It was unclear to me how I was supposed to run your SQL statements when I am 
not an administrator in the first place.  Then I guessed I should try it from 
the sqlline, but that also is not working.

   sqlline> ALTER SYSTEM SET `security.admin.users` = "my_id";
   No current connection

   Why is it saying that I have no current connection?  What am I missing here?

   Michael Knapp
   

   The information contained in this e-mail is confidential and/or proprietary 
to Capital One and/or its affiliates and may only be used solely in performance 
of work or services for Capital One. The information transmitted herewith is 
intended only for use by the individual or entity to which it is addressed. If 
the reader of this message is not the intended recipient, you are hereby 
notified that any review, retransmission, dissemination, distribution, copying 
or other use of, or taking of any action in reliance upon this information is 
strictly prohibited. If you have received this communication in error, please 
contact the sender and delete the material from your computer.







Re: multiple users and passwords

2017-05-02 Thread Sudheesh Katkam
To clarify:

Passing *end user* credentials from one service to another *to reach the target 
service* is messy.

On May 2, 2017, at 1:06 PM, Sudheesh Katkam 
<skat...@mapr.com> wrote:

Drill supports impersonation (“outbound”) to HDFS and Hive; this works because 
the client API allows for inbound impersonation.

In your use-case, does the backend database allow Drill to impersonate the end 
user (“Joe”) i.e. does the database support inbound impersonation? If so, there 
still need to be some changes made in Drill to support that; please open a 
ticket in that case.

Passing credentials from one service to another is messy. Instead if service A 
supports inbound impersonation (or proxy users), then service A can verify 
service B’s credentials once, and allow service B to impersonate end users 
(maybe based on some policies, like Drill). This will avoid having to pass 
through the end user’s credentials.

- Sudheesh

On May 2, 2017, at 11:55 AM, Knapp, Michael 
<michael.kn...@capitalone.com> wrote:

Sorry I noticed that documentation/link after I sent the original message.  I 
also found the documentation on “Configuring User Impersonation” and 
“Configuring Inbound Impersonation” to be useful and relevant.

I am not sure that these will be adequate though.  Drill supports inbound 
impersonation, but I think I need the opposite, outbound impersonation.

For example, I can set up Drill to use LDAP, and “Joe” can log in to the machine. 
 He may do a query joining the database with another source.  Drill can use 
impersonation to execute these queries as Joe.  Unfortunately though, Joe’s 
credentials for the backend database may not be the same as his LDAP 
credentials, and they may be different for the other data sources.  Joe could 
configure the storage plugins to use his database username/password, but 
wouldn’t that also make his password visible to all users?

I guess I can summarize this with one question: Can Drill support separate 
storage plugin configurations per user?

On 5/2/17, 2:36 PM, "Kunal Khatua" <kkha...@mapr.com> 
wrote:

  Have you had a look at this link?

  https://drill.apache.org/docs/configuring-user-authentication



  - Kunal

  
   From: Knapp, Michael <michael.kn...@capitalone.com>
  Sent: Tuesday, May 2, 2017 8:33:03 AM
   To: user@drill.apache.org
  Cc: Chagani, Hassan; Swift, John
  Subject: multiple users and passwords

  Drill Developers and Supporters,

   I am hoping to use drill to query a SQL database.  There will be many 
different users accessing the drill web console, and each of them have separate 
credentials for accessing the database.  I have the requirement of supporting 
drill queries to the database using the credentials provided by the current 
user.  I am struggling to find a way to do this in drill because I noticed that:

  · The documentation instructs me to provide the username and password 
in the storage plugin, either in the ‘url’ field or as separate ‘username’ and 
‘password’ fields.

  · As far as I know, Drill does not support user logins or various 
permission models.

  So as I see it, if a person can reach the drill web console, then they can 
also see all of the storage plugin configurations.  That means they can see the 
passwords in clear text.  If I opened this up to multiple users, then each of 
them could see everybody else’s passwords.  I cannot simply create a system 
account to perform queries on behalf of others because we have auditing 
requirements.

  I also noticed that completed queries are logged in the “Profiles” tab on the 
console.  So if somehow I configure things such that credentials are passed in 
a query, they would still be visible to other users by viewing completed 
queries.  So I would also need to prevent that somehow.

  Does anybody know how I can provide drill with each user’s credentials 
without sharing them with every user?

  I don’t see any way to provide credentials in a select statement to my 
database, it looks like it can only be provided while forming a connection.

  I was thinking, maybe I can write a new storage plugin that wraps the RDBMS 
plugin, and consumes credentials by some other method.  I don’t see any 
documentation on how to write your own storage plugin.

  Any ideas or suggestions would be greatly appreciated.

  Michael Knapp
  



[HANGOUT] Topics for 04/18/17

2017-04-17 Thread Sudheesh Katkam
Hi drillers,

Our bi-weekly hangout is tomorrow (04/18/17, 10 AM PT). If you have any 
suggestions for hangout topics, you can add them to this thread. We will also 
ask around at the beginning of the hangout for topics.

Hangout link: 
https://plus.google.com/hangouts/_/event/ci4rdiju8bv04a64efj5fedd0lc

Thank you,
Sudheesh


Re: Strange results with date_trunc 'QUARTER'

2017-04-03 Thread Sudheesh Katkam
Looks like a bug to me. Please open a ticket. A simple repro would be very 
useful.

https://issues.apache.org/jira/browse/DRILL

- Sudheesh

On Apr 3, 2017, at 2:11 PM, Joel Wilsson 
<joel.wils...@gmail.com> wrote:

Hi,

I'm seeing some strange results when trying to group by
date_trunc('QUARTER', ). I can work around it by doing
more or less the same thing as in DateTruncFunctions. Am I missing
something, or is this a bug?

0: jdbc:drill:> SELECT date_trunc('QUARTER',
`taxi_trips`.`dropoff_datetime`), COUNT(*) FROM `hive.default`.`taxi_trips`
GROUP BY date_trunc('QUARTER', `taxi_trips`.`dropoff_datetime`) ORDER BY
date_trunc('QUARTER', `taxi_trips`.`dropoff_datetime`);
+++
| EXPR$0 |   EXPR$1   |
+++
| 2012-01-01 00:00:00.0  | 21817  |
| 2013-01-01 00:00:00.0  | 173157926  |
| 2013-04-01 00:00:00.0  | 3  |
| 2013-07-01 00:00:00.0  | 2  |
| 2013-10-01 00:00:00.0  | 3  |
| 2014-01-01 00:00:00.0  | 8  |
| 2020-01-01 00:00:00.0  | 4  |
+++
7 rows selected (12.734 seconds)

The data is spread out over all months of 2013:

0: jdbc:drill:> SELECT date_trunc('MONTH',
`taxi_trips`.`dropoff_datetime`), COUNT(*) FROM `hive.default`.`taxi_trips`
GROUP BY date_trunc('MONTH', `taxi_trips`.`dropoff_datetime`) ORDER BY
date_trunc('MONTH', `taxi_trips`.`dropoff_datetime`);
++---+
| EXPR$0 |  EXPR$1   |
++---+
| 2012-12-01 00:00:00.0  | 21817 |
| 2013-01-01 00:00:00.0  | 14772657  |
| 2013-02-01 00:00:00.0  | 13990803  |
| 2013-03-01 00:00:00.0  | 15744402  |
| 2013-04-01 00:00:00.0  | 15108210  |
| 2013-05-01 00:00:00.0  | 15313848  |
| 2013-06-01 00:00:00.0  | 14355098  |
| 2013-07-01 00:00:00.0  | 13830436  |
| 2013-08-01 00:00:00.0  | 12613596  |
| 2013-09-01 00:00:00.0  | 14080300  |
| 2013-10-01 00:00:00.0  | 15009363  |
| 2013-11-01 00:00:00.0  | 14388420  |
| 2013-12-01 00:00:00.0  | 13950801  |
| 2014-01-01 00:00:00.0  | 8 |
| 2020-05-01 00:00:00.0  | 4 |
++---+
15 rows selected (12.25 seconds)

This workaround gives the correct results:

0: jdbc:drill:> SELECT date_trunc('YEAR', `taxi_trips`.`dropoff_datetime`)
+ ((extract(month from `taxi_trips`.`dropoff_datetime`)-1)/3) * interval
'3' MONTH, COUNT(*) FROM `hive.default`.`taxi_trips` GROUP BY
date_trunc('YEAR', `taxi_trips`.`dropoff_datetime`) + ((extract(month from
`taxi_trips`.`dropoff_datetime`)-1)/3) * interval '3' MONTH ORDER BY
date_trunc('YEAR', `taxi_trips`.`dropoff_datetime`) + ((extract(month from
`taxi_trips`.`dropoff_datetime`)-1)/3) * interval '3' MONTH;
++---+
| EXPR$0 |  EXPR$1   |
++---+
| 2012-10-01 00:00:00.0  | 21817 |
| 2013-01-01 00:00:00.0  | 44507862  |
| 2013-04-01 00:00:00.0  | 44777156  |
| 2013-07-01 00:00:00.0  | 40524332  |
| 2013-10-01 00:00:00.0  | 43348584  |
| 2014-01-01 00:00:00.0  | 8 |
| 2020-04-01 00:00:00.0  | 4 |
++---+
7 rows selected (13.261 seconds)
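The workaround above reduces to integer arithmetic on the month number; a minimal sketch of the same truncation (Python used here only to illustrate the formula, not Drill's implementation):

```python
import datetime

def quarter_start(ts):
    """Truncate a timestamp to the start of its quarter, mirroring the
    date_trunc('YEAR', ts) + ((extract(month from ts)-1)/3) * interval '3'
    month workaround from the query above."""
    q_month = ((ts.month - 1) // 3) * 3 + 1  # 1, 4, 7, or 10
    return datetime.datetime(ts.year, q_month, 1)

# May 2013 falls in Q2, which starts 2013-04-01.
assert quarter_start(datetime.datetime(2013, 5, 17, 9, 30)) == datetime.datetime(2013, 4, 1)
# December 2012 falls in Q4 of 2012, matching the 2012-10-01 row above.
assert quarter_start(datetime.datetime(2012, 12, 31)) == datetime.datetime(2012, 10, 1)
```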

The data is read from an external Parquet table:

0: jdbc:drill:> describe `hive.default`.`taxi_trips`;
+-++--+
| COLUMN_NAME | DATA_TYPE  | IS_NULLABLE  |
+-++--+
| dropoff_datetime| TIMESTAMP  | YES  |
| dropoff_latitude| DOUBLE | YES  |
| dropoff_longitude   | DOUBLE | YES  |
| hack_license| CHARACTER VARYING  | YES  |
| medallion   | CHARACTER VARYING  | YES  |
| passenger_count | BIGINT | YES  |
| pickup_datetime | TIMESTAMP  | YES  |
| pickup_latitude | DOUBLE | YES  |
| pickup_longitude| DOUBLE | YES  |
| rate_code   | BIGINT | YES  |
| store_and_fwd_flag  | CHARACTER VARYING  | YES  |
| trip_distance   | DOUBLE | YES  |
| trip_time_in_secs   | BIGINT | YES  |
| vendor_id   | CHARACTER VARYING  | YES  |
+-++--+
14 rows selected (0.184 seconds)
0: jdbc:drill:>

Best regards,
 Joel



Re: S3 using IAM roles

2017-04-03 Thread Sudheesh Katkam
Glad you could figure this out! Can you open a ticket with details? 
https://issues.apache.org/jira/browse/DRILL

- Sudheesh

On Apr 3, 2017, at 12:42 PM, Knapp, Michael 
<michael.kn...@capitalone.com> wrote:

All,

In case others are in the same situation as I am, I will tell you how I solved 
this.  After a LOT of digging through the source code, I discovered the 
following facts:
• Drill is using hadoop’s FileSystem to support S3 queries.  So any 
configuration items that work for that will also work if you place them in the 
core-site.xml file here.
• In the Hadoop-aws jar/source code, it uses these classes to get credentials:
o S3AFileSystem
o S3AUtils
o [Default]S3ClientFactory
• If you configure nothing, then naturally credentials will be searched in this 
order:
o BasicAWSCredentialsProvider – looks for access and secret in the core-site 
xml file
o EnvironmentVariableCredentialsProvider – looks for access and secret in 
environment variables.
o SharedInstanceProfileCredentialsProvider – tries to get credentials from the 
instance metadata, THIS IS THE ONE THAT CAN FIND IAM CREDENTIALS!
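The search order above is a classic provider chain; a hedged sketch of the control flow (these are not the actual hadoop-aws classes, just the pattern they implement, with simulated providers standing in for the config-file, environment, and instance-metadata lookups):

```python
def first_credentials(providers):
    """Try each credential provider in order and return the first hit.
    Mirrors the BasicAWSCredentialsProvider -> environment variables ->
    instance-profile lookup order described above."""
    for provider in providers:
        creds = provider()
        if creds is not None:
            return creds
    raise RuntimeError("no credentials found in any provider")

# Simulated chain: config file empty, env vars empty, instance metadata wins,
# which is why leaving the keys out of core-site.xml lets IAM roles work.
chain = [lambda: None, lambda: None, lambda: ("IAM_ACCESS", "IAM_SECRET")]
assert first_credentials(chain) == ("IAM_ACCESS", "IAM_SECRET")
```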

So to solve this problem I had to do these steps:
1. Make sure that core-site.xml DOES NOT set the access and secret key
2. Make sure that your S3 Storage configuration DOES NOT set the access and 
secret key from the Apache Drill web UI, Storage tab
3. In my case, I also needed server-side encryption to be supported; there is a 
property you can add to core-site.xml for that.

Here is what my core-site.xml file eventually looked like:


 
<configuration>
  <property>
    <name>fs.s3a.server-side-encryption-algorithm</name>
    <value>YOUR_VALUE_HERE</value>
  </property>
  <property>
    <name>fs.s3a.connection.maximum</name>
    <value>100</value>
  </property>
</configuration>


When you query from drill, the format should look like this:
SELECT * FROM s3.`s3a://my-bucket/drill/nation.parquet` limit 3;

Also, if somebody needs to troubleshoot this, modify the logback.xml and add 
loggers along these lines:

<logger name="org.apache.hadoop.fs.s3a" additivity="false">
  <level value="debug"/>
  <appender-ref ref="FILE"/>
</logger>

<logger name="com.amazonaws" additivity="false">
  <level value="debug"/>
  <appender-ref ref="FILE"/>
</logger>

Then you can see log entries for these things in drillbit.log

I hope this may help other people who need to use IAM and/or server side 
encryption with drill.

I also hope that somebody will update the Drill documentation to explain how to 
do this; it could have saved me a day of work!

Michael Knapp



On 4/3/17, 1:13 PM, "Knapp, Michael" 
<michael.kn...@capitalone.com> wrote:

   Drill Developers,

   I am using IAM roles on EC2 instances, your documentation here:
   https://drill.apache.org/docs/s3-storage-plugin/

   instructs me to provide an access key and secret key, which I do not have 
since I am using IAM roles.

   I have been reviewing the source code a few hours now and still have not 
found a point in the code where you connect with S3.  I was surprised to find 
that you do not use the AWS SDK.

   Can you please tell me:

   1.   Does Drill support using IAM roles to provide credentials for S3 
access?

   2.   Where in the code does Drill establish a connection with S3?

   Michael Knapp
   








Re: questions about storage plugin

2017-01-31 Thread Sudheesh Katkam
Can you check for relevant info in the drillbit.log file (either 
/var/log/drill/ or ${DRILL_HOME}/log)?

> On Jan 31, 2017, at 1:50 PM, Pravin Joshi  wrote:
> 
> I am trying to create a storage plugin for Snowflake and am getting a “Please
> retry: (unable to create/update storage)” error. The JDBC connection and
> credentials are verified outside of Drill.
> 
> 
> Using following storage configuration. Any advice would be helpful.
> 
> {
>  "type": "jdbc",
>  "driver": "net.snowflake.client.jdbc.SnowflakeDriver",
>  "url": "jdbc:snowflake://",
>  "username": "",
>  "password": "",
>  "enabled": true
> }



Re: [HANGOUT] Topics for 01/24/17

2017-01-24 Thread Sudheesh Katkam
Join us here: 
https://plus.google.com/hangouts/_/event/ci4rdiju8bv04a64efj5fedd0lc

On Jan 23, 2017, at 6:43 PM, Sudheesh Katkam 
<skat...@mapr.com> wrote:

I meant 01/24/17, 10 AM PT.

On Jan 23, 2017, at 12:43 PM, Sudheesh Katkam 
<skat...@mapr.com> wrote:

Hi drillers,

Our bi-weekly hangout is tomorrow (01/23/17, 10 AM PT). If you have any 
suggestions for hangout topics, you can add them to this thread. We will also 
ask around at the beginning of the hangout for topics.

Thank you,
Sudheesh




Re: [HANGOUT] Topics for 01/24/17

2017-01-23 Thread Sudheesh Katkam
I meant 01/24/17, 10 AM PT.

> On Jan 23, 2017, at 12:43 PM, Sudheesh Katkam  wrote:
> 
> Hi drillers,
> 
> Our bi-weekly hangout is tomorrow (01/23/17, 10 AM PT). If you have any 
> suggestions for hangout topics, you can add them to this thread. We will also 
> ask around at the beginning of the hangout for topics.
> 
> Thank you,
> Sudheesh



[HANGOUT] Topics for 01/23/17

2017-01-23 Thread Sudheesh Katkam
Hi drillers,

Our bi-weekly hangout is tomorrow (01/23/17, 10 AM PT). If you have any 
suggestions for hangout topics, you can add them to this thread. We will also 
ask around at the beginning of the hangout for topics.

Thank you,
Sudheesh

Re: Impersonation with Drill Web Console or REST API

2016-12-21 Thread Sudheesh Katkam
Maybe the doc should say that Drill supports impersonation through the web 
console. These clients use the Java client library, just like JDBC.

Note that *inbound* impersonation is not supported yet because Drill does not 
expose an “impersonation_target” field through the web login form.

Thank you,
Sudheesh

> On Dec 21, 2016, at 10:08 AM, Akihiko Kusanagi  wrote:
> 
> Hi,
> 
> The 'Impersonation Support' table in the following page says that
> impersonation is not supported with the Drill Web Console or REST API:
> http://drill.apache.org/docs/configuring-user-impersonation/
> 
> However, when authentication and impersonation are enabled, impersonation is
> in effect through Web UI.
> 
> $ cat drill-override.conf
> ...
> drill.exec: {
> ...
> impersonation: {
>   enabled: true
> },
> ...
> 
> Only mapr user has read permission for nation.parquet, and Drillbit is
> running as mapr user.
> 
> $ hadoop fs -ls /sample-data
> ...
> drwx--   - mapr mapr   1210 2016-01-11 19:58 nation.parquet
> ...
> 
> Then, login as the other user via Drill Web UI, and run this query:
> 
> select * from dfs.`/sample-data/nation.parquet`
> 
> This returns the following error, so it seems that impersonation is in
> effect.
> 
> Query Failed: An Error Occurred
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR:
> IOException: 2049.177.8452826 /sample-data/nation.parquet (Input/output
> error) Fragment 0:0 [Error Id: 91684467-8a4f-4fb8-8ad7-6ee04b7f8f53 on
> node3:31010]
> 
> When drill.exec.impersonation.enabled = false, the query above returns
> multiple rows.
> 
> Is this expected behavior? Does the document need to be updated?
> 
> Thanks,
> Aki



Re: [Drill 1.9.0] : [CONNECTION ERROR] :- (user client) closed unexpectedly. Drillbit down?

2016-12-21 Thread Sudheesh Katkam
Two more questions..

(1) How many nodes in your cluster?
(2) How many queries are running when the failure is seen?

If you have multiple large queries running at the same time, the load on the 
system could cause those failures (which are heartbeat related).

The two options I suggested decrease the parallelism of stages in a query; this 
implies less load but slower execution.

System-level options affect all queries; session-level options affect queries 
on a specific connection. Not sure what is preferred in your environment.

Also, you may be interested in metrics. More info here:

http://drill.apache.org/docs/monitoring-metrics/

Thank you,
Sudheesh

> On Dec 21, 2016, at 4:31 AM, Anup Tiwari  wrote:
> 
> @sudheesh, yes, the drillbit is running on datanodeN/10.*.*.5:31010.
> 
> Can you tell me how this will impact the query, and do I have to set this at
> the session level or the system level?
> 
> 
> 
> Regards,
> *Anup Tiwari*
> 
> On Tue, Dec 20, 2016 at 11:59 PM, Chun Chang  wrote:
> 
>> I am pretty sure this is the same as DRILL-4708.
>> 
>> On Tue, Dec 20, 2016 at 10:27 AM, Sudheesh Katkam 
>> wrote:
>> 
>>> Is the drillbit service (running on datanodeN/10.*.*.5:31010) actually
>>> down when the error is seen?
>>> 
>>> If not, try lowering parallelism using these two session options, before
>>> running the queries:
>>> 
>>> planner.width.max_per_node (decrease this)
>>> planner.slice_target (increase this)
>>> 
>>> Thank you,
>>> Sudheesh
>>> 
>>>> On Dec 20, 2016, at 12:28 AM, Anup Tiwari 
>>> wrote:
>>>> 
>>>> Hi Team,
>>>> 
>>>> We are running some drill automation scripts on a daily basis, and we
>>>> often see that some queries fail with the error below. I also came across
>>>> DRILL-4708 <https://issues.apache.org/jira/browse/DRILL-4708>, which
>>>> seems similar. Can anyone give me an update on that, or a workaround to
>>>> avoid this issue?
>>>> 
>>>> *Stack Trace :-*
>>>> 
>>>> Error: CONNECTION ERROR: Connection /10.*.*.1:41613 <-->
>>>> datanodeN/10.*.*.5:31010 (user client) closed unexpectedly. Drillbit
>>> down?
>>>> 
>>>> 
>>>> [Error Id: 5089f2f1-0dfd-40f8-9fa0-8276c08be53f ] (state=,code=0)
>>>> java.sql.SQLException: CONNECTION ERROR: Connection /10.*.*.1:41613
>> <-->
>>>> datanodeN/10.*.*.5:31010 (user client) closed unexpectedly. Drillb
>>>> it down?
>>>> 
>>>> 
>>>> [Error Id: 5089f2f1-0dfd-40f8-9fa0-8276c08be53f ]
>>>>   at
>>>> org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(
>>> DrillCursor.java:232)
>>>>   at
>>>> org.apache.drill.jdbc.impl.DrillCursor.loadInitialSchema(
>>> DrillCursor.java:275)
>>>>   at
>>>> org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(
>>> DrillResultSetImpl.java:1943)
>>>>   at
>>>> org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(
>>> DrillResultSetImpl.java:76)
>>>>   at
>>>> org.apache.calcite.avatica.AvaticaConnection$1.execute(
>>> AvaticaConnection.java:473)
>>>>   at
>>>> org.apache.drill.jdbc.impl.DrillMetaImpl.prepareAndExecute(
>>> DrillMetaImpl.java:465)
>>>>   at
>>>> org.apache.calcite.avatica.AvaticaConnection.
>> prepareAndExecuteInternal(
>>> AvaticaConnection.java:477)
>>>>   at
>>>> org.apache.drill.jdbc.impl.DrillConnectionImpl.
>>> prepareAndExecuteInternal(DrillConnectionImpl.java:169)
>>>>   at
>>>> org.apache.calcite.avatica.AvaticaStatement.executeInternal(
>>> AvaticaStatement.java:109)
>>>>   at
>>>> org.apache.calcite.avatica.AvaticaStatement.execute(
>>> AvaticaStatement.java:121)
>>>>   at
>>>> org.apache.drill.jdbc.impl.DrillStatementImpl.execute(
>>> DrillStatementImpl.java:101)
>>>>   at sqlline.Commands.execute(Commands.java:841)
>>>>   at sqlline.Commands.sql(Commands.java:751)
>>>>   at sqlline.SqlLine.dispatch(SqlLine.java:746)
>>>>   at sqlline.SqlLine.runCommands(SqlLine.java:1651)
>>>>   at sqlline.Commands.run(Commands.java:1304)
>>>>   at sun.reflect.NativeMethodA

Re: [Drill 1.9.0] : [CONNECTION ERROR] :- (user client) closed unexpectedly. Drillbit down?

2016-12-20 Thread Sudheesh Katkam
Is the drillbit service (running on datanodeN/10.*.*.5:31010) actually down 
when the error is seen?

If not, try lowering parallelism using these two session options, before 
running the queries:

planner.width.max_per_node (decrease this)
planner.slice_target (increase this)
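Roughly speaking, the planner aims for about one minor fragment per slice_target rows, capped by width.max_per_node across the drillbits. A back-of-the-envelope sketch of how the two options interact (an approximation for intuition only, not the actual planner logic, which has more inputs):

```python
import math

def approx_width(row_estimate, slice_target, max_per_node, num_nodes):
    """Approximate fan-out of a query stage: enough minor fragments to keep
    each at ~slice_target rows, capped by the per-node width times the
    number of drillbits."""
    wanted = max(1, math.ceil(row_estimate / slice_target))
    return min(wanted, max_per_node * num_nodes)

# With 10M rows and slice_target=100000, the planner wants 100 slices but is
# capped by the cluster width (8 per node * 4 nodes = 32).
assert approx_width(10_000_000, 100_000, 8, 4) == 32
# Raising slice_target (or lowering max_per_node) reduces the fan-out,
# which is the load-reducing effect described above.
assert approx_width(10_000_000, 1_000_000, 8, 4) == 10
```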

Thank you,
Sudheesh

> On Dec 20, 2016, at 12:28 AM, Anup Tiwari  wrote:
> 
> Hi Team,
> 
> We are running some drill automation scripts on a daily basis, and we often
> see that some queries fail with the error below. I also came across
> DRILL-4708 <https://issues.apache.org/jira/browse/DRILL-4708>, which seems
> similar. Can anyone give me an update on that, or a workaround to avoid
> this issue?
> 
> *Stack Trace :-*
> 
> Error: CONNECTION ERROR: Connection /10.*.*.1:41613 <-->
> datanodeN/10.*.*.5:31010 (user client) closed unexpectedly. Drillbit down?
> 
> 
> [Error Id: 5089f2f1-0dfd-40f8-9fa0-8276c08be53f ] (state=,code=0)
> java.sql.SQLException: CONNECTION ERROR: Connection /10.*.*.1:41613 <-->
> datanodeN/10.*.*.5:31010 (user client) closed unexpectedly. Drillb
> it down?
> 
> 
> [Error Id: 5089f2f1-0dfd-40f8-9fa0-8276c08be53f ]
>at
> org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:232)
>at
> org.apache.drill.jdbc.impl.DrillCursor.loadInitialSchema(DrillCursor.java:275)
>at
> org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:1943)
>at
> org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:76)
>at
> org.apache.calcite.avatica.AvaticaConnection$1.execute(AvaticaConnection.java:473)
>at
> org.apache.drill.jdbc.impl.DrillMetaImpl.prepareAndExecute(DrillMetaImpl.java:465)
>at
> org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:477)
>at
> org.apache.drill.jdbc.impl.DrillConnectionImpl.prepareAndExecuteInternal(DrillConnectionImpl.java:169)
>at
> org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:109)
>at
> org.apache.calcite.avatica.AvaticaStatement.execute(AvaticaStatement.java:121)
>at
> org.apache.drill.jdbc.impl.DrillStatementImpl.execute(DrillStatementImpl.java:101)
>at sqlline.Commands.execute(Commands.java:841)
>at sqlline.Commands.sql(Commands.java:751)
>at sqlline.SqlLine.dispatch(SqlLine.java:746)
>at sqlline.SqlLine.runCommands(SqlLine.java:1651)
>at sqlline.Commands.run(Commands.java:1304)
>at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>at java.lang.reflect.Method.invoke(Method.java:498)
>at
> sqlline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:36)
>at sqlline.SqlLine.dispatch(SqlLine.java:742)
>at sqlline.SqlLine.initArgs(SqlLine.java:553)
>at sqlline.SqlLine.begin(SqlLine.java:596)
>at sqlline.SqlLine.start(SqlLine.java:375)
>at sqlline.SqlLine.main(SqlLine.java:268)
> Caused by: org.apache.drill.common.exceptions.UserException: CONNECTION
> ERROR: Connection /10.*.*.1:41613 <--> datanodeN/10.*.*.5:31010 (user
> client) closed unexpectedly. Drillbit down?
> 
> 
> [Error Id: 5089f2f1-0dfd-40f8-9fa0-8276c08be53f ]
>at
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:543)
>at
> org.apache.drill.exec.rpc.user.QueryResultHandler$ChannelClosedHandler$1.operationComplete(QueryResultHandler.java:373)
>at
> io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
>at
> io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603)
>at
> io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563)
>at
> io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:406)
>at
> io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:82)
>at
> io.netty.channel.AbstractChannel$CloseFuture.setClosed(AbstractChannel.java:943)
>at
> io.netty.channel.AbstractChannel$AbstractUnsafe.doClose0(AbstractChannel.java:592)
>at
> io.netty.channel.AbstractChannel$AbstractUnsafe.close(AbstractChannel.java:584)
>at
> io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.closeOnRead(AbstractNioByteChannel.java:71)
>at
> io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.handleReadException(AbstractNioByteChannel.java:89)
>at
> io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:162)
>at
> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
>at
> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>at
> io.netty.chann

[ANNOUNCE] Apache Drill 1.9.0 Released

2016-11-29 Thread Sudheesh Katkam
On behalf of the Apache Drill community, I am happy to announce the release
of Apache Drill 1.9.0.

For information about Apache Drill, and to get involved, visit the project
website:

https://drill.apache.org/

This release introduces new features and enhancements, including
asynchronous Parquet reader, Parquet filter pushdown, dynamic UDF support
and HTTPD format plugin. In all, 70 issues have been resolved.

The binary and source artifacts are available here:

https://drill.apache.org/download/

Review the release notes for a complete list of fixes and enhancements:

https://drill.apache.org/docs/apache-drill-1-9-0-release-notes/

Thanks to everyone in the community who contributed to this release!

- Sudheesh


Re: Apache Drill Hangout Minutes - 11/1/16

2016-11-02 Thread Sudheesh Katkam
I am going to update the pull request so that both will be "ok".

This implies that username/password credentials will be sent to the server
twice, during handshake and during SASL exchange. And sending credentials
through handshake will be deprecated (and removed in a future release).

Thank you,
Sudheesh

On Wed, Nov 2, 2016 at 2:58 PM, Jacques Nadeau  wrote:

> Since I'm not that close to DRILL-4280, I wanted to clarify expectation:
>
>
> <1.9 Client  <==>  1.9 Server (ok)
>  1.9 Client  <==> <1.9 Server (fails)
>
> Is that correct?
>
>
>
>
>
>
> --
> Jacques Nadeau
> CTO and Co-Founder, Dremio
>
> On Tue, Nov 1, 2016 at 8:44 PM, Sudheesh Katkam 
> wrote:
>
> > Hi Laurent,
> >
> > That's right; this was mentioned in the design document.
> >
> > I am piggybacking on previous changes that break the "newer clients
> talking
> > to older servers" compatibility. For example, as I understand, some
> > resolved sub-tasks of DRILL-4714 [1] *implicitly* break this
> compatibility;
> > say the "newer" API that was introduced is used by an application which
> is
> > talking to an older server. The older server drops the connection, unable
> > to handle the message.
> >
> > In DRILL-4280, there is an *explicit* break in that specific
> compatibility,
> > and the error message is much cleaner with a version mismatch message.
> The
> > difference is that the C++ client (unlike the Java client) checks for the
> > server version as well, which makes the compatibility break more visible.
> >
> > I am not sure about the plan of action in general about this
> compatibility.
> > However, I could work around the issue by advertising clients' SASL
> > capability to the server. What do you think?
> >
> > Thank you,
> > Sudheesh
> >
> > [1] https://issues.apache.org/jira/browse/DRILL-4714
> >
> > On Nov 1, 2016, at 7:49 PM, Laurent Goujon  wrote:
> >
> > Just for clarity, DRILL-4280 is a breaking-protocol change, so is the
> plan
> > to defer this change to a later release, or to defer bringing back
> > compatibility between newer clients and older servers to a later release?
> >
> > Laurent
> >
> > On Tue, Nov 1, 2016 at 3:43 PM, Zelaine Fong  wrote:
> >
> > Oops, mistake in my notes.  For the second item, I meant DRILL-4280, not
> > DRILL-1950.
> >
> > On Tue, Nov 1, 2016 at 3:40 PM, Zelaine Fong  wrote:
> >
> > Attendees: Paul, Padma, Sorabh, Boaz, Sudheesh, Vitalii, Roman, Dave O,
> > Arina, Laurent, Kunal, Zelaine
> >
> > I had to leave the hangout at 10:30, so my notes only cover the
> >
> > discussion
> >
> > up till then.
> >
> > 1) Variable width decimal support - Dave O
> >
> > Currently Drill only supports fixed width byte array storage of decimals.
> > Dave has submitted a pull request for DRILL-4834 to add support for
> >
> > storing
> >
> > decimals with variable width byte arrays.  Eventually, variable width can
> > replace fixed width, but the pull request doesn't cover that.  Dave would
> > like someone in the community to review his pull request.
> >
> > 2) 1.9 release - Sudheesh
> >
> > Sudheesh is collecting pull requests for the release.  Some have been
> > reviewed and are waiting to be merged.  Sudheesh plans to commit a batch
> > this Wed and another this Friday.  He's targeting having a release
> > candidate build available next Monday.
> >
> > Laurent asked about Sudheesh's pull request for DRILL-1950.  He asked
> > whether thought had been given to supporting newer Drill clients with
> >
> > older
> >
> > Drill servers.  Sudheesh indicated that doing this would entail a
> >
> > breaking
> >
> > change in the protocol, and the plan was to defer doing this for a later
> > release where we may want to make other breaking changes like this.
> >
>


Re: Apache Drill Hangout Minutes - 11/1/16

2016-11-01 Thread Sudheesh Katkam
Hi Laurent,

That's right; this was mentioned in the design document.

I am piggybacking on previous changes that break the "newer clients talking
to older servers" compatibility. For example, as I understand, some
resolved sub-tasks of DRILL-4714 [1] *implicitly* break this compatibility;
say the "newer" API that was introduced is used by an application which is
talking to an older server. The older server drops the connection, unable
to handle the message.

In DRILL-4280, there is an *explicit* break in that specific compatibility,
and the error message is much cleaner with a version mismatch message. The
difference is that the C++ client (unlike the Java client) checks for the
server version as well, which makes the compatibility break more visible.

I am not sure about the plan of action in general about this compatibility.
However, I could work around the issue by advertising clients' SASL
capability to the server. What do you think?

Thank you,
Sudheesh

[1] https://issues.apache.org/jira/browse/DRILL-4714

On Nov 1, 2016, at 7:49 PM, Laurent Goujon  wrote:

Just for clarity, DRILL-4280 is a breaking-protocol change, so is the plan
to defer this change to a later release, or to defer bringing back
compatibility between newer clients and older servers to a later release?

Laurent

On Tue, Nov 1, 2016 at 3:43 PM, Zelaine Fong  wrote:

Oops, mistake in my notes.  For the second item, I meant DRILL-4280, not
DRILL-1950.

On Tue, Nov 1, 2016 at 3:40 PM, Zelaine Fong  wrote:

Attendees: Paul, Padma, Sorabh, Boaz, Sudheesh, Vitalii, Roman, Dave O,
Arina, Laurent, Kunal, Zelaine

I had to leave the hangout at 10:30, so my notes only cover the

discussion

up till then.

1) Variable width decimal support - Dave O

Currently Drill only supports fixed width byte array storage of decimals.
Dave has submitted a pull request for DRILL-4834 to add support for

storing

decimals with variable width byte arrays.  Eventually, variable width can
replace fixed width, but the pull request doesn't cover that.  Dave would
like someone in the community to review his pull request.

2) 1.9 release - Sudheesh

Sudheesh is collecting pull requests for the release.  Some have been
reviewed and are waiting to be merged.  Sudheesh plans to commit a batch
this Wed and another this Friday.  He's targeting having a release
candidate build available next Monday.

Laurent asked about Sudheesh's pull request for DRILL-1950.  He asked
whether thought had been given to supporting newer Drill clients with

older

Drill servers.  Sudheesh indicated that doing this would entail a

breaking

change in the protocol, and the plan was to defer doing this for a later
release where we may want to make other breaking changes like this.


Hangout starting now..

2016-10-04 Thread Sudheesh Katkam
Link: https://hangouts.google.com/hangouts/_/maprtech.com/drillbi-weeklyhangout 


Thank you,
Sudheesh

Re: [HANGOUT] Topics for 10/04/16

2016-10-04 Thread Sudheesh Katkam
Join us at this link:

https://hangouts.google.com/hangouts/_/maprtech.com/drillbi-weeklyhangout 

> On Oct 3, 2016, at 9:26 PM, Shadi Khalifa  wrote:
> 
> Hi,
> I have been working on integrating WEKA into Drill to support building and 
> scoring classification models. I have been successful in supporting all WEKA 
> classifiers and making them run in a distributed fashion over Drill 1.2. The 
> classifier accuracy is not affected by running in a distributed fashion and 
> the training and scoring times are getting a huge boost using Drill. A paper 
> on this has been published in the IEEE symposium on Big Data in June 2016 
> [available: 
> http://cs.queensu.ca/~khalifa/qdrill/QDrill_20160212IEEE_CameraReady.pdf] and 
> we are now in the process of publishing another paper in which QDrill 
> supports all WEKA algorithms. FYI, this can be easily extended to support 
> clustering and other types of WEKA algorithms. The architecture also allows 
> supporting other data mining libraries.
> The QDrill project website is  http://cs.queensu.ca/~khalifa/qdrill, the 
> project's downloadable version is a little bit old, but I'm planning to 
> upload a more updated stable version within the next 10 days. I'm also using 
> an SVN repository and planning to move the project to GitHub to make it 
> easier to get the latest Drill versions and to maybe integrate with Drill at 
> some point. 
> Unfortunately, I have another meeting tomorrow at the same time of the 
> hangout, but I would love to know your opinion and to discuss the process of 
> evaluating this extension and maybe integrating it with Drill at some point. 
> Regards
> Shadi Khalifa, PhD Candidate, School of Computing, Queen's University, Canada
> I'm just a neuron in the society collective brain
> 
> 
> 
> 
>On Monday, October 3, 2016 10:52 PM, Laurent Goujon  
> wrote:
> 
> 
> Hi,
> 
> I'm currently working on improving metadata support for both the JDBC
> driver and the C++ connector, more specifically the following JIRAs:
> 
> DRILL-4853: Update C++ protobuf source files
> DRILL-4420: Server-side metadata and prepared-statement support for C++
> connector
> DRILL-4880: Support JDBC driver registration using ServiceLoader
> DRILL-4925: Add tableType filter to GetTables metadata query
> DRILL-4730: Update JDBC DatabaseMetaData implementation to use new Metadata
> APIs
> 
> I  already opened multiple pull requests for those (the list is available
> at https://github.com/apache/drill/pulls/laurentgo)
> 
> I'm planning to join tomorrow's hangout in case people have questions about
> those.
> 
> Cheers,
> 
> Laurent
> 
> On Mon, Oct 3, 2016 at 10:28 AM, Subbu Srinivasan 
> wrote:
> 
>> Can we close on https://github.com/apache/drill/pull/518 ?
>> 
>> On Mon, Oct 3, 2016 at 10:27 AM, Sudheesh Katkam 
>> wrote:
>> 
>>> Hi drillers,
>>> 
>>> Our bi-weekly hangout is tomorrow (10/04/16, 10 AM PT). If you have any
>>> suggestions for hangout topics, you can add them to this thread. We will
>>> also ask around at the beginning of the hangout for topics.
>>> 
>>> Thank you,
>>> Sudheesh
>>> 
>> 
> 
> 



[HANGOUT] Topics for 10/04/16

2016-10-03 Thread Sudheesh Katkam
Hi drillers,

Our bi-weekly hangout is tomorrow (10/04/16, 10 AM PT). If you have any
suggestions for hangout topics, you can add them to this thread. We will
also ask around at the beginning of the hangout for topics.

Thank you,
Sudheesh


Re: Right outer join fails

2016-09-26 Thread Sudheesh Katkam
Hi Kathir,

I tried simple filter conditions with aliases.

This query did not return any result:
select city[0] as cityalias from dfs.tmp.`data.json` where cityalias = 1;

But, this query works fine:
select city[0] as cityalias from dfs.tmp.`data.json` where city[0] = 1;

So I suppose aliases are not supported in join or filter conditions. There is an 
enhancement request for aliases in group by conditions [1]; please open an 
enhancement ticket for this issue and link it to [1].
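As a possible workaround (an untested sketch, using the data.json file from this thread), the alias can be computed in a subquery so the outer filter references a real column rather than an alias:

```sql
-- Untested sketch: alias city[0] inside a subquery, then filter on the
-- resulting column in the outer query.
SELECT t.cityalias
FROM (SELECT city[0] AS cityalias FROM dfs.tmp.`data.json`) t
WHERE t.cityalias = 1;
```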

Thank you,
Sudheesh

[1] https://issues.apache.org/jira/browse/DRILL-1248

> On Sep 21, 2016, at 2:24 PM, Kathiresan S  
> wrote:
> 
> Hi Sudheesh,
> 
> There is another related issue around this.
> 
> For the same data I've used for DRILL-4890
> <https://issues.apache.org/jira/browse/DRILL-4890>, below query doesn't
> return any result (which is supposed to return one row)
> 
> select city[0] as cityalias from dfs.tmp.`data.json` a join (select id as
> idalias from dfs.tmp.`cities.json`) b on a.*cityalias  *= b.idalias
> 
> However, the query below works fine
> 
> select city[0] as cityalias from dfs.tmp.`data.json` a join (select id as
> idalias from dfs.tmp.`cities.json`) b on a.*city[0] * = b.idalias
> 
> Using an alias for city[0] in the join condition makes it return no result.
> 
> Any idea, is this a known issue (is there any JIRA issue already tracked
> for this) or should a separate JIRA issue be filed for this?
> 
> *Files used for testing:*
> 
> *Json file 1: data.json*
> 
> { "name": "Jim","city" : [1,2]}
> 
> *Json file 2: cities.json*
> 
> {id:1,name:"Sendurai"}
> {id:2,name:"NYC"}
> 
> Thanks,
> Kathir
> 
> On Wed, Sep 14, 2016 at 8:23 AM, Kathiresan S 
> wrote:
> 
>> ​Hi Sudheesh,
>> 
>> I've filed a JIRA for this
>> 
>> https://issues.apache.org/jira/browse/DRILL-4890​
>> 
>> Thanks,
>> Kathir
>> 
>> On Wed, Sep 14, 2016 at 8:09 AM, Kathiresan S <
>> kathiresanselva...@gmail.com> wrote:
>> 
>>> Hi Sudheesh,
>>> 
>>> Thanks for checking this out.
>>> I do get the same error you get when I run Drillbit in my Eclipse
>>> and run the same query from WebUI pointing to my local instance, and on top
>>> of this error, i do get the "QueryDataBatch was released twice" error as
>>> well.
>>> 
>>> But, in drillbit.log of one of the nodes on the cluster, where this
>>> failed, i don't see the IndexOutOfBoundsException. Somehow, the 
>>> IndexOutOfBoundsException
>>> log is getting suppressed and only the QueryDataBatch error is logged. But
>>> that's a separate issue.
>>> 
>>> I did run it from the WebUI and it's in RUNNING state forever (actually I
>>> started one yesterday and left the tab; it's still in RUNNING state)
>>> 
>>> Sure, I'll file a JIRA and will provide the details here.
>>> 
>>> Thanks again!
>>> 
>>> Regards,
>>> Kathir
>>> 
>>> 
>>> On Tue, Sep 13, 2016 at 8:17 PM, Sudheesh Katkam 
>>> wrote:
>>> 
>>>> Hi Kathir,
>>>> 
>>>> I tried the same query in embedded mode, and I got a different error.
>>>> 
>>>> java.lang.IndexOutOfBoundsException: index: 0, length: 8 (expected:
>>>> range(0, 0))
>>>>at io.netty.buffer.DrillBuf.checkIndexD(DrillBuf.java:123)
>>>>at io.netty.buffer.DrillBuf.chk(DrillBuf.java:147)
>>>>at io.netty.buffer.DrillBuf.getLong(DrillBuf.java:493)
>>>>at org.apache.drill.exec.vector.BigIntVector$Accessor.get(BigIn
>>>> tVector.java:353)
>>>>at org.apache.drill.exec.vector.BigIntVector$Accessor.getObject
>>>> (BigIntVector.java:359)
>>>>at org.apache.drill.exec.vector.RepeatedBigIntVector$Accessor.g
>>>> etObject(RepeatedBigIntVector.java:297)
>>>>at org.apache.drill.exec.vector.RepeatedBigIntVector$Accessor.g
>>>> etObject(RepeatedBigIntVector.java:288)
>>>>at org.apache.drill.exec.vector.accessor.GenericAccessor.getObj
>>>> ect(GenericAccessor.java:44)
>>>>at org.apache.drill.exec.vector.accessor.BoundCheckingAccessor.
>>>> getObject(BoundCheckingAccessor.java:148)
>>>>at org.apache.drill.jdbc.impl.TypeConvertingSqlAccessor.getObje
>>>> ct(TypeConvertingSqlAccessor.java:795)
>>>>at org.apache.drill.jdbc.impl.AvaticaDrillSqlAccessor.getObject
>>>> (AvaticaDrillSqlAcces

Re: drill rest api converts all data types to string

2016-09-16 Thread Sudheesh Katkam
This is different.

> Nested data however is returned with the original data types intact.

.. which makes sense, looking at the code.

Thank you,
Sudheesh

> On Sep 16, 2016, at 11:18 AM, Jacques Nadeau  wrote:
> 
> FYI, it is already filed: https://issues.apache.org/jira/browse/DRILL-2373
> 
> --
> Jacques Nadeau
> CTO and Co-Founder, Dremio
> 
> On Fri, Sep 16, 2016 at 8:56 AM, Sudheesh Katkam 
> wrote:
> 
>> Hi Niek,
>> 
>> That is a bug; thank you for digging out the exact location. Please open a
>> ticket <https://issues.apache.org/jira/browse/DRILL>. Let’s start
>> tracking the details of the fix in that ticket.
>> 
>> Thank you,
>> Sudheesh
>> 
>>> On Sep 15, 2016, at 2:49 AM, Niek Bartholomeus 
>> wrote:
>>> 
>>> Hi,
>>> 
>>> I'm using the drill rest api to query my parquet files that were
>> generated by spark.
>>> 
>>> I noticed that numeric and boolean data types are all converted to
>> string in the results. Nested data however is returned with the original
>> data types intact.
>>> 
>>> Probably this is happening here: https://github.com/apache/drill/blob/
>> 2d9f9abb4c47d08f8462599c8d6076a61a1708fe/exec/java-exec/src/
>> main/java/org/apache/drill/exec/server/rest/QueryWrapper.java#L158
>>> 
>>> Is there any way how to fix this?
>>> 
>>> I'm using the latest version of drill.
>>> 
>>> Thanks in advance,
>>> 
>>> Niek.
>> 
>> 



Re: Drill 1.8.0 Error: RESOURCE ERROR: Failed to create schema tree.

2016-09-16 Thread Sudheesh Katkam
This is how to get a verbose error:

First set the option:
> SET `exec.errors.verbose` = true;

And then run the query. The detailed output will point us to where the error 
occurred.

Thank you,
Sudheesh

> On Sep 15, 2016, at 9:12 PM, Abhishek Girish  
> wrote:
> 
> Hi Kartik,
> 
> Can you take a look at the logs (or turn on verbose errors) and share the
> relevant stack trace? Also what platform is this on?
> 
> -Abhishek
> 
> On Thu, Sep 15, 2016 at 4:26 PM, Kartik Bhatia  wrote:
> 
>> When I run the following
>> 0: jdbc:drill:zk=local> SELECT * FROM cp.`employee.json` LIMIT 5;
>> It gives me a Java exception with  Error: RESOURCE ERROR: Failed to create
>> schema tree.
>> 
>> ~~
>> This e-mail message from State Compensation Insurance Fund and all
>> attachments transmitted with it
>> may be privileged or confidential and protected from disclosure. If you
>> are not the intended recipient,
>> you are hereby notified that any dissemination, distribution, copying, or
>> taking any action based on it
>> is strictly prohibited and may have legal consequences. If you have
>> received this e-mail in error,
>> please notify the sender by reply e-mail and destroy the original message
>> and all copies.
>> ~~
>> 



Re: drill rest api converts all data types to string

2016-09-16 Thread Sudheesh Katkam
Hi Niek,

That is a bug; thank you for digging out the exact location. Please open a 
ticket. Let's start tracking the 
details of the fix in that ticket.

Thank you,
Sudheesh

> On Sep 15, 2016, at 2:49 AM, Niek Bartholomeus  wrote:
> 
> Hi,
> 
> I'm using the drill rest api to query my parquet files that were generated by 
> spark.
> 
> I noticed that numeric and boolean data types are all converted to string in 
> the results. Nested data however is returned with the original data types 
> intact.
> 
> Probably this is happening here: 
> https://github.com/apache/drill/blob/2d9f9abb4c47d08f8462599c8d6076a61a1708fe/exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/QueryWrapper.java#L158
> 
> Is there any way how to fix this?
> 
> I'm using the latest version of drill.
> 
> Thanks in advance,
> 
> Niek.



Re: Support for Alluxio

2016-09-14 Thread Sudheesh Katkam
Based on the documentation [1], Alluxio not only uses HDFS as an underlying 
storage system but also provides a "Hadoop API" to clients, and Drill uses this 
API. So it should be possible.

Create a storage plugin named "alluxio" whose contents match the "dfs" plugin, and 
then make changes. Not sure what exactly, but [2] should be helpful. See the "S3 
Storage Plugin" docs as an example as well [3]. Once you get things to work, please 
contribute by adding a section to the Drill documentation.
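A sketch of what such a plugin might look like, modeled on the "dfs" plugin. The alluxio:// URI scheme and the 19998 master port come from the Alluxio docs; the host placeholder is hypothetical, and none of this has been verified against Drill:

```json
{
  "type": "file",
  "enabled": true,
  "connection": "alluxio://<alluxio-master-host>:19998/",
  "workspaces": {
    "root": {
      "location": "/",
      "writable": false,
      "defaultInputFormat": null
    }
  },
  "formats": {
    "json": { "type": "json" },
    "parquet": { "type": "parquet" }
  }
}
```

The Alluxio client jar would also need to be on the drillbit classpath, and a "fs.alluxio.impl" entry may be needed in the "config" map or conf/core-site.xml, per [2].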

Thank you,
Sudheesh

[1] http://www.alluxio.org/docs/master/en/File-System-API.html#hadoop-api 

[2] http://www.alluxio.org/docs/master/en/Configuration-Settings.html 

[3] http://drill.apache.org/docs/s3-storage-plugin/ 


> On Sep 14, 2016, at 10:07 AM, Edmon Begoli  wrote:
> 
> Is it possible to use Alluxio as a filesystem backend in Drill's storage
> formats, and if so how?
> 
> Thanks.



Re: Queries over Swift?

2016-09-14 Thread Sudheesh Katkam
AFAIK, there is no documentation. I am not sure anyone has tried it before. 
That said, from [1], Swift enables Apache Hadoop applications - including 
MapReduce jobs - to read and write data to and from instances of the OpenStack 
Swift object store. And Drill uses the HDFS client library. So using Swift 
through Drill should be possible.

My guess.. Create storage plugin named “swift”, copy the contents from the 
“dfs” plugin. I am not sure what the contents of “swift” should be exactly; see 
[1] and [2]. The parameters and values mentioned in the “Configuring” section 
in [1] should be provided through the “config” map in the storage plugin (or 
maybe through conf/core-site.xml in the Drill installation directory).

Something like:
{
  "type": "file",
  "enabled": true,
  "connection": "swift://dmitry.privatecloud/out/results",
  "workspaces": {
...
  },
  "formats": {
...
  },
  "config": {
...
  }
}

A roundabout way could use Swift through S3 [3]. Again, I do not know the exact 
configuration details.

Once you get things to work, you can also add a section to the Drill docs based 
on your experience!

Thank you,
Sudheesh

[1] https://hadoop.apache.org/docs/stable2/hadoop-openstack/index.html 

[2] http://drill.apache.org/docs/s3-storage-plugin/ 

[3] https://github.com/openstack/swift3 

> On Sep 14, 2016, at 9:50 AM, MattK  wrote:
> 
> The Drill FAQ mentions that Swift can be queried as well as S3.
> 
> I have found an S3 plugin (https://drill.apache.org/docs/s3-storage-plugin/) 
> but nothing yet for docs, examples, or plugins for Swift.
> 
> Is there any documentation available?



Re: Quering Hbase with TimeRange, Version, Timestamp,etc

2016-09-14 Thread Sudheesh Katkam
AFAIK, not possible currently.

Thank you,
Sudheesh

> On Sep 14, 2016, at 5:15 AM, Ulf Andreasson @ MapR  
> wrote:
> 
> I am also curious regarding this, is it possible ?
> 
> 
> 
> Ulf Andreasson | Ericsson Global Alliance Solution Engineer, MapR.com | +46
> 72 700 2295
> 
> 
> On Fri, Mar 4, 2016 at 2:51 AM, Abhishek  wrote:
> 
>> In hbase shell we could use command like
>> 
>> get '/test/service','rk1',{COLUMN=>'fm1:col1',TIMERANGE=>[0,TS1]}
>> 
>> to query the record versions that fall into [0,TS1), where TS1 is the timestamp
>> for each version (the ts1 we set when running the 'put' command).
>> 
>> Is there a way to do the same SQL query in Drill to query such data?
>> 
>> e.g. select * from hbase.`/test/table` as t where t.fm1.TIMESTAMP =
>> 1442X
>> 
>> or is there a native search mode in Drill where I could use an hbase query
>> instead of sql?
>> 
>> --
>> If you tell the truth, you don't have to remember anything..
>> Regards,
>> Abhishek Agrawal
>> 



Re: Right outer join fails

2016-09-13 Thread Sudheesh Katkam
Hi Kathir,

I tried the same query in embedded mode, and I got a different error.

java.lang.IndexOutOfBoundsException: index: 0, length: 8 (expected: range(0, 0))
at io.netty.buffer.DrillBuf.checkIndexD(DrillBuf.java:123)
at io.netty.buffer.DrillBuf.chk(DrillBuf.java:147)
at io.netty.buffer.DrillBuf.getLong(DrillBuf.java:493)
at 
org.apache.drill.exec.vector.BigIntVector$Accessor.get(BigIntVector.java:353)
at 
org.apache.drill.exec.vector.BigIntVector$Accessor.getObject(BigIntVector.java:359)
at 
org.apache.drill.exec.vector.RepeatedBigIntVector$Accessor.getObject(RepeatedBigIntVector.java:297)
at 
org.apache.drill.exec.vector.RepeatedBigIntVector$Accessor.getObject(RepeatedBigIntVector.java:288)
at 
org.apache.drill.exec.vector.accessor.GenericAccessor.getObject(GenericAccessor.java:44)
at 
org.apache.drill.exec.vector.accessor.BoundCheckingAccessor.getObject(BoundCheckingAccessor.java:148)
at 
org.apache.drill.jdbc.impl.TypeConvertingSqlAccessor.getObject(TypeConvertingSqlAccessor.java:795)
at 
org.apache.drill.jdbc.impl.AvaticaDrillSqlAccessor.getObject(AvaticaDrillSqlAccessor.java:179)
...

In this case, the Java client library is not able to consume the results sent 
from the server, and the query was CANCELLED (as seen in the query profile, on 
the web UI). Are you seeing the same?

I am not aware of any workarounds; this seems like a bug to me. Can you open a 
ticket?

Thank you,
Sudheesh

> On Sep 13, 2016, at 7:10 AM, Kathiresan S  
> wrote:
> 
> ​Hi,
> 
> Additional info on this. Array column ('city' in the example) is the issue.
> 
> 1. When I select just the first element of the array column, the
> query works fine
> 
> select a.name,a.city[0],b.name from dfs.tmp.`data.json` a right join
> dfs.tmp.`cities.json` b on a.city[0]=b.id
> 
> Result
> Jim 1 Sendurai
> null null NYC
> 
> 
> 2. And when I do a repeated_count on the array column, it returns -2 on the
> second row
> 
> select a.name,repeated_count(a.city),b.name from dfs.tmp.`data.json` a
> right join dfs.tmp.`cities.json` b on a.city[0]=b.id
> 
> Result
> Jim 2 Sendurai
> null -2 NYC
> 
> Any idea/work around for this issue would be highly appreciated
> 
> Thanks,
> Kathir
> ​
> 
> On Sat, Sep 10, 2016 at 9:56 PM, Kathiresan S 
> wrote:
> 
>> Hi,
>> 
>> A query with a right outer join fails while the inner and left outer
>> joins work for the same data. I've replicated the issue with some simple
>> data here and this happens in both 1.6.0 and 1.8.0
>> 
>> *Json file 1: data.json*
>> 
>> { "name": "Jim","city" : [1,2]}
>> 
>> *Json file 2: cities.json*
>> 
>> {id:1,name:"Sendurai"}
>> {id:2,name:"NYC"}
>> 
>> *Queries that work:*
>> 1.  select a.name,a.city,b.id,b.name from dfs.tmp.`data.json` a left
>> outer join dfs.tmp.`cities.json` b on a.city[0]=b.id
>> 
>> 2. select a.name,a.city,b.id,b.name from dfs.tmp.`data.json` a join
>> dfs.tmp.`cities.json` b on a.city[0]=b.id
>> 
>> *Query that fails:*
>> 
>> select a.name,a.city,b.id,b.name from dfs.tmp.`data.json` a right outer
>> join dfs.tmp.`cities.json` b on a.city[0]=b.id
>> 
>> *On the server side, i see below error trace :*
>> 
>> java.lang.IllegalStateException: QueryDataBatch was released twice.
>>at 
>> org.apache.drill.exec.rpc.user.QueryDataBatch.release(QueryDataBatch.java:56)
>> [drill-java-exec-1.6.0.jar:1.6.0]
>>at 
>> org.apache.drill.exec.rpc.user.QueryResultHandler.batchArrived(QueryResultHandler.java:167)
>> [drill-java-exec-1.6.0.jar:1.6.0]
>>at 
>> org.apache.drill.exec.rpc.user.UserClient.handleReponse(UserClient.java:110)
>> ~[drill-java-exec-1.6.0.jar:1.6.0]
>>at org.apache.drill.exec.rpc.BasicClientWithConnection.handle(
>> BasicClientWithConnection.java:46) ~[drill-rpc-1.6.0.jar:1.6.0]
>>at org.apache.drill.exec.rpc.BasicClientWithConnection.handle(
>> BasicClientWithConnection.java:31) ~[drill-rpc-1.6.0.jar:1.6.0]
>>at org.apache.drill.exec.rpc.RpcBus.handle(RpcBus.java:67)
>> ~[drill-rpc-1.6.0.jar:1.6.0]
>>at org.apache.drill.exec.rpc.RpcBus$RequestEvent.run(RpcBus.java:374)
>> ~[drill-rpc-1.6.0.jar:1.6.0]
>>at org.apache.drill.common.SerializedExecutor$
>> RunnableProcessor.run(SerializedExecutor.java:89)
>> [drill-rpc-1.6.0.jar:1.6.0]
>>at 
>> org.apache.drill.exec.rpc.RpcBus$SameExecutor.execute(RpcBus.java:252)
>> [drill-rpc-1.6.0.jar:1.6.0]
>>at 
>> org.apache.drill.common.SerializedExecutor.execute(SerializedExecutor.java:123)
>> [drill-rpc-1.6.0.jar:1.6.0]
>>at 
>> org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:285)
>> [drill-rpc-1.6.0.jar:1.6.0]
>>at 
>> org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:257)
>> [drill-rpc-1.6.0.jar:1.6.0]
>>at io.netty.handler.codec.MessageToMessageDecoder.channelRead(
>> MessageToMessageDecoder.java:89) [netty-codec-4.0.27.Final.jar:
>> 4.0.

Re: Authentication with PAM fails with Centos 7

2016-09-13 Thread Sudheesh Katkam
Hi Pradeeban,

I am not entirely sure what the problem is. The important part from the stack 
trace is here:

> Invalid user credentials: PAM profile 'pkathi2' validation failed


From the code, this error message says that verification against a PAM profile 
named "pkathi2" failed.
+ So in your case, looks like both username and PAM profile name are the same?
+ We have tested the feature in Centos 6 against “login” and “sudo” profiles. 
So maybe some PAM configuration issue specific to “pkathi2” and/or Centos 7?
+ Drill internally uses JPAM as a bridge to PAM. So as a last resort, you may 
need to ensure JPAM is working on Centos 7. If you are comfortable running 
tests in Java, try this test.

Thank you,
Sudheesh

> On Sep 12, 2016, at 10:04 AM, Pradeeban Kathiravelu  
> wrote:
> 
> Hi,
> I have configured Drill with authentication using PAM successfully several
> times in Ubuntu (14.04 and 16.04). However, when I try to do the same with
> Centos 7 (I have sudo access to this), it fails.
> 
> Please note that this is a simple embedded Drill instance, and it works
> without the authentication configured.
> 
> I followed the same steps -
> https://drill.apache.org/docs/configuring-user-authentication/ for
> authentication. The credentials are correct (despite the error logs
> suggesting them to be wrong below).
> 
> 
> $DRILL_HOME/bin/drill-embedded -n pkathi2 -p *
> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option
> MaxPermSize=512M; support was removed in 8.0
> set 12, 2016 12:58:40 PM org.glassfish.jersey.server.ApplicationHandler
> initialize
> INFO: Initiating Jersey application, version Jersey: 2.8 2014-04-29
> 01:25:26...
> set 12, 2016 12:58:41 PM org.glassfish.jersey.internal.Errors logErrors
> WARNING: The following warnings have been detected: HINT: A HTTP GET
> method, public void
> org.apache.drill.exec.server.rest.LogInLogOutResources.logout(javax.servlet.http.HttpServletRequest,javax.servlet.http.HttpServletResponse)
> throws java.lang.Exception, returns a void type. It can be intentional and
> perfectly fine, but it is a little uncommon that GET method returns always
> "204 No Content".
> 
> Error: Failure in connecting to Drill:
> org.apache.drill.exec.rpc.RpcException: HANDSHAKE_VALIDATION : Status:
> AUTH_FAILED, Error Id: 3f791614-a251-40b2-aa36-d02200f4cb6e, Error message:
> Invalid user credentials: PAM profile 'pkathi2' validation failed
> (state=,code=0)
> java.sql.SQLException: Failure in connecting to Drill:
> org.apache.drill.exec.rpc.RpcException: HANDSHAKE_VALIDATION : Status:
> AUTH_FAILED, Error Id: 3f791614-a251-40b2-aa36-d02200f4cb6e, Error message:
> Invalid user credentials: PAM profile 'pkathi2' validation failed
>at
> org.apache.drill.jdbc.impl.DrillConnectionImpl.(DrillConnectionImpl.java:159)
>at
> org.apache.drill.jdbc.impl.DrillJdbc41Factory.newDrillConnection(DrillJdbc41Factory.java:64)
>at
> org.apache.drill.jdbc.impl.DrillFactory.newConnection(DrillFactory.java:69)
>at
> net.hydromatic.avatica.UnregisteredDriver.connect(UnregisteredDriver.java:126)
>at org.apache.drill.jdbc.Driver.connect(Driver.java:72)
>at sqlline.DatabaseConnection.connect(DatabaseConnection.java:167)
>at sqlline.DatabaseConnection.getConnection(DatabaseConnection.java:213)
>at sqlline.Commands.connect(Commands.java:1083)
>at sqlline.Commands.connect(Commands.java:1015)
>at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>at java.lang.reflect.Method.invoke(Method.java:483)
>at
> sqlline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:36)
>at sqlline.SqlLine.dispatch(SqlLine.java:742)
>at sqlline.SqlLine.initArgs(SqlLine.java:528)
>at sqlline.SqlLine.begin(SqlLine.java:596)
>at sqlline.SqlLine.start(SqlLine.java:375)
>at sqlline.SqlLine.main(SqlLine.java:268)
> Caused by: org.apache.drill.exec.rpc.RpcException: HANDSHAKE_VALIDATION :
> Status: AUTH_FAILED, Error Id: 3f791614-a251-40b2-aa36-d02200f4cb6e, Error
> message: Invalid user credentials: PAM profile 'pkathi2' validation failed
>at
> org.apache.drill.exec.client.DrillClient$FutureHandler.connectionFailed(DrillClient.java:503)
>at
> org.apache.drill.exec.rpc.user.QueryResultHandler$ChannelClosedHandler.connectionFailed(QueryResultHandler.java:389)
>at
> org.apache.drill.exec.rpc.BasicClient$ConnectionMultiListener$HandshakeSendHandler.success(BasicClient.java:266)
>at
> org.apache.drill.exec.rpc.BasicClient$ConnectionMultiListener$HandshakeSendHandler.success(BasicCli

Re: NULL values for DATE type columns using JDBC connector

2016-09-13 Thread Sudheesh Katkam
Hi Dan,

Per the documentation, I don’t think sql4es is in the list of actively tested 
JDBC drivers. But a “WHERE something IS NOT NULL” query returning NULL looks 
like a bug to me. Can you open a ticket?

Thank you,
Sudheesh

> On Sep 12, 2016, at 8:04 AM, Dan Markhasin  wrote:
> 
> Hi all,
> 
> I'm using Drill to query ElasticSearch using the sql4es driver (
> https://github.com/Anchormen/sql4es) and I've run into an issue where Drill
> returns NULL values for Date columns:
> 
> 0: jdbc:drill:zk=local> select Date_01 from
> ES23.`data-generator-poc-async`.arm where Date_01 IS NOT NULL limit 1;
> +--+
> | Date_01  |
> +--+
> | null |
> +--+
> 
> The DESCRIBE command returns the correct data type (DATE).
> 
> Using other JDBC clients (Squirrel / WorkbenchJ) I am able to run the exact
> same query on the exact same source, and get the correct data (which is in the
> form of 2016-03-21, for example).
> 
> Any idea why it's returning null?



Re: Query hangs on planning

2016-09-01 Thread Sudheesh Katkam
That setting is for off-heap memory. The earlier case hit the heap memory limit.
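For reference, system options such as `planner.memory_limit` (mentioned in the quoted message) can be inspected and changed from sqlline. A hedged sketch, assuming the Drill 1.x `sys.options` columns and that the value is in bytes:

```sql
-- Check the current planning memory limit (num_val is in bytes).
SELECT name, num_val FROM sys.options WHERE name = 'planner.memory_limit';

-- Raise the limit cluster-wide; note that, per the reply above, this option
-- governs a different memory pool than the Java heap that overflowed here.
ALTER SYSTEM SET `planner.memory_limit` = 536870912;
```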

> On Sep 1, 2016, at 11:36 AM, Zelaine Fong  wrote:
> 
> One other thing ... have you tried tuning the planner.memory_limit
> parameter?  Based on the earlier stack trace, you're hitting a memory limit
> during query planning.  So, tuning this parameter should help that.  The
> default is 256 MB.
> 
> -- Zelaine
> 
> On Thu, Sep 1, 2016 at 11:21 AM, rahul challapalli <
> challapallira...@gmail.com> wrote:
> 
>> While planning we use heap memory. 2GB of heap should be sufficient for
>> what you mentioned. This looks like a bug to me. Can you raise a jira for
>> the same? And it would be super helpful if you can also attach the data set
>> used.
>> 
>> Rahul
>> 
>> On Wed, Aug 31, 2016 at 9:14 AM, Oscar Morante 
>> wrote:
>> 
>>> Sure,
>>> This is what I remember:
>>> 
>>> * Failure
>>>   - embedded mode on my laptop
>>>   - drill memory: 2Gb/4Gb (heap/direct)
>>>   - cpu: 4cores (+hyperthreading)
>>>   - `planner.width.max_per_node=6`
>>> 
>>> * Success
>>>   - AWS Cluster 2x c3.8xlarge
>>>   - drill memory: 16Gb/32Gb
>>>   - cpu: limited by kubernetes to 24cores
>>>   - `planner.width.max_per_node=23`
>>> 
>>> I'm too busy right now to test again, but I'll try to provide better
>> info
>>> as soon as I can.
>>> 
>>> 
>>> 
>>> On Wed, Aug 31, 2016 at 05:38:53PM +0530, Khurram Faraaz wrote:
>>> 
 Can you please share the number of cores on the setup where the query
>> hung
 as compared to the number of cores on the setup where the query went
 through successfully.
 And details of memory from the two scenarios.
 
 Thanks,
 Khurram
 
 On Wed, Aug 31, 2016 at 4:50 PM, Oscar Morante 
 wrote:
 
 For the record, I think this was just bad memory configuration after
>> all.
> I retested on bigger machines and everything seems to be working fine.
> 
> 
> On Tue, Aug 09, 2016 at 10:46:33PM +0530, Khurram Faraaz wrote:
> 
> Oscar, can you please report a JIRA with the required steps to
>> reproduce
>> the OOM error. That way someone from the Drill team will take a look
>> and
>> investigate.
>> 
>> For others interested here is the stack trace.
>> 
>> 2016-08-09 16:51:14,280 [285642de-ab37-de6e-a54c-
>> 378aaa4ce50e:foreman]
>> ERROR o.a.drill.common.CatastrophicFailure - Catastrophic Failure
>> Occurred,
>> exiting. Information message: Unable to handle out of memory condition
>> in
>> Foreman.
>> java.lang.OutOfMemoryError: Java heap space
>>   at java.util.Arrays.copyOfRange(Arrays.java:2694)
>> ~[na:1.7.0_111]
>>   at java.lang.String.(String.java:203) ~[na:1.7.0_111]
>>   at java.lang.StringBuilder.toString(StringBuilder.java:405)
>> ~[na:1.7.0_111]
>>   at org.apache.calcite.util.Util.newInternal(Util.java:785)
>> ~[calcite-core-1.4.0-drill-r16-PATCHED.jar:1.4.0-drill-r16-PATCHED]
>>   at
>> org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(
>> VolcanoRuleCall.java:251)
>> ~[calcite-core-1.4.0-drill-r16-PATCHED.jar:1.4.0-drill-r16-PATCHED]
>>   at
>> org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(
>> VolcanoPlanner.java:808)
>> ~[calcite-core-1.4.0-drill-r16-PATCHED.jar:1.4.0-drill-r16-PATCHED]
>>   at
>> org.apache.calcite.tools.Programs$RuleSetProgram.run(
>> Programs.java:303)
>> ~[calcite-core-1.4.0-drill-r16-PATCHED.jar:1.4.0-drill-r16-PATCHED]
>>   at
>> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler
>> .transform(DefaultSqlHandler.java:404)
>> ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
>>   at
>> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler
>> .transform(DefaultSqlHandler.java:343)
>> ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
>>   at
>> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler
>> .convertToDrel(DefaultSqlHandler.java:240)
>> ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
>>   at
>> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler
>> .convertToDrel(DefaultSqlHandler.java:290)
>> ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
>>   at
>> org.apache.drill.exec.planner.sql.handlers.ExplainHandler.ge
>> tPlan(ExplainHandler.java:61)
>> ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
>>   at
>> org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(Dri
>> llSqlWorker.java:94)
>> ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
>>   at
>> org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:978)
>> ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
>>   at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.
>> java:
>> 257)
>> ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
>>   at
>> java.util.concurrent.ThreadPoolEx

Re: Fetch queries status from drill prompt

2016-08-18 Thread Sudheesh Katkam
Profiles of running queries are stored in Zookeeper (or the configured 
transient store).

Thank you,
Sudheesh
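For completed queries, the drillbit_queries.json approach discussed in the quoted thread can be expressed directly once a filesystem storage plugin points at the log directory. A sketch; the path is an assumption for a typical install and should be replaced with the drillbit's actual log directory:

```sql
-- Query the per-foreman query log through the dfs plugin; this only shows
-- queries that reached a terminal state on this node.
SELECT * FROM dfs.`/var/log/drill/drillbit_queries.json` LIMIT 10;
```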

> On Aug 18, 2016, at 11:23 AM, Anup Tiwari  wrote:
> 
> Thanks Chun for the info.
> 
> But can you tell me where the running queries' status on the profile
> user interface (UI) comes from? If it's shown on the profile UI then it must
> have some back-end file or something like that.
> On 18-Aug-2016 11:37 PM, "Chun Chang"  wrote:
> 
> Anup,
> 
> I believe only when a query is in a "terminal" state, i.e.
> cancelled/completed/failed, is it written to the
> drillbit_queries.json file on the foreman node. If what you want to do is
> monitor queries running on your cluster, your best bet is to configure
> your cluster to store profile information on HDFS and monitor through query
> profiles. Remember, if you have a cluster, you will have a
> drillbit_queries.json file on every cluster node where a drillbit is running,
> and each file only contains completed queries that ran on that node as
> the foreman. You would have to aggregate to get the whole picture of your
> cluster. Even then, you will not see running queries.
> 
> Hope this helps.
> 
> On Thu, Aug 18, 2016 at 12:34 AM, Anup Tiwari 
> wrote:
> 
>> Hi All,
>> 
>> We want to see, from the Drill prompt, all queries which ran on the Drill
>> cluster or are currently running. Can someone help us with this?
>> 
>> To achieve the above, we read the Drill documentation and set up a storage
>> plugin to access the local file system, and we are able to query the
>> *"drillbit_queries.json"* log file. But in the above file we only get the
>> status of queries that are either "cancelled", "completed", or "failed";
>> "running" is missing. At the same time, the Drill profile interface
>> does show running queries.
>> 
>> I am sure that if we can see it on the user interface, it must be coming
>> from somewhere.
>> 
>> Kindly help me on this.
>> 
>> Regards,
>> *Anup Tiwari*
>> Software Engineer(BI-Team),PlayGames24x7 Pvt Ltd
>> 



Re: [Drill-Issues] Drill-1.6.0: Drillbit not starting

2016-08-04 Thread Sudheesh Katkam
Can you check if there are any errors in the drillbit.out file? This file 
should be in the same directory as the log file.

Thank you,
Sudheesh

> On Aug 4, 2016, at 4:25 AM, Shankar Mane  wrote:
> 
> I am getting this error infrequently. Most of the time Drill starts
> normally, but sometimes it gives the error below. I am running Drill 1.6.0 in
> cluster mode. ZK is also set up.
> 
> Could someone please explain where the issue is?
> 
> 
> 
> 2016-08-04 03:45:15,870 [main] INFO  o.a.d.e.s.s.PersistentStoreRegistry -
> Using the configured PStoreProvider class: 'org.apache.drill.exec.store.sy
> s.store.provider.ZookeeperPersistentStoreProvider'.
> 2016-08-04 03:45:16,430 [main] INFO  o.apache.drill.exec.server.Drillbit -
> Construction completed (1294 ms).
> 2016-08-04 03:45:28,250 [main] WARN  o.apache.drill.exec.server.Drillbit -
> Failure on close()
> java.lang.NullPointerException: null
>at
> org.apache.drill.exec.work.WorkManager.close(WorkManager.java:157)
> ~[drill-java-exec-1.6.0.jar:1.6.0]
>at
> org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:76)
> ~[drill-common-1.6.0.jar:1.6.0]
>at
> org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:64)
> ~[drill-common-1.6.0.jar:1.6.0]
>at org.apache.drill.exec.server.Drillbit.close(Drillbit.java:149)
> [drill-java-exec-1.6.0.jar:1.6.0]
>at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:283)
> [drill-java-exec-1.6.0.jar:1.6.0]
>at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:261)
> [drill-java-exec-1.6.0.jar:1.6.0]
>at org.apache.drill.exec.server.Drillbit.main(Drillbit.java:257)
> [drill-java-exec-1.6.0.jar:1.6.0]
> 2016-08-04 03:45:28,250 [main] INFO  o.apache.drill.exec.server.Drillbit -
> Shutdown completed (1819 ms).



Re: Connecting Drill to Azure Data Lake

2016-08-01 Thread Sudheesh Katkam
What failure(s) do you see?

Thank you,
Sudheesh

> On Jul 29, 2016, at 4:07 PM, Kevin Verhoeven  
> wrote:
> 
> Hi Drill Community,
> 
> Has anyone attempted to connect Drill to the Azure Data Lake? Microsoft has 
> implemented a WebHDFS API over Azure Data Lake, so Drill should be able to 
> connect. I'm guessing this will be similar to s3. My initial attempts have 
> failed, does anyone have any ideas or experience with this connection?
> 
> Regards,
> 
> Kevin
> 



Re: tmp noexec

2016-07-26 Thread Sudheesh Katkam
epoll was supposed to be disabled as part of this PR [1] pending performance 
tests; IIRC the test clusters were busy then, and I lost track of this change. 
I’ll post an update soon.

Thank you,
Sudheesh

[1] https://github.com/apache/drill/pull/486 
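Until that change lands, a common workaround (a sketch, with assumptions: the directory path is hypothetical, and the Netty `io.netty.native.workdir` property is honored only by newer Netty versions) is to point the JVM's temp directory at a location that is not mounted noexec, e.g. in conf/drill-env.sh:

```shell
# Redirect temp-file extraction away from a noexec /tmp; /opt/drill/tmp is an
# assumed path that must exist, be writable by the drill user, and allow exec.
DRILL_TMPDIR="/opt/drill/tmp"
export DRILL_JAVA_OPTS="$DRILL_JAVA_OPTS -Djava.io.tmpdir=$DRILL_TMPDIR -Dio.netty.native.workdir=$DRILL_TMPDIR"
echo "$DRILL_JAVA_OPTS"
```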


> On Jul 26, 2016, at 6:54 PM, Jacques Nadeau  wrote:
> 
> I don't think this will fix your issue since this is an internal extraction.
> Try using: -Ddrill.exec.enable-epoll=false in your drill-env. That should
> (hopefully) disable the extraction of epoll drivers (which should actually
> be disabled by default I believe due to disconnection problems in heavy
> load cases).
> 
> 
> --
> Jacques Nadeau
> CTO and Co-Founder, Dremio
> 
> On Tue, Jul 26, 2016 at 7:20 AM, scott  wrote:
> 
>> Thanks Leon for the suggestion, but do you think this config change will
>> help with my startup problem? It looks like it changes operations for sort
>> after startup.
>> 
>> Scott
>> 
>> On Mon, Jul 25, 2016 at 3:55 PM, Leon Clayton 
>> wrote:
>> 
>>> 
>>> I moved /tmp off the local disk into the distributed FS, on a node-local
>>> volume on MapR. Other file systems can be substituted.
>>> 
>>> Open up drill-override.conf on all of the nodes, and insert this :
>>> 
>>> sort: {
>>>purge.threshold : 100,
>>>external: {
>>>  batch.size : 4000,
>>>  spill: {
>>>batch.size : 4000,
>>>group.size : 100,
>>>threshold : 200,
>>>directories : [ "/var/mapr/local/Hostname/drillspill" ],
>>>fs : "maprfs:///"
>>>  }
>>>}
>>>  }
>>> 
 On 25 Jul 2016, at 16:44, scott  wrote:
 
 Hello,
 I've run into an issue where Drill will not start if mount permissions
>>> are
 set on /tmp to noexec. The permissions were set to noexec due to
>> security
 concerns. I'm using Drill version 1.7. The error I get when starting
>>> Drill
 is:
 
 Exception in thread "main" java.lang.UnsatisfiedLinkError:
 /tmp/libnetty-transport-native-epoll5743269078378802025.so:
 /tmp/libnetty-transport-native-epoll5743269078378802025.so: failed to
>> map
 segment from shared object: Operation not permitted
 
 Does anyone know of a way to configure Drill to use a different tmp
 location?
 
 Thanks,
 Scott
>>> 
>>> 
>> 



Changes in Launch Scripts

2016-07-22 Thread Sudheesh Katkam
Hi all,

I just committed DRILL-4581 [1], which changes the launch scripts.

The patch should be backward compatible. This email is just an FYI to start
using the new style of drill-env.sh file. The major usability change is
that Drill defaults have been moved from conf/drill-env.sh to
bin/drill-config.sh; changes to variables in drill-env.sh will override the
defaults.

See the ticket for the full list of changes.
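To illustrate the new style (a sketch based on the description above, not the authoritative file; DRILL_HEAP is one of the variables Drill's scripts use): overrides in conf/drill-env.sh are now typically written so that a value exported before launch still wins:

```shell
# Simulate a launch where the caller has not pre-set DRILL_HEAP.
unset DRILL_HEAP
# New-style override: only applies when DRILL_HEAP was not already exported,
# so environment settings made before launch still take precedence;
# bin/drill-config.sh supplies the default when neither is set.
export DRILL_HEAP=${DRILL_HEAP:-"4G"}
echo "$DRILL_HEAP"
```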

Thank you,
Sudheesh

[1] https://issues.apache.org/jira/browse/DRILL-4581


[DISCUSS] New Feature: Kerberos Authentication

2016-07-22 Thread Sudheesh Katkam
Hi all,

I plan to work on DRILL-4280: Kerberos Authentication for Clients [1]. The
design document [2] is attached to the ticket. Please read and comment!

Thank you,
Sudheesh

[1] https://issues.apache.org/jira/browse/DRILL-4280
[2]
https://docs.google.com/document/d/1qSBV2Hi3KwaDFADZJm9me2Nq4RqnKBsyejxunFgmdDo


Re: Pushdown Capabilities with RDBMS

2016-07-15 Thread Sudheesh Katkam
Hi Marcus,

I am glad that you are exploring Drill! Per RDBMS storage plugin documentation 
[1], join pushdown is supported. So the scenario you described is likely a bug; 
can you open a ticket [2] with the details on how to reproduce the issue?

Thank you,
Sudheesh

[1] https://drill.apache.org/docs/rdbms-storage-plugin/ 

[2] https://issues.apache.org/jira/browse/DRILL 


> On Jul 15, 2016, at 1:48 PM, Marcus Rehm  wrote:
> 
> Hi all,
> 
> I started to teste Drill and I'm very excited about the possibilities.
> 
> By now I'm trying to map our databases running on Oracle 11g. After trying
> some queries I realized that the amount of time Drill takes to complete is
> longer than a general SQL client takes. Looking at the execution plan I saw
> (or understood) that Drill is doing the join of tables and is not pushing
> it down to the database.
> 
> Is there any configuration required to it? How can I tell Drill to send to
> Oracle the task of doing the join?
> 
> Thanks in Advance.
> 
> Best regards,
> Marcus Rehm



Re: Olingo plugin

2016-07-12 Thread Sudheesh Katkam
If you have any questions, please ask on the dev list: d...@drill.apache.org

I am not an expert in this area, but other developers can help. And we welcome 
your contribution :)

Thank you,
Sudheesh

> On Jul 12, 2016, at 11:18 AM, Steve Warren  wrote:
> 
> Thanks! I'll have a look, I had found the contrib in github.
> 
> On Tue, Jul 12, 2016 at 11:12 AM, Sudheesh Katkam 
> wrote:
> 
>> There is some documentation available [1].
>> 
>> There are five implementations in the contrib directory for reference [2].
>> 
>> Thank you,
>> Sudheesh
>> 
>> [1]
>> https://drill.apache.org/docs/apache-drill-contribution-ideas/#support-for-new-data-sources
>> [2] https://github.com/apache/drill/tree/master/contrib
>> 
>>> On Jul 12, 2016, at 11:03 AM, Steve Warren  wrote:
>>> 
>>> I considered that, but couldn't find documentation on writing plugins. Is
>>> there any available?
>>> 
>>> On Tue, Jul 12, 2016 at 10:55 AM, Sudheesh Katkam 
>>> wrote:
>>> 
>>>> Hi Steve,
>>>> 
>>>> AFAIK, no such plans. Would you like to open a ticket, and work on it?
>>>> 
>>>> Thank you,
>>>> Sudheesh
>>>> 
>>>>> On Jul 12, 2016, at 10:10 AM, Steve Warren  wrote:
>>>>> 
>>>>> Are there plans to release an Olingo (odata) plugin?
>>>>> 
>>>>> https://olingo.apache.org/
>>>>> 
>>>>> --
>>>>> Confidentiality Notice and Disclaimer:  The information contained in
>> this
>>>>> e-mail and any attachments, is not transmitted by secure means and may
>>>> also
>>>>> be legally privileged and confidential.  If you are not an intended
>>>>> recipient, you are hereby notified that any dissemination,
>> distribution,
>>>> or
>>>>> copying of this e-mail is strictly prohibited.  If you have received
>> this
>>>>> e-mail in error, please notify the sender and permanently delete the
>>>> e-mail
>>>>> and any attachments immediately. You should not retain, copy or use
>> this
>>>>> e-mail or any attachment for any purpose, nor disclose all or any part
>> of
>>>>> the contents to any other person. MyVest Corporation, MyVest Advisors
>> and
>>>>> their affiliates accept no responsibility for any unauthorized access
>>>>> and/or alteration or dissemination of this communication nor for any
>>>>> consequence based on or arising out of the use of information that may
>>>> have
>>>>> been illegitimately accessed or altered.
>>>> 
>>>> 
>>> 
>> 
>> 
> 



Re: Olingo plugin

2016-07-12 Thread Sudheesh Katkam
There is some documentation available [1].

There are five implementations in the contrib directory for reference [2].

Thank you,
Sudheesh

[1] https://drill.apache.org/docs/apache-drill-contribution-ideas/#support-for-new-data-sources
[2] https://github.com/apache/drill/tree/master/contrib

> On Jul 12, 2016, at 11:03 AM, Steve Warren  wrote:
> 
> I considered that, but couldn't find documentation on writing plugins. Is
> there any available?
> 
> On Tue, Jul 12, 2016 at 10:55 AM, Sudheesh Katkam 
> wrote:
> 
>> Hi Steve,
>> 
>> AFAIK, no such plans. Would you like to open a ticket, and work on it?
>> 
>> Thank you,
>> Sudheesh
>> 
>>> On Jul 12, 2016, at 10:10 AM, Steve Warren  wrote:
>>> 
>>> Are there plans to release an Olingo (odata) plugin?
>>> 
>>> https://olingo.apache.org/
>>> 
>> 
>> 
> 



Re: Olingo plugin

2016-07-12 Thread Sudheesh Katkam
Hi Steve,

AFAIK, no such plans. Would you like to open a ticket, and work on it?

Thank you,
Sudheesh

> On Jul 12, 2016, at 10:10 AM, Steve Warren  wrote:
> 
> Are there plans to release an Olingo (odata) plugin?
> 
> https://olingo.apache.org/
> 



Re: Drill - Hive - Kerberos

2016-07-06 Thread Sudheesh Katkam
Can you set the following property to false in the Hive storage plugin 
configuration, and try again?

"hive.server2.enable.doAs": "false"

Thank you,
Sudheesh

> On Jul 5, 2016, at 11:29 AM, Joseph Swingle  wrote:
> 
> Yes, impersonation is enabled:
> 
> drill.exec: {
>   cluster-id: "hhe",
>   zk.connect: "zk1:2181,zk2:2181,zk3:2181",
>   impersonation: {
>     enabled: true,
>     max_chained_user_hops: 3
>   }
> }
> 
> On Mon, Jun 20, 2016 at 6:22 PM, Chun Chang  wrote:
> 
>> Did you enable impersonation? Check the drill-override.conf file to verify
>> that impersonation is enabled.
>> 
>> On Mon, Jun 20, 2016 at 5:17 AM, Joseph Swingle 
>> wrote:
>> 
>>> Yes secure cluster.  Strange that I can browse hdfs, and can get the
>>> metadata about hive database and tables.
>>> But every sql query to pull data from hive tables results in that error.
>>> 
>>> 
>>> 
>>> 
 On Jun 17, 2016, at 6:24 PM, Chun Chang  wrote:
 
 Hi Joseph,
 
 Are you running DRILL on a secure cluster? I had success with the
>>> following
 storage plugin configuration with MapR distribution, SQL standard
 authorization with Kerberos:
 
 hive storage plugin:
 
 {
   "type": "hive",
   "enabled": true,
   "configProps": {
     "hive.metastore.uris": "thrift://10.10.100.120:9083",
     "fs.default.name": "maprfs:///",
     "hive.server2.enable.doAs": "false",
     "hive.metastore.sasl.enabled": "true",
     "hive.metastore.kerberos.principal": "hive/bigdata-node120.bd@bd.lab"
   }
 }
 
 
 On Fri, Jun 17, 2016 at 1:28 PM, Joseph Swingle 
 wrote:
 
> I have a Hive Storage plugin configured (bottom).   I am using HDP 2.4
>>> w/
> Hive 1.2.1, Drill 1.6
> 
> I can connect just fine with Drill Explorer.  I can browse, and view
> content on hdfs just fine with Drill Explorer.  The .csv files etc,
>>> display
> fine.
> 
> I can browse to see the list of schemas in Hive just fine with Drill
> Explorer.  But every SQL query (for example “select * from foo”)
>>> returns:
> Caused by: java.io.IOException: Failed to get numRows from HiveTable
>   at
> 
>>> 
>> org.apache.drill.exec.store.hive.HiveMetadataProvider.getStats(HiveMetadataProvider.java:113)
> ~[drill-storage-hive-core-1.6.0.jar:1.6.0]
>   at
> 
>>> org.apache.drill.exec.store.hive.HiveScan.getScanStats(HiveScan.java:224)
> ~[drill-storage-hive-core-1.6.0.jar:1.6.0]
>   ... 44 common frames omitted
> Caused by: org.apache.drill.common.exceptions.DrillRuntimeException:
> Failed to create input splits: Can't get Master Kerberos principal for
>>> use
> as renewer
>   at
> 
>>> 
>> org.apache.drill.exec.store.hive.HiveMetadataProvider.splitInputWithUGI(HiveMetadataProvider.java:264)
> ~[drill-storage-hive-core-1.6.0.jar:1.6.0]
>   at
> 
>>> 
>> org.apache.drill.exec.store.hive.HiveMetadataProvider.getTableInputSplits(HiveMetadataProvider.java:128)
> ~[drill-storage-hive-core-1.6.0.jar:1.6.0]
>   at
> 
>>> 
>> org.apache.drill.exec.store.hive.HiveMetadataProvider.getStats(HiveMetadataProvider.java:96)
> ~[drill-storage-hive-core-1.6.0.jar:1.6.0]
>   ... 45 common frames omitted
> Caused by: java.io.IOException: Can't get Master Kerberos principal
>> for
> use as renewer
>   at
> 
>>> 
>> org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:116)
> ~[hadoop-mapreduce-client-core-2.7.1.jar:na]
>   at
> 
>>> 
>> org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:100)
> ~[hadoop-mapreduce-client-core-2.7.1.jar:na]
>   at
> 
>>> 
>> org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:80)
> ~[hadoop-mapreduce-client-core-2.7.1.jar:na]
>   at
> 
>>> 
>> org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:206)
> ~[hadoop-mapreduce-client-core-2.7.1.jar:na]
>   at
> 
>>> 
>> org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:315)
> ~[hadoop-mapreduce-client-core-2.7.1.jar:na]
>   at
> 
>>> 
>> org.apache.drill.exec.store.hive.HiveMetadataProvider$1.run(HiveMetadataProvider.java:253)
> ~[drill-storage-hive-core-1.6.0.jar:1.6.0]
>   at
> 
>>> 
>> org.apache.drill.exec.store.hive.HiveMetadataProvider$1.run(HiveMetadataProvider.java:241)
> ~[drill-storage-hive-core-1.6.0.jar:1.6.0]
>   at java.security.AccessController.doPrivileged(Native Method)
> ~[na:1.8.0_45]
>   at javax.security.auth.Subject.doAs(Subject.java:422)
> ~[na:1.8.0_45]
>   at
> 
>>> 
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> ~[hadoop-common-2.7.1.jar:na]
>  

Re: Drill on Kerberos (Hive Storage plugin)

2016-07-05 Thread Sudheesh Katkam
Can you share the Hive storage plugin configuration?

Thank you,
Sudheesh

> On Jul 3, 2016, at 7:00 PM, Santosh Kulkarni  
> wrote:
> 
> Hi,
> 
> A Drill query gives the following error for the Hive storage plugin.
> 
> show databases;
> 
> Error: SYSTEM ERROR: GSSException: No valid credentials provided (Mechanism
> Level: Failed to find any Kerberos tgt)
> This worked before.
> 
> After renewing the Kerberos ticket using the kinit -R command, the query
> shows results. Tried a few hours later and it gave the same error again.
> 
> By default the system generates the user ticket. Ensured the ticket is not
> expired using klist -e.
> 
> Hive works fine without any issues for the same query.
> 
> Any pointers how to fix this issue?
> 
> Thanks,
> 
> Santosh



Hangout Starting Now

2016-06-28 Thread Sudheesh Katkam
Hangout starting now: 
https://hangouts.google.com/hangouts/_/event/ci4rdiju8bv04a64efj5fedd0lc 


Join us!

Thank you,
Sudheesh

Suggestions for Hangout Topics for 06/28/16

2016-06-27 Thread Sudheesh Katkam
If you have any suggestions for hangout topics for tomorrow, you can add them to 
this thread. We will also ask around at the beginning of the hangout for any 
topics. The goal is to try to cover whatever is possible during the 1 hr.

Thank you,
Sudheesh

Re: Oracle Query Problem

2016-05-31 Thread Sudheesh Katkam
Can you enable verbose logging and post the resulting error message? You can do 
this by executing the following statement, and then the failing query.

SET `exec.errors.verbose` = true;

Thank you,
Sudheesh

> On May 31, 2016, at 11:27 AM, SanjiV SwaraJ  wrote:
> 
> Hello, I have an Oracle query for selecting all the columns:
> 
> SELECT tc.column_name, tc.owner, tc.table_name, tc.column_id, tc.nullable,
> tc.data_type, c.constraint_type, c.r_owner AS reference_owner,
> rcc.table_name AS reference_table, rcc.column_name AS reference_column_name
> FROM SYS.ALL_TAB_COLUMNS tc LEFT OUTER JOIN SYS.ALL_CONS_COLUMNS cc ON (
> tc.owner = cc.owner AND tc.table_name = cc.table_name AND tc.column_name =
> cc.COLUMN_NAME ) LEFT OUTER JOIN SYS.ALL_CONSTRAINTS c ON ( tc.owner =
> c.owner AND tc.table_name = c.table_name AND c.constraint_name =
> cc.constraint_name ) LEFT OUTER JOIN ALL_CONS_COLUMNS rcc ON ( c.r_owner =
> rcc.owner AND c.r_constraint_name = rcc.constraint_name ) WHERE
> tc.table_name = 'REPORTSETTING' AND tc.OWNER = 'NVN' ORDER BY tc.column_id;
> 
> *This query works fine in Oracle DB, but the same query in Drill gives an
> error. The query for Drill is:*
> 
> SELECT tc.column_name, tc.owner, tc.table_name,
> tc.column_id,tc.nullable,tc.data_type,c.constraint_type,c.r_owner AS
> reference_owner, rcc.table_name AS reference_table, rcc.column_name AS
> reference_column_name FROM OracleDB.SYS.ALL_TAB_COLUMNS tc LEFT OUTER JOIN
> OracleDB.SYS.ALL_CONS_COLUMNS cc ON ( tc.owner = cc.owner AND tc.table_name
> = cc.table_name AND tc.column_name = cc.COLUMN_NAME ) LEFT OUTER JOIN
> OracleDB.SYS.ALL_CONSTRAINTS c ON ( tc.owner = c.owner AND tc.table_name =
> c.table_name AND c.constraint_name = cc.constraint_name ) LEFT OUTER JOIN
> OracleDB.SYS.ALL_CONS_COLUMNS rcc ON ( c.r_owner = rcc.owner AND
> c.r_constraint_name = rcc.constraint_name ) WHERE tc.table_name =
> 'REPORTSETTING' AND tc.OWNER = 'NVN' ORDER BY tc.column_id ASC;
> 
> The following error is shown:
> 
> org.apache.drill.common.exceptions.UserRemoteException: DATA_READ ERROR:
> The JDBC storage plugin failed while trying setup the SQL query. sql SELECT
> * FROM (SELECT "t1"."OWNER", "t1"."TABLE_NAME", "t1"."COLUMN_NAME",
> "t1"."DATA_TYPE", "t1"."NULLABLE", "t1"."COLUMN_ID",
> "ALL_CONSTRAINTS"."CONSTRAINT_TYPE", "ALL_CONSTRAINTS"."R_OWNER",
> "ALL_CONSTRAINTS"."R_CONSTRAINT_NAME" FROM (SELECT "t0"."OWNER",
> "t0"."TABLE_NAME", "t0"."COLUMN_NAME", "t0"."DATA_TYPE",
> "t0"."DATA_TYPE_MOD", "t0"."DATA_TYPE_OWNER", "t0"."DATA_LENGTH",
> "t0"."DATA_PRECISION", "t0"."DATA_SCALE", "t0"."NULLABLE",
> "t0"."COLUMN_ID", "t0"."DEFAULT_LENGTH", "t0"."DATA_DEFAULT",
> "t0"."NUM_DISTINCT", "t0"."LOW_VALUE", "t0"."HIGH_VALUE", "t0"."DENSITY",
> "t0"."NUM_NULLS", "t0"."NUM_BUCKETS", "t0"."LAST_ANALYZED",
> "t0"."SAMPLE_SIZE", "t0"."CHARACTER_SET_NAME", "t0"."CHAR_COL_DECL_LENGTH",
> "t0"."GLOBAL_STATS", "t0"."USER_STATS", "t0"."AVG_COL_LEN",
> "t0"."CHAR_LENGTH", "t0"."CHAR_USED", "t0"."V80_FMT_IMAGE",
> "t0"."DATA_UPGRADED", "t0"."HISTOGRAM", "ALL_CONS_COLUMNS"."OWNER"
> "OWNER0", "ALL_CONS_COLUMNS"."CONSTRAINT_NAME",
> "ALL_CONS_COLUMNS"."TABLE_NAME" "TABLE_NAME0",
> "ALL_CONS_COLUMNS"."COLUMN_NAME" "COLUMN_NAME0",
> "ALL_CONS_COLUMNS"."POSITION", CAST("t0"."OWNER" AS VARCHAR(120) CHARACTER
> SET "ISO-8859-1") "$f36" FROM (SELECT "OWNER", "TABLE_NAME", "COLUMN_NAME",
> "DATA_TYPE", "DATA_TYPE_MOD", "DATA_TYPE_OWNER", "DATA_LENGTH",
> "DATA_PRECISION", "DATA_SCALE", "NULLABLE", "COLUMN_ID", "DEFAULT_LENGTH",
> "DATA_DEFAULT", "NUM_DISTINCT", "LOW_VALUE", "HIGH_VALUE", "DENSITY",
> "NUM_NULLS", "NUM_BUCKETS", "LAST_ANALYZED", "SAMPLE_SIZE",
> "CHARACTER_SET_NAME", "CHAR_COL_DECL_LENGTH", "GLOBAL_STATS", "USER_STATS",
> "AVG_COL_LEN", "CHAR_LENGTH", "CHAR_USED", "V80_FMT_IMAGE",
> "DATA_UPGRADED", "HISTOGRAM", CAST("COLUMN_NAME" AS VARCHAR(4000) CHARACTER
> SET "ISO-8859-1") "$f31" FROM "SYS"."ALL_TAB_COLUMNS" WHERE "TABLE_NAME" =
> 'REPORTSETTING' AND "OWNER" = 'NVN') "t0" LEFT JOIN
> "SYS"."ALL_CONS_COLUMNS" ON "t0"."OWNER" = "ALL_CONS_COLUMNS"."OWNER" AND
> "t0"."TABLE_NAME" = "ALL_CONS_COLUMNS"."TABLE_NAME" AND "t0"."$f31" =
> "ALL_CONS_COLUMNS"."COLUMN_NAME") "t1" LEFT JOIN "SYS"."ALL_CONSTRAINTS" ON
> "t1"."$f36" = "ALL_CONSTRAINTS"."OWNER" AND "t1"."TABLE_NAME" =
> "ALL_CONSTRAINTS"."TABLE_NAME" AND "t1"."CONSTRAINT_NAME" =
> "ALL_CONSTRAINTS"."CONSTRAINT_NAME") "t2" LEFT JOIN (SELECT
> "CONSTRAINT_NAME", "TABLE_NAME", "COLUMN_NAME", CAST("OWNER" AS
> VARCHAR(120) CHARACTER SET "ISO-8859-1") "$f5" FROM
> "SYS"."ALL_CONS_COLUMNS") "t3" ON "t2"."R_OWNER" = "t3"."$f5" AND
> "t2"."R_CONSTRAINT_NAME" = "t3"."CONSTRAINT_NAME" plugin OracleDB Fragment
> 0:0 [Error Id: 2a11fed2-ec79-4ef1-9d29-781af21274f6
> 
> *Please tell me what I am doing wrong in this query.*
> 
> -- 
> Thanks & Regards.
> Sanjiv
> Swaraj



Hangout Topics For 05/31/16

2016-05-27 Thread Sudheesh Katkam
Hey y’all,

Let’s use this thread to pre-populate a list of topics to discuss on Tuesday’s 
hangout (05/31/16), so people can attend if they are interested in the 
mentioned topics. I will also collect topics at the beginning of the hangout.

+ DRILL-4280: Kerberos Authentication (Sudheesh)

Thank you,
Sudheesh

Re: Impersonation

2016-04-27 Thread Sudheesh Katkam
Hi Lunen,

Is the drillbit process running? If not, look for errors in logs in
*/opt/apache-drill-1.6.0/log*

In embedded mode, Zookeeper is not required. Try: *bin/sqlline -u
"jdbc:drill:zk=local" -n user1 -p user1PW*

This command starts and connects to a drillbit (which is embedded in the
sqlline process). Notice that *impersonation_target* is also not required.

Thank you,
Sudheesh


On Wed, Apr 27, 2016 at 1:51 AM, Lunen de Lange 
wrote:

> Hi All,
>
>
>
> I’m having issues getting impersonation to work.
>
> I'm trying to build in security on our Drill (1.6.0) system. I managed to
> get the security user authentication to work(JPam as explained in the
> documentation), but the impersonation does not seem to work. It seems to
> execute and fetch via the root user regardless of who has logged in via
> ODBC.
>
> My drill-override.conf file is configured as follows:
>
>   drill.exec: {
>
>   cluster-id: "drillbits1",
>
>   zk.connect: "localhost:2181",
>
>   impersonation: {
>
> enabled: true,
>
> max_chained_user_hops: 3
>
>   },
>
>   security.user.auth {
>
>   enabled: true,
>
>  packages += "org.apache.drill.exec.rpc.user.security",
>
>   impl: "pam",
>
>   pam_profiles: [ "sudo", "login" ]
>
>   }
>
> }
>
> We are also only using Drill on one server, therefore I'm running
> drill-embedded to start things up.
> I’m starting up my Zookeeper separately. (I can see the drill instance
> connect to ZK)
>
>
> When I try sqlline, with the following command, I get a “No
> DrillbitEndpoint can be found”
> root@machinename:/opt/apache-drill-1.6.0/bin# ./sqlline -u
> "jdbc:drill:schema=dfs;zk=localhost:2181;impersonation_target=user1" -n
> user1 -p user1PW
>
>
>
> I have also looked at doing my own built-in security, but I'm not able to
> retrieve the username from a SQL query. I have tried the following without
> any luck:
> CURRENT_USER()
> USER()
> SESSION_USER()
> Any ideas on this approach?
>
> Kind regards,
> *Lunen de Lange*
>
>
>
>


Re: high latency of querying hive datasource

2016-04-07 Thread Sudheesh Katkam
Can you gather the query profile from the web UI [1]?

This mailing list does not accept attachments; please put the profile in a 
shareable location (Dropbox?), and post a link.

Thank you,
Sudheesh

[1] https://drill.apache.org/docs/query-profiles/ 


> On Apr 7, 2016, at 2:16 PM, Tomek W  wrote:
> 
> Hello,
> I configured a Hive data source. It is a simple file system; the data are
> saved by a Spark application (/user/hive/warehouse).
> 
> I am able to query the data from the Drill command line. However, for a
> 1200-row table it takes 20s.
> 
> To be honest, I am launching an embedded instance of Drill, but 20s seems
> like a lot, given that this table is very small.
> 
> How can I improve this result?
> First of all, is it possible to keep the table in memory (RAM)?
> 
> Thanks in advance,
> 
> Tom
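
To collect the profile Sudheesh asks for without clicking through the web UI, the web console's JSON endpoints can be scripted. A small sketch follows; the /profiles.json path and the finishedQueries/queryId field names are assumptions drawn from the web UI's REST interface and should be checked against the Drill version in use:

```python
import json
import urllib.request

def fetch_profiles(base_url):
    # the web console serves the same data the /profiles page shows
    with urllib.request.urlopen(base_url + "/profiles.json") as resp:
        return json.load(resp)

def finished_query_ids(profiles):
    # pull out the ids of completed queries; each id can then be fetched
    # individually from the profiles endpoint and shared as a file
    return [q["queryId"] for q in profiles.get("finishedQueries", [])]
```

Saving the per-query JSON to a file gives something that can be dropped into a shareable location, as requested above.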



Re: ConnectTimeoutException when starting drill

2016-03-19 Thread Sudheesh Katkam
Drill 1.6 introduces support for Java 1.8, and is generally available today. 
Download here.

Thank you,
Sudheesh

> On Mar 3, 2016, at 1:40 PM, Rob Terpilowski  wrote:
> 
> I am attempting to get Apache Drill running on an Ubuntu box with Java 
> 1.8.0_51 as root.
> 
> I've downloaded and unzipped the drill tar.gz file.
> 
> I attempted to run the ./bin/drill-embedded command (and the 
> drill-localhost command as well).
> 
> The command sits for a little less than a minute at the line:
> INFO: Initiating Jersey application, version Jersey: 2.8 2014-04-29 
> 01:25:26...
> 
> The message is then followed by the following exceptions. Any idea where I 
> can begin looking for the issue? I changed the port number from 31010 to a 
> number of other different ports with the same result.
> 
> Any help would be appreciated.
> 
> Thanks,
> -Rob
> 
> 
> Error: Failure in connecting to Drill: 
> org.apache.drill.exec.rpc.RpcException: CONNECTION : 
> io.netty.channel.ConnectTimeoutException: connection timed out: 
> lisprod03.lynden.com/172.16.218.141:31010 (state=,code=0)
> java.sql.SQLException: Failure in connecting to Drill: 
> org.apache.drill.exec.rpc.RpcException: CONNECTION : 
> io.netty.channel.ConnectTimeoutException: connection timed out: 
> lisprod03.lynden.com/172.16.218.141:31010
>   at 
> org.apache.drill.jdbc.impl.DrillConnectionImpl.(DrillConnectionImpl.java:159)
>   at 
> org.apache.drill.jdbc.impl.DrillJdbc41Factory.newDrillConnection(DrillJdbc41Factory.java:64)
>   at 
> org.apache.drill.jdbc.impl.DrillFactory.newConnection(DrillFactory.java:69)
>   at 
> net.hydromatic.avatica.UnregisteredDriver.connect(UnregisteredDriver.java:126)
>   at org.apache.drill.jdbc.Driver.connect(Driver.java:72)
>   at sqlline.DatabaseConnection.connect(DatabaseConnection.java:167)
>   at sqlline.DatabaseConnection.getConnection(DatabaseConnection.java:213)
>   at sqlline.Commands.connect(Commands.java:1083)
>   at sqlline.Commands.connect(Commands.java:1015)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at 
> sqlline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:36)
>   at sqlline.SqlLine.dispatch(SqlLine.java:742)
>   at sqlline.SqlLine.initArgs(SqlLine.java:528)
>   at sqlline.SqlLine.begin(SqlLine.java:596)
>   at sqlline.SqlLine.start(SqlLine.java:375)
>   at sqlline.SqlLine.main(SqlLine.java:268)
> Caused by: org.apache.drill.exec.rpc.RpcException: CONNECTION : 
> io.netty.channel.ConnectTimeoutException: connection timed out: 
> lisprod03.lynden.com/172.16.218.141:31010
>   at 
> org.apache.drill.exec.client.DrillClient$FutureHandler.connectionFailed(DrillClient.java:448)
>   at 
> org.apache.drill.exec.rpc.BasicClient$ConnectionMultiListener$ConnectionHandler.operationComplete(BasicClient.java:237)
>   at 
> org.apache.drill.exec.rpc.BasicClient$ConnectionMultiListener$ConnectionHandler.operationComplete(BasicClient.java:200)
>   at 
> io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
>   at 
> io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603)
>   at 
> io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563)
>   at 
> io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:424)
>   at 
> io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe$1.run(AbstractEpollStreamChannel.java:460)
>   at 
> io.netty.util.concurrent.PromiseTask$RunnableAdapter.call(PromiseTask.java:38)
>   at 
> io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:120)
>   at 
> io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
>   at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:254)
>   at 
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.util.concurrent.ExecutionException: 
> io.netty.channel.ConnectTimeoutException: connection timed out: 
> lisprod03.lynden.com/172.16.218.141:31010
>   at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:47)
>   at 
> org.apache.drill.exec.rpc.BasicClient$ConnectionMultiListener$ConnectionHandler.operationComplete(BasicClient.java:213)
>   ... 12 more
> Caused by: io.netty.channel.ConnectTimeoutException: connection timed out: 
> lisprod03.lynden.com/172.16.218.141:31010
>   at 
> io.netty.channel.epoll.AbstractEpollStreamChannel$Epol

Re: [DISCUSS] New Feature: Drill Client Impersonation

2016-03-01 Thread Sudheesh Katkam
Thank you all for the feedback.

+ I am naming this feature User Delegation (since Client Impersonation can be 
confused with User Impersonation).
+ I updated the design document 
<https://docs.google.com/document/d/1g0KgugVdRbbIxxZrSCtO1PEHlvwczTLDb38k-npvwjA>.
+ I opened a pull request (#400 <https://github.com/apache/drill/pull/400>).

- Sudheesh

> On Feb 23, 2016, at 12:04 PM, Neeraja Rentachintala 
>  wrote:
> 
> Norris
> Quick comment on your point below. The username/password passed currently
> on the connection string is for authentication purposes and also used for
> impersonation in case of direct connection from BI tool to Drillbit. That
> continue to exist, but now the driver needs to be extended to pass an
> *'additional'* user name as part of connection and this represents the end
> user identity on behalf of which Drill will execute queries (there is an
> intermediate hop via the BI server which we are trying to support).
> Sudheesh's doc has specifics on the proposal.
> 
> With regards to interfacing the impersonation feature, it looks like all
> you need is the username, which is already being passed down from the
> application to the client via the driver.
> 
> On Tue, Feb 23, 2016 at 11:52 AM, Norris Lee  wrote:
> 
>> ODBC does not have any standard way to change the user for a connection,
>> so like Sudheesh mentioned, I'm not sure how this would be exposed to the
>> application. I believe some other databases like SQLServer let you change
>> the user via SQL.
>> 
>> With regards to interfacing the impersonation feature, it looks like all
>> you need is the username, which is already being passed down from the
>> application to the client via the driver.
>> 
>> Norris
>> 
>> -Original Message-
>> From: Sudheesh Katkam [mailto:skat...@maprtech.com]
>> Sent: Tuesday, February 23, 2016 8:49 AM
>> To: user@drill.apache.org
>> Cc: dev 
>> Subject: Re: [DISCUSS] New Feature: Drill Client Impersonation
>> 
>>> Do you have an interface proposal? I didn't see that.
>> 
>> Are you referring to the Drill client interface to be used by applications?
>> 
>>> Also, what do you think about my comment and Keys response about moving
>> pooling to the Driver and then making "connection" lightweight.
>> 
>> An API to change the user on a connection can be easily added later (for
>> now, we use a connection property). Since Drill connections are already
>> lightweight, this is not an immediate problem. Unlike OracleConnection <
>> https://docs.oracle.com/cd/B28359_01/java.111/b31224/proxya.htm#BABEJEIA>,
>> JDBC/ ODBC do not have a provision for proxy sessions in their
>> specification, so I am not entirely clear how we would expose “change user
>> on connection” to applications using these API.
>> 
>>> Connection level identity setting is only viable if the scalability
>> concerns I raised in the doc and Jacques indirectly raised are addressed.
>>> 
>>> Historically DB connections have been so expensive that most
>> applications created pools of connections and reused them across users.
>> That model doesn't work if each connection is tied to a single user. That's
>> why the typical implementation has provided for changing the identity on an
>> existing connection.
>>> 
>>> Now, if the Drill connection is a very lightweight object (possibly
>> mapping to a single heavier weight hidden process level object), then tying
>> identity to the connection is fine. I don't know enough about the Drill
>> architecture to comment on that but I think a good rule of thumb would be
>> "is it reasonable to keep 50+ Drill connections open where each has a
>> different user identity?" If the answer is no, then the design needs to
>> consider the scale. I'll also add that much further in the future if/when
>> Drill takes on more operational types of access that 50 connections will
>> rise to a much larger number.
>> 
>> 
>> Thank you,
>> Sudheesh
>> 
>>> On Feb 22, 2016, at 2:27 PM, Jacques Nadeau  wrote:
>>> 
>>> Got it, makes sense.
>>> 
>>> Do you have an interface proposal? I didn't see that.
>>> 
>>> Also, what do you think about my comment and Keys response about
>>> moving pooling to the Driver and then making "connection" lightweight.
>>> 
>>> --
>>> Jacques Nadeau
>>> CTO and Co-Founder, Dremio
>>> 
>>> On Mon, Feb 22, 2016 at 9:59 AM, Sudheesh Katkam
>>> 
>>> wrote:
>&

Re: Hangout Happening Now!

2016-02-23 Thread Sudheesh Katkam
Same here, I am unable to connect. I get a “Requesting to join the video 
call..” and then nothing happens.

> On Feb 23, 2016, at 10:12 AM, Parth Chandra  wrote:
> 
> Hey Jason,
> 
>  We're unable to connect to the hangout. Should we try restarting the
> hangout?
> 
> Parth
> 
> On Tue, Feb 23, 2016 at 10:04 AM, Jason Altekruse 
> wrote:
> 
>> https://plus.google.com/hangouts/_/dremio.com/drillhangout?authuser=1
>> 



Re: [DISCUSS] New Feature: Drill Client Impersonation

2016-02-23 Thread Sudheesh Katkam
> Do you have an interface proposal? I didn't see that.

Are you referring to the Drill client interface to be used by applications?

> Also, what do you think about my comment and Keys response about moving 
> pooling to the Driver and then making "connection" lightweight.

An API to change the user on a connection can be easily added later (for now, 
we use a connection property). Since Drill connections are already lightweight, 
this is not an immediate problem. Unlike OracleConnection 
<https://docs.oracle.com/cd/B28359_01/java.111/b31224/proxya.htm#BABEJEIA>, 
JDBC/ ODBC do not have a provision for proxy sessions in their specification, 
so I am not entirely clear how we would expose “change user on connection” to 
applications using these API.

> Connection level identity setting is only viable if the scalability concerns 
> I raised in the doc and Jacques indirectly raised are addressed.
> 
> Historically DB connections have been so expensive that most applications 
> created pools of connections and reused them across users. That model doesn't 
> work if each connection is tied to a single user. That's why the typical 
> implementation has provided for changing the identity on an existing 
> connection.
> 
> Now, if the Drill connection is a very lightweight object (possibly mapping 
> to a single heavier weight hidden process level object), then tying identity 
> to the connection is fine. I don't know enough about the Drill architecture 
> to comment on that but I think a good rule of thumb would be "is it 
> reasonable to keep 50+ Drill connections open where each has a different user 
> identity?" If the answer is no, then the design needs to consider the scale. 
> I'll also add that much further in the future if/when Drill takes on more 
> operational types of access that 50 connections will rise to a much larger 
> number.


Thank you,
Sudheesh

> On Feb 22, 2016, at 2:27 PM, Jacques Nadeau  wrote:
> 
> Got it, makes sense.
> 
> Do you have an interface proposal? I didn't see that.
> 
> Also, what do you think about my comment and Keys response about moving
> pooling to the Driver and then making "connection" lightweight.
> 
> --
> Jacques Nadeau
> CTO and Co-Founder, Dremio
> 
> On Mon, Feb 22, 2016 at 9:59 AM, Sudheesh Katkam 
> wrote:
> 
>> “… when creating this connection, as part of the connection properties
>> (JDBC, C++ Client), the application passes the end user’s identity (e.g.
>> username) …”
>> 
>> I had written the change user as a session option as part of the
>> enhancement only, where you’ve pointed out a better way. I addressed your
>> comments on the doc.
>> 
>> Thank you,
>> Sudheesh
>> 
>>> On Feb 22, 2016, at 9:49 AM, Jacques Nadeau  wrote:
>>> 
>>> Maybe I misunderstood the design document.
>>> 
>>> I thought this was how the user would be changed: "Provide a way to
>> change
>>> the user after the connection is made (details) through a session option"
>>> 
>>> Did I miss something?
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> --
>>> Jacques Nadeau
>>> CTO and Co-Founder, Dremio
>>> 
>>> On Mon, Feb 22, 2016 at 9:06 AM, Neeraja Rentachintala <
>>> nrentachint...@maprtech.com> wrote:
>>> 
>>>> Jacques,
>>>> I think the current proposal by Sudheesh is an API level change to pass
>>>> this additional end user id during the connection establishment.
>>>> Can you elaborate what you mean by random query.
>>>> 
>>>> -Neeraja
>>>> 
>>>> On Sun, Feb 21, 2016 at 5:07 PM, Jacques Nadeau 
>>>> wrote:
>>>> 
>>>>> Sudheesh, thanks for putting this together. Reviewing Oracle
>>>> documentation,
>>>>> they expose this at the API level rather than through a random query. I
>>>>> think we should probably model after that rather than invent a new
>>>>> mechanism. This also means we can avoid things like query parsing,
>>>>> execution roundtrip, query profiles, etc to provide this functionality.
>>>>> 
>>>>> See here:
>>>>> 
>>>>> 
>> https://docs.oracle.com/cd/B28359_01/java.111/b31224/proxya.htm#BABEJEIA
>>>>> 
>>>>> --
>>>>> Jacques Nadeau
>>>>> CTO and Co-Founder, Dremio
>>>>> 
>>>>> On Fri, Feb 19, 2016 at 2:18 PM, Keys Botzum 
>>>> wrote:
>>>>> 
>>>>>>

Re: [DISCUSS] New Feature: Drill Client Impersonation

2016-02-22 Thread Sudheesh Katkam
“… when creating this connection, as part of the connection properties (JDBC, 
C++ Client), the application passes the end user’s identity (e.g. username) …”

I had written the change user as a session option as part of the enhancement 
only, where you’ve pointed out a better way. I addressed your comments on the 
doc.

Thank you,
Sudheesh

> On Feb 22, 2016, at 9:49 AM, Jacques Nadeau  wrote:
> 
> Maybe I misunderstood the design document.
> 
> I thought this was how the user would be changed: "Provide a way to change
> the user after the connection is made (details) through a session option"
> 
> Did I miss something?
> 
> 
> 
> 
> 
> 
> --
> Jacques Nadeau
> CTO and Co-Founder, Dremio
> 
> On Mon, Feb 22, 2016 at 9:06 AM, Neeraja Rentachintala <
> nrentachint...@maprtech.com> wrote:
> 
>> Jacques,
>> I think the current proposal by Sudheesh is an API level change to pass
>> this additional end user id during the connection establishment.
>> Can you elaborate what you mean by random query.
>> 
>> -Neeraja
>> 
>> On Sun, Feb 21, 2016 at 5:07 PM, Jacques Nadeau 
>> wrote:
>> 
>>> Sudheesh, thanks for putting this together. Reviewing Oracle
>> documentation,
>>> they expose this at the API level rather than through a random query. I
>>> think we should probably model after that rather than invent a new
>>> mechanism. This also means we can avoid things like query parsing,
>>> execution roundtrip, query profiles, etc to provide this functionality.
>>> 
>>> See here:
>>> 
>>> https://docs.oracle.com/cd/B28359_01/java.111/b31224/proxya.htm#BABEJEIA
>>> 
>>> --
>>> Jacques Nadeau
>>> CTO and Co-Founder, Dremio
>>> 
>>> On Fri, Feb 19, 2016 at 2:18 PM, Keys Botzum 
>> wrote:
>>> 
>>>> This is a great feature to add to Drill and I'm excited to see design
>> on
>>>> it starting.
>>>> 
>>>> The ability for an intermediate server that is likely already
>>>> authenticating end users, to send end user identity down to Drill adds
>> a
>>>> key element into an end to end secure design by enabling Drill and the
>>> back
>>>> end systems to see the real user and thus perform meaningful
>>> authorization.
>>>> 
>>>> Back when I was building many JEE applications I know the DBAs were
>> very
>>>> frustrated that the application servers blinded them to the identity of
>>> the
>>>> end user accessing important corporate data. When JEE application
>> servers
>>>> and databases finally added the ability to impersonate that addressed a
>>> lot
>>>> of security concerns. Of course this isn't a perfect solution and I'm
>>> sure
>>>> others will recognize that in some scenarios impersonation isn't the
>> best
>>>> approach, but having that as an option in Drill is very valuable.
>>>> 
>>>> Keys
>>>> ___
>>>> Keys Botzum
>>>> Senior Principal Technologist
>>>> kbot...@maprtech.com <mailto:kbot...@maprtech.com>
>>>> 443-718-0098
>>>> MapR Technologies
>>>> http://www.mapr.com <http://www.mapr.com/>
>>>>> On Feb 19, 2016, at 4:49 PM, Sudheesh Katkam 
>>>> wrote:
>>>>> 
>>>>> Hey y’all,
>>>>> 
>>>>> I plan to work on DRILL-4281 <
>>>> https://issues.apache.org/jira/browse/DRILL-4281>: support for
>>>> inbound/client impersonation. Please review the design document <
>>>> 
>>> 
>> https://docs.google.com/document/d/1g0KgugVdRbbIxxZrSCtO1PEHlvwczTLDb38k-npvwjA
>>>> ,
>>>> which is open for comments. There is also a link to proof-of-concept
>>>> (slightly hacky).
>>>>> 
>>>>> Thank you,
>>>>> Sudheesh
>>>> 
>>>> 
>>> 
>> 



[DISCUSS] New Feature: Drill Client Impersonation

2016-02-19 Thread Sudheesh Katkam
Hey y’all,

I plan to work on DRILL-4281 
: support for inbound/client 
impersonation. Please review the design document 
,
 which is open for comments. There is also a link to proof-of-concept (slightly 
hacky).

Thank you,
Sudheesh

Re: Rest & web authentication

2015-12-07 Thread Sudheesh Katkam
Hi John,

MapR Drill 1.3 has the web authentication feature (DRILL-3201), whereas Apache 
Drill 1.3 does not.

Thank you,
Sudheesh

> On Dec 7, 2015, at 11:16 AM, John Omernik  wrote:
> 
> This is really odd to me. I have 1.3 (Compiled by MapR) running on my
> cluster right now.  I added libjpam.so to my java.library.path, and I have
> ldap configured on the nodes in pam.  When I hit the Web UI (via HTTPS,
> HTTP is disabled) I get prompt for user name and the authentication goes
> against the pam on the system, allowing me in.  Only the user running the
> drill bits has "admin" functions in Drill, other users only see a subset of
> tabs (Query, Profiles, Metrics) (the Drillbit user has Query, profiles,
> storage, metrics, and threads).
> 
> This is also true with the RestAPI.  I have to authenticate to use it at
> this point.   I just do forms authentication using the python requests
> module. (I feel like it should accept basic auth over SSL for API
> goodness, but my hack works.)
> 
> Thus, I feel like it's been secured fairly well (SSL, Authentication, Users
> etc.). So I do not understand why DRILL-3201 says 1.5, and why it's
> stated that Web Authentication is not available yet...
> 
> 
> 
> 
> 
> On Mon, Dec 7, 2015 at 11:13 AM, Andries Engelbrecht <
> aengelbre...@maprtech.com> wrote:
> 
>> This topic came up a while ago and there was a bit of confusion.
>> 
>> See the discussion thread here for more info.
>> 
>> http://search-hadoop.com/m/qRVAXir9Do1qHni72&subj=How+to+setup+user+authentication+for+the+WebUI+
>> <
>> http://search-hadoop.com/m/qRVAXir9Do1qHni72&subj=How+to+setup+user+authentication+for+the+WebUI+
>>> 
>> 
>> Sudheesh listed the JIRA to track it.
>> 
>> --Andries
>> 
>> 
>> 
>>> On Dec 7, 2015, at 8:32 AM, Sudheesh Katkam 
>> wrote:
>>> 
>>> Hi Niko,
>>> 
>>> Web authentication is not available yet; DRILL-3201 <
>> https://issues.apache.org/jira/browse/DRILL-3201> is being reviewed. The
>> doc pages are ahead.
>>> 
>>> Thank you,
>>> Sudheesh
>>> 
>>>> On Dec 7, 2015, at 7:06 AM, Keys Botzum  wrote:
>>>> 
>>>> I think this answers your question (the answer is yes according to the
>> docs):
>>>> 
>>>> 
>> https://drill.apache.org/docs/configuring-web-console-and-rest-api-security/
>> <
>> https://drill.apache.org/docs/configuring-web-console-and-rest-api-security/
>>> 
>>>> 
>>>> Keys
>>>> ___
>>>> Keys Botzum
>>>> Senior Principal Technologist
>>>> kbot...@maprtech.com <mailto:kbot...@maprtech.com>
>>>> 443-718-0098
>>>> MapR Technologies
>>>> http://www.mapr.com <http://www.mapr.com/>
>>>>> On Dec 7, 2015, at 10:04 AM, Niko Arvilommi 
>> wrote:
>>>>> 
>>>>> Hi
>>>>> 
>>>>> Is it possible to secure the WEB UI and REST API with a password or
>>>>> similar methods? Shared keys or something?
>>>>> It seems to be too open if it is not possible.
>>>>> 
>>>>> Best Niko Arvilommi
>>>> 
>>> 
>> 
>> 
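
John's workaround above — forms authentication with the Python requests module — can also be sketched with just the standard library. The /j_security_check endpoint and the j_username/j_password field names follow the Java servlet form-login convention that Drill's embedded Jetty server appears to use; treat them as assumptions and verify against your deployment:

```python
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar

def login_payload(username, password):
    # field names follow the Java servlet form-authentication convention
    return {"j_username": username, "j_password": password}

def make_opener(base_url, username, password):
    # cookie-aware opener so the session cookie set at login is sent
    # on subsequent REST calls
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(CookieJar()))
    data = urllib.parse.urlencode(login_payload(username, password)).encode()
    opener.open(base_url + "/j_security_check", data=data)
    return opener
```

The returned opener carries the session cookie from login, so later REST calls made through it run as the authenticated user.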



Re: Rest & web authentication

2015-12-07 Thread Sudheesh Katkam
Hi Niko,

Web authentication is not available yet; DRILL-3201 
 is being reviewed. The doc 
pages are ahead.

Thank you,
Sudheesh

> On Dec 7, 2015, at 7:06 AM, Keys Botzum  wrote:
> 
> I think this answers your question (the answer is yes according to the docs):
> 
>   
> https://drill.apache.org/docs/configuring-web-console-and-rest-api-security/ 
> 
> 
> Keys
> ___
> Keys Botzum 
> Senior Principal Technologist
> kbot...@maprtech.com 
> 443-718-0098
> MapR Technologies 
> http://www.mapr.com 
>> On Dec 7, 2015, at 10:04 AM, Niko Arvilommi  wrote:
>> 
>> Hi
>> 
>> Is it possible to secure the WEB UI and REST API with a password or similar
>> methods? Shared keys or something?
>> It seems to be too open if it is not possible.
>> 
>> Best Niko Arvilommi
> 



Re: SYSTEM ERROR: StackOverflowError on querying a Hive table

2015-12-07 Thread Sudheesh Katkam
1) What version of Drill are you using?
2) Can you enable verbose messaging (ALTER SESSION SET `exec.errors.verbose` = 
true;) and post the resulting error message? Also, please check if there are 
any relevant ERROR messages in the log file.

Thank you,
Sudheesh

> On Dec 6, 2015, at 11:28 PM, Durga Perla  wrote:
> 
> Hi,
>I am trying to query a Hive table called customers that I have created. I
> am able to see the table, and describe works too.
> But, when i run,
> 
>   select * from customers limit 10;
>   or
> Select firstname from customers limit 10;
> 
> I get the error, *Error: SYSTEM ERROR: StackOverflowError.*
> Any ideas on what is causing this?
> And, btw, I am able to query the table properly in hive.
> 
> Thanks & Regards,
> Durga Swaroop



Re: Announcing new committer: Kristine Hahn

2015-12-04 Thread Sudheesh Katkam
Congratulations and welcome, Kris!

> On Dec 4, 2015, at 9:19 AM, Jacques Nadeau  wrote:
> 
> The Apache Drill PMC is very pleased to announce Kristine Hahn as a new
> committer.
> 
> Kris has worked tirelessly on creating and improving the Drill
> documentation. She has been extraordinary in her engagement with the
> community and has greatly accelerated the speed to resolution of doc issues
> and improvements.
> 
> Welcome Kristine!



Re: How to unsubscribe from the mail group ?

2015-11-16 Thread Sudheesh Katkam
You can get digests; see https://mail-archives.apache.org/mod_mbox/drill-user/ 


> On Nov 15, 2015, at 6:24 PM, Kim Chew  wrote:
> 
> https://drill.apache.org/mailinglists/
> 
> On Sun, Nov 15, 2015 at 4:03 PM, ganesh  wrote:
> 
>> Hi,
>> 
>> I would like to unsubscribe from the group. Can someone tell me the
>> procedure?
>> 
>> Also, in case I want daily mail instead of individual mail from the group, is
>> there any such option?
>> 
>> --
>> *Name: Ganesh Semalty*
>> *Location: Gurgaon,Haryana(India)*
>> *Email Id: g4ganeshsema...@gmail.com *
>> 
>> 
>> Please consider the environment before printing this e-mail - SAVE TREE.
>> 



Re: NullPointers in type conversions

2015-09-22 Thread Sudheesh Katkam
See below:

> On Sep 21, 2015, at 8:22 AM, USC  wrote:
> 
> Hi,
> This is a system wide setting. Meaning, you need to say 
> 
> Alter system set `drill.exec.functions.cast_empty_string_to_null` = true;

To clarify, all options available through sys.options (except some used for 
testing) can be set at system and session level. So what Chris did (setting at 
session level) works.

Thank you,
Sudheesh

> 
> 
> Sent from my iPhone
> 
>> On Sep 21, 2015, at 7:18 AM, Christopher Matta  wrote:
>> 
>> I’m not sure if it worked, the result looks the same when casting as a
>> string (empty field, not a NULL value):
>> 
>> 0: jdbc:drill:> ALTER SESSION SET
>> `drill.exec.functions.cast_empty_string_to_null` = true;
>> +---+--+
>> |  ok   | summary  |
>> +---+--+
>> | true  | drill.exec.functions.cast_empty_string_to_null updated.  |
>> +---+--+
>> 1 row selected (1.606 seconds)
>> 0: jdbc:drill:> select cast(x.`row_key` as varchar(128)) as `row_key`,
>> CAST(x.`a`.`c1` as INTEGER) from maprfs.cmatta.`cmatta_test` x;
>> Error: SYSTEM ERROR: NumberFormatException:
>> 
>> Fragment 0:0
>> 
>> [Error Id: 33e94b4d-6450-40bf-9f2c-bbbfab9f5990 on
>> se-node10.se.lab:31010] (state=,code=0)
>> 0: jdbc:drill:> select cast(x.`row_key` as varchar(128)) as `row_key`,
>> CASE WHERE x.`a`.`c1` is not null CAST(x.`a`.`c1` as INTEGER) fr
>> Command canceled.`cmatta_test` x;
>> 0: jdbc:drill:> select 'hello' from sys.version;
>> +-+
>> | EXPR$0  |
>> +-+
>> | hello   |
>> +-+
>> 1 row selected (0.417 seconds)
>> 0: jdbc:drill:> select cast(NULL as INTEGER) from sys.version;
>> +-+
>> | EXPR$0  |
>> +-+
>> | null|
>> +-+
>> 1 row selected (0.4 seconds)
>> 0: jdbc:drill:> select cast(x.`row_key` as varchar(128)) as `row_key`,
>> CAST(CAST(x.`a`.`c1` as varchar(64)) as INTEGER) from maprfs.cma
>> tta.`cmatta_test` x;
>> Error: SYSTEM ERROR: NumberFormatException:
>> 
>> Fragment 0:0
>> 
>> [Error Id: 71593a43-54ac-4e1d-b3d8-21a2d4d4acd6 on
>> se-node10.se.lab:31010] (state=,code=0)
>> 0: jdbc:drill:> select cast(x.`row_key` as varchar(128)) as `row_key`,
>> CAST(x.`a`.`c1` as varchar(64)) from maprfs.cmatta.`cmatta_test`
>> x;
>> +--+-+
>> | row_key  | EXPR$1  |
>> +--+-+
>> | row1 | 1   |
>> | row2 | |
>> | row3 | 5   |
>> | row4 | 7   |
>> +--+-+
>> 4 rows selected (0.54 seconds)
>> 0: jdbc:drill:>
>> 
>> Is this how it’s expected to work?
>> 
>> 
>> Chris Matta
>> cma...@mapr.com
>> 215-701-3146
>> 
>>> On Fri, Sep 18, 2015 at 9:56 PM, Jacques Nadeau  wrote:
>>> 
>>> Does this system option not work:
>>> 
>>> ALTER SESSION SET `drill.exec.functions.cast_empty_string_to_null` = true;
>>> 
>>> The reason the bug was marked INVALID is that SQL engines (not sure about
>>> the spec) don't allow casting from empty string to number. The system
>>> option above is supposed to allow changing this behavior from the SQL
>>> standard for your type of situation. That being said, I see the docs say
>>> "not supported in this release". Not sure why that is there. Can you give
>>> it a try?
>>> 
>>> That being said, it seems like the original issue was a NPE not a NFE. That
>>> definitely seems like something else.
>>> 
>>> 
>>> --
>>> Jacques Nadeau
>>> CTO and Co-Founder, Dremio
>>> 
>>> On Thu, Sep 17, 2015 at 10:53 AM, Christopher Matta 
>>> wrote:
>>> 
 Here is my attempt at building a reproduction, btw, it seems like this is
 the same issue as DRILL-862
  where Jacques
>>> determined
 the error to be invalid. Is trying to cast an empty string, or null value
 to an integer invalid? What's the workaround?
 
 Data
 
 row1,1,2
 row2,,4
 row3,5,6
 row4,7,8
 
 Create Table
 
 $ maprcli table create -path /user/cmatta/projects/cmatta_test
 $ maprcli table cf create -path /user/cmatta/projects/cmatta_test
>>> -cfname a
 
 Load into Hbase table:
 
 hbase org.apache.hadoop.hbase.mapreduce.ImportTsv
 -Dimporttsv.separator=',' -Dimporttsv.columns=HBASE_ROW_KEY,a:c1,a:c2
 /user/cmatta/projects/cmatta_test
 maprfs:///user/cmatta/projects/testdata_hbase_null
 
 Query (error):
 
 0: jdbc:drill:> select cast(x.`row_key` as varchar(128)) as `row_key`,
 CAST(x.`a`.`c1` as INTEGER) from maprfs.cmatta.`cmatta_test` x;
 Error: SYSTEM ERROR: NumberFormatException:
 
 Fragment 0:0
 
 [Error Id: cbcb3327-3699-4191-9c26-9b95c9922690 on
 se-node11.se.lab:31010] (state=,code=0)
 
 Query that works on the column (c2) that doesn’t have a NULL value:
 
 0: jdbc:drill:> select cast(x.`row_key` as varchar(128)) as `row
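The behavior the thread is circling around, an empty string failing a cast to INTEGER unless the option rewrites it to NULL, can be mimicked outside Drill. A rough Python sketch of the cast semantics under discussion (illustrative only, not Drill's implementation):

```python
def cast_to_int(value, empty_string_to_null=False):
    """Mimic CAST(col AS INTEGER) as discussed above: with the option
    off, an empty string fails (Drill raises NumberFormatException);
    with drill.exec.functions.cast_empty_string_to_null on, it becomes
    NULL (None). Illustrative sketch only."""
    if value is None or (value == "" and empty_string_to_null):
        return None
    if value == "":
        raise ValueError("cannot cast empty string to INTEGER")
    return int(value)

# Column a:c1 from the sample data above: row2 has an empty value.
print([cast_to_int(v, empty_string_to_null=True) for v in ["1", "", "5", "7"]])
# → [1, None, 5, 7]
```

With the option off, the second value raises instead of yielding None, which matches the NumberFormatException seen in the sqlline transcripts.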

Re: Drill view schema returned in results set

2015-09-22 Thread Sudheesh Katkam
HI Michael,

The error message was fixed as part of DRILL-3583 
. The problem is that the 
query uses the SUM function on a column with string or boolean type.

Thank you,
Sudheesh
 
> On Sep 22, 2015, at 3:54 AM,  
>  wrote:
> 
> I actually solved this with advice from Leon Clayton at MapR by creating a 
> new workspace for views alone and keeping it separate from the data. This 
> obviously makes sense, but it would be nice if Drill could be aware of any 
> created views (to be able to ignore them) in case someone accidentally creates 
> the view in the same area as their data, causing erroneous results.
> 
> 
> 
> -Original Message-
> From: England, Michael (IT/UK) 
> Sent: 22 September 2015 10:14
> To: user@drill.apache.org
> Subject: RE: Drill view schema returned in results set
> 
> I have noticed that Drill keeps the view definition in the same directory as 
> the data. Should this not be ignored by default instead of having to specify 
> what files to query?
> 
> 
> 
> -Original Message-
> From: England, Michael (IT/UK) 
> Sent: 22 September 2015 09:53
> To: user@drill.apache.org
> Subject: Drill view schema returned in results set
> 
> Hi,
> 
> I was trying to run a SUM() function on a view but was getting the following 
> error:
> 
> ERROR [HY000] [MapR][Drill] (1040) Drill failed to execute the query: select 
> sum(x) from `mike`.`view`.`test_view ` [30027]Query execution error. 
> Details:[ SYSTEM ERROR: CompileException: Line 33, Column 177: Unknown 
> variable or type "logger"
> 
> Fragment 0:0
> 
> [Error Id: 68c7c5b8-8875-4834-8977-c514a08ca8b0 on x.mydomain.com:31010] ]
> 
> I ran a DISTINCT on the column to check that all of the data types in the column 
> were integers, but saw that one row contained no data. I originally thought 
> that my data was causing the issue, but on further analysis I saw the view 
> schema being returned in the first column, which seems to be the real cause:
> 
> select * from `mike`.`view`.`myview` where isnumeric(x) = 0
> 
> {
>  "name" : "test_view ",
>  "sql" : "SELECT `columns`[0] AS `x`, `columns`[1] AS `x1`, `columns`[2] AS 
> `x2`, `columns`[3] AS `x3`, `columns`[4] AS `x4`, `columns`[5] AS `x5`, 
> `columns`[6] AS `severity`, `columns`[7] AS `x6`\nFROM `mike`.`view`.`.`",
>  "fields" : [ {
>"name" : "x",
>"type" : "ANY",
>"isNullable" : true
>  }, {
>"name" : "x1",
>"type" : "ANY",
>"isNullable" : true
>  }, {
>"name" : "x2",
>"type" : "ANY",
>"isNullable" : true
>  }, {
>"name" : "x3",
>"type" : "ANY",
>"isNullable" : true
>  }, {
>"name" : "x4",
>"type" : "ANY",
>"isNullable" : true
>  }, {
>"name" : "x5",
>"type" : "ANY",
>"isNullable" : true
>  }, {
>"name" : "x6",
>"type" : "ANY",
>"isNullable" : true
>  }, {
>"name" : "x7",
>"type" : "ANY",
>"isNullable" : true
>  } ],
>  "workspaceSchemaPath" : [ ]
> }
> 
> Is this expected behaviour or does it look like a bug. Has anyone come across 
> this before?
> 
> Thanks,
> Michael
> 
> 
> This e-mail (including any attachments) is private and confidential, may 
> contain proprietary or privileged information and is intended for the named 
> recipient(s) only. Unintended recipients are strictly prohibited from taking 
> action on the basis of information in this e-mail and must contact the sender 
> immediately, delete this e-mail (and all attachments) and destroy any hard 
> copies. Nomura will not accept responsibility or liability for the accuracy 
> or completeness of, or the presence of any virus or disabling code in, this 
> e-mail. If verification is sought please request a hard copy. Any reference 
> to the terms of executed transactions should be treated as preliminary only 
> and subject to formal written confirmation by Nomura. Nomura reserves the 
> right to retain, monitor and intercept e-mail communications through its 
> networks (subject to and in accordance with applicable laws). No 
> confidentiality or privilege is waived or lost by Nomura by any 
> mistransmission of this e-mail. Any reference to "Nomura" is a reference to 
> any entity in the Nomura Holdings, Inc. group. Please read our Electronic 
> Communications Legal Notice which forms part of this e-mail: 
> http://www.Nomura.com/email_disclaimer.htm
> 
> 
> 

Re: Help with optimizing a query

2015-08-10 Thread Sudheesh Katkam
Hi Yousef,

If possible, could you put the profile in a publicly accessible location 
(Dropbox, etc) and post the link here?

Thank you,
Sudheesh

> On Aug 10, 2015, at 3:43 PM, Yousef Lasi  wrote:
> 
> We're running a 4 file join on a set of parquet files, the largest of which 
> is about 20 GB in size. The query plan seems to indicate that most, if not 
> all the time for the query (>30 minutes) is spent on the first two major 
> fragments. The physical plan looks like the output for these 2 fragments 
> below. I am not sure how to best interpret the results to optimize the query. 
> It's pretty clear based on the plan output as well as the actual system 
> resources utilized during execution that we are not CPU, Memory or I/O bound. 
>  That doesn't leave a whole lot left to chase down. Any suggestions on where 
> to look?
>   
> 
>00-00Screen : rowType = RecordType(ANY ACTION, ANY ID_TRAN_COLL, 
> ANY ID_ACC, ANY ID_IMNT_STD, ANY CD_TYP_PSN, ANY DT_ACQS, ANY CD_TX_ACQS, ANY 
> DMT_FED_COST_ACQ_PSN, ANY QY_SETT, ANY DMT_FED_COST_SETT_LCL, ANY 
> DMT_UNIT_SETT_PSN, ANY QY_TOT, ANY DMT_FED_COST_TOT_CLNT, ANY 
> DMT_FED_COST_TOT_LCL, ANY DMT_UNIT_TOT_PSN, ANY TRD_FEDERAL_TAX_COST_LOCL, 
> ANY SET_FEDERAL_TAX_COST_LOCL, ANY DMT_TRD_GIFT_DATE, ANY 
> DMT_TRD_FMKT_VAL_LOCL, ANY DMT_TRD_FMKT_VAL_BASE, ANY DMT_SET_GIFT_DATE, ANY 
> DMT_SET_FMKT_VAL_LOCL, ANY DMT_SET_FMKT_VAL_BASE, ANY 
> DMT_TRD_LOT_LEVEL_COV_IND, ANY DMT_SET_LOT_LEVEL_COV_IND, ANY 
> DMT_TRD_TRANSFER_CODE, ANY DMT_SET_TRANSFER_CODE, ANY 
> DMT_TRD_WASH_SALE_ACQ_DATE, ANY DMT_SET_WASH_SALE_ACQ_DATE, ANY 
> DMT_TRD_LOT_DISALLOWED_DAYS, ANY DMT_SET_LOT_DISALLOWED_DAYS, ANY 
> DMT_TRD_LOT_FMKT_USAGE_IND, ANY DMT_SET_LOT_FMKT_USAGE_IND, ANY 
> DMT_SET_UGL_ST, ANY DMT_SET_UGL_ST_LCL, ANY DMT_SET_UGL_LT, ANY 
> DMT_SET_UGL_LT_LCL, ANY DMT_TRD_UGL_ST, ANY DMT_TRD_UGL_ST_LCL, ANY 
> DMT_TRD_UGL_LT, ANY DMT_TRD_UGL_LT_LCL, ANY ID_ACC_ALT, ANY CD_MTH_CSTNG, ANY 
> CD_CCY_TRD_PRIM, ANY RT_SPOT, ANY IMNT_CLSS_PRICE_FCTR_RT, ANY ID_IMNT_CL, 
> ANY DMT_TOT_MV_PSN, ANY TRD_INT_LOCL, ANY DMT_INCM_EXPCTD, ANY 
> DMT_SETT_MV_PSN, ANY CD_CL_OMNI): rowcount = 4.061288430004E7, cumulative 
> cost = {7.33055744629E8 rows, 1.5711522516974182E10 cpu, 0.0 io, 
> 2.5162863282995203E13 network, 1.723515700962E10 memory}, id = 117134 
> 00-01  Project(ACTION=[$0], ID_TRAN_COLL=[$1], ID_ACC=[$2], 
> ID_IMNT_STD=[$3], CD_TYP_PSN=[$4], DT_ACQS=[$5], CD_TX_ACQS=[$6], 
> DMT_FED_COST_ACQ_PSN=[$7], QY_SETT=[$8], DMT_FED_COST_SETT_LCL=[$9], 
> DMT_UNIT_SETT_PSN=[$10], QY_TOT=[$11], DMT_FED_COST_TOT_CLNT=[$12], 
> DMT_FED_COST_TOT_LCL=[$13], DMT_UNIT_TOT_PSN=[$14], 
> TRD_FEDERAL_TAX_COST_LOCL=[$15], SET_FEDERAL_TAX_COST_LOCL=[$16], 
> DMT_TRD_GIFT_DATE=[$17], DMT_TRD_FMKT_VAL_LOCL=[$18], 
> DMT_TRD_FMKT_VAL_BASE=[$19], DMT_SET_GIFT_DATE=[$20], 
> DMT_SET_FMKT_VAL_LOCL=[$21], DMT_SET_FMKT_VAL_BASE=[$22], 
> DMT_TRD_LOT_LEVEL_COV_IND=[$23], DMT_SET_LOT_LEVEL_COV_IND=[$24], 
> DMT_TRD_TRANSFER_CODE=[$25], DMT_SET_TRANSFER_CODE=[$26], 
> DMT_TRD_WASH_SALE_ACQ_DATE=[$27], DMT_SET_WASH_SALE_ACQ_DATE=[$28], 
> DMT_TRD_LOT_DISALLOWED_DAYS=[$29], DMT_SET_LOT_DISALLOWED_DAYS=[$30], 
> DMT_TRD_LOT_FMKT_USAGE_IND=[$31], DMT_SET_LOT_FMKT_USAGE_IND=[$32], 
> DMT_SET_UGL_ST=[$33], DMT_SET_UGL_ST_LCL=[$34], DMT_SET_UGL_LT=[$35], 
> DMT_SET_UGL_LT_LCL=[$36], DMT_TRD_UGL_ST=[$37], DMT_TRD_UGL_ST_LCL=[$38], 
> DMT_TRD_UGL_LT=[$39], DMT_TRD_UGL_LT_LCL=[$40], ID_ACC_ALT=[$41], 
> CD_MTH_CSTNG=[$42], CD_CCY_TRD_PRIM=[$43], RT_SPOT=[$44], 
> IMNT_CLSS_PRICE_FCTR_RT=[$45], ID_IMNT_CL=[$46], DMT_TOT_MV_PSN=[$47], 
> TRD_INT_LOCL=[$48], DMT_INCM_EXPCTD=[$49], DMT_SETT_MV_PSN=[$50], 
> CD_CL_OMNI=[$51]) : rowType = RecordType(ANY ACTION, ANY ID_TRAN_COLL, ANY 
> ID_ACC, ANY ID_IMNT_STD, ANY CD_TYP_PSN, ANY DT_ACQS, ANY CD_TX_ACQS, ANY 
> DMT_FED_COST_ACQ_PSN, ANY QY_SETT, ANY DMT_FED_COST_SETT_LCL, ANY 
> DMT_UNIT_SETT_PSN, ANY QY_TOT, ANY DMT_FED_COST_TOT_CLNT, ANY 
> DMT_FED_COST_TOT_LCL, ANY DMT_UNIT_TOT_PSN, ANY TRD_FEDERAL_TAX_COST_LOCL, 
> ANY SET_FEDERAL_TAX_COST_LOCL, ANY DMT_TRD_GIFT_DATE, ANY 
> DMT_TRD_FMKT_VAL_LOCL, ANY DMT_TRD_FMKT_VAL_BASE, ANY DMT_SET_GIFT_DATE, ANY 
> DMT_SET_FMKT_VAL_LOCL, ANY DMT_SET_FMKT_VAL_BASE, ANY 
> DMT_TRD_LOT_LEVEL_COV_IND, ANY DMT_SET_LOT_LEVEL_COV_IND, ANY 
> DMT_TRD_TRANSFER_CODE, ANY DMT_SET_TRANSFER_CODE, ANY 
> DMT_TRD_WASH_SALE_ACQ_DATE, ANY DMT_SET_WASH_SALE_ACQ_DATE, ANY 
> DMT_TRD_LOT_DISALLOWED_DAYS, ANY DMT_SET_LOT_DISALLOWED_DAYS, ANY 
> DMT_TRD_LOT_FMKT_USAGE_IND, ANY DMT_SET_LOT_FMKT_USAGE_IND, ANY 
> DMT_SET_UGL_ST, ANY DMT_SET_UGL_ST_LCL, ANY DMT_SET_UGL_LT, ANY 
> DMT_SET_UGL_LT_LCL, ANY DMT_TRD_UGL_ST, ANY DMT_TRD_UGL_ST_LCL, ANY 
> DMT_TRD_UGL_LT, ANY DMT_TRD_UGL_LT_LCL, ANY ID_ACC_ALT, ANY CD_MTH_CSTNG, ANY 
> CD_CCY_TRD_PRIM, ANY RT_SPOT, ANY IMNT_CLSS_PRICE_FCTR_RT, ANY ID_IMNT_CL, 
> ANY DMT_TOT_MV_PSN, ANY TRD_INT_LOCL, ANY DMT_INCM_EXPCTD, ANY 
> DMT_SETT_MV_PSN, ANY CD_CL_OMNI): rowcount = 4.061288430004E

Re: Resetting an option

2015-08-10 Thread Sudheesh Katkam
Correction: currently any user can SET or RESET an option for session and 
system.

> On Aug 10, 2015, at 2:20 PM, Sudheesh Katkam  wrote:
> 
> Hey y’all,
> 
> Re DRILL-1065 <https://issues.apache.org/jira/browse/DRILL-1065>, at system 
> level (ALTER system RESET …), resetting an option would mean changing the 
> value to the default provided by Drill. But, at session level (ALTER session 
> RESET …), would resetting an option mean:
> (a) changing the value to the default provided by Drill? or,
> (b) changing the value to the system value, that an admin could’ve changed?
> 
> (b) would not allow non-admin users to know what the default is (easily). 
> However, for a given option, (a) would allow a non-admin user to know what 
> the default is (by resetting) and what the system setting is (from 
> sys.options). Opinions?
> 
> Thank you,
> Sudheesh



Resetting an option

2015-08-10 Thread Sudheesh Katkam
Hey y’all,

Re DRILL-1065 , at system 
level (ALTER system RESET …), resetting an option would mean changing the value 
to the default provided by Drill. But, at session level (ALTER session RESET 
…), would resetting an option mean:
(a) changing the value to the default provided by Drill? or,
(b) changing the value to the system value, that an admin could’ve changed?

(b) would not allow non-admin users to know what the default is (easily). 
However, for a given option, (a) would allow a non-admin user to know what the 
default is (by resetting) and what the system setting is (from sys.options). 
Opinions?

Thank you,
Sudheesh
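The two proposed RESET semantics can be made concrete with a toy scope-resolution model. This is a hypothetical sketch to illustrate the trade-off, not Drill code; the option name and values are made up:

```python
def effective(name, defaults, system, session):
    """Session overrides system, which overrides the Drill default."""
    if name in session:
        return session[name]
    if name in system:
        return system[name]
    return defaults[name]

defaults = {"planner.slice_target": 100000}  # Drill-provided default
system = {"planner.slice_target": 50000}     # value an admin changed
session = {"planner.slice_target": 10000}    # user's session override

# Semantics (b): RESET simply drops the session override, so the
# (possibly admin-changed) system value shows through.
session_b = dict(session)
session_b.pop("planner.slice_target")
print(effective("planner.slice_target", defaults, system, session_b))  # → 50000

# Semantics (a): RESET pins the Drill default at session scope, which
# also lets a non-admin discover what the default is.
session_a = {"planner.slice_target": defaults["planner.slice_target"]}
print(effective("planner.slice_target", defaults, system, session_a))  # → 100000
```

Under (b) a non-admin never sees the default directly; under (a) the session value and sys.options together expose both the default and the system setting.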

Re: pending queries jamming the system

2015-08-03 Thread Sudheesh Katkam
Hi Stefan,

Can you create a JIRA for this? Please attach to the JIRA:
(1) thread dumps of the three Drillbits (you can get this using jstack), and 
(2) json query profiles of the PENDING queries (you can get this from the "Full 
JSON Profile" at the bottom of the profile page).

Thank you,
Sudheesh

> On Aug 3, 2015, at 9:00 AM, Stefán Baxter  wrote:
> 
> Hi,
> 
> I have a small cluster of 3 drillbits running. It's been working just fine
> until it stopped working altogether. I notice a few "pending" queries and
> when I try to cancel them, via the admin, they either report that they
> don't know where they are running or the cancelling process freezes.
> 
> What is (a) the easiest way to delete all pending tasks, and (b) how do I make
> sure that this does not happen?
> 
> All the best,
> -Stefan



Hangout happening now!

2015-07-28 Thread Sudheesh Katkam
Come join the Drill community as we discuss what has been happening lately and 
what is in the pipeline. All are welcome, if you know about Drill, want to know 
more or just want to listen in.

Link: https://plus.google.com/hangouts/_/event/ci4rdiju8bv04a64efj5fedd0lc 


Agenda for now:
1) Parth: https://issues.apache.org/jira/browse/DRILL-3228 
 and related issues (handle 
schema change)

Thanks!

Re: Operation category READ is not supported in state standby at org.apache.hadoop.hdfs.server.namenode.ha.Standby

2015-07-16 Thread Sudheesh Katkam
Can you try just “thrift://nn2:9083” (and not include the 
failover namenode) for the “hive.metastore.uris” property?

Thank you,
Sudheesh

> On Jul 16, 2015, at 1:43 AM, Arthur Chan  wrote:
> 
> Does anyone have an idea of what could be wrong in my Drill setup?
> 
> On Tue, Jul 14, 2015 at 4:21 PM, Arthur Chan 
> wrote:
> 
>> Hi,
>> 
>> I have HDFS HA with two namenodes (nn1 and nn2 respectively)
>> 
>> 
>> When namenode nn1 fails over to nn2 and I query Hive, I get the
>> following error:
>> 
>> Query Failed: An Error Occurred
>> 
>> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR:
>> org.apache.hadoop.ipc.RemoteException: Operation category READ is not
>> supported in state standby at
>> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:87)
>> at
>> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1719)
>> at
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1350)
>> at
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:4132)
>> at
>> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:838)
>> at
>> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:821)
>> at
>> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>> at
>> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
>> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962) at
>> org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039) at
>> org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035) at
>> java.security.AccessController.doPrivileged(Native Method) at
>> javax.security.auth.Subject.doAs(Subject.java:422) at
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2035
>> 
>> 
>> {
>> 
>>  "type": "hive",
>> 
>>  "enabled": true,
>> 
>>  "configProps": {
>> 
>>"hive.metastore.uris": "thrift://nn1:9083,thrift://nn2:9083",
>> 
>>"hive.metastore.sasl.enabled": "false"
>> 
>>  }
>> 
>> }
>> 
>> 
>> Any idea to resolve the issue? Please help!
>> 



Re: empty results

2015-07-16 Thread Sudheesh Katkam
Is it returning empty results only through the REST API? Did you try sqlline?

Do you have a simple repro? If so, can you file a ticket?

Thank you,
Sudheesh

> On Jul 16, 2015, at 10:25 AM, Stefán Baxter  wrote:
> 
> Hi,
> 
> What can be happening if a drillbit (local) starts returning empty results
> (from a Parquet query) and does not return proper results unless it's
> restarted?
> 
> (I noticed this started happening when I began using the REST API but I
> have no direct link to that)
> 
> Regards,
> -Stefán



Re: Rest API

2015-07-16 Thread Sudheesh Katkam
See inline.

> On Jul 16, 2015, at 4:03 AM, Preetham Nadig  
> wrote:
> 
> Hello,
> I am planning to use Drill for one of my projects, and I have been working 
> with it for a couple of weeks.

That’s awesome!

> One of the things I would like to do is access Drill over the REST API; I have 
> successfully queried it from a web client.
> But is it possible to send more than one query in a single REST call, or is 
> there some user-defined-function kind of functionality?

It’s one query per REST call. Here’s the REST API reference: 
https://docs.google.com/document/d/1mRsuWk4Dpt6ts-jQ6ke3bB30PIwanRiCPfGxRwZEQME/edit
 


You should be able to reference a UDF from your query over the REST API. Reference 
for writing custom UDFs: 
http://drill.apache.org/docs/develop-custom-functions-introduction/ 

> If not, what alternative options exist to use Drill programmatically so that it 
> acts as the interface between a data source and an end-user application?

You can use the Drill JDBC and ODBC drivers. Here’s a simple example using 
ODBC: https://www.mapr.com/blog/using-drill-programmatically-python-r-and-perl 


> 
> Thanks,
> Regards,
> Preetham

Thank you,
Sudheesh
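Since it is one query per REST call, batching has to happen client-side, one request per statement. A minimal Python sketch of building such calls; the localhost URL and port are assumptions about a default single-node setup, and the payload shape follows the REST API doc linked above:

```python
import json
from urllib.request import Request

DRILL_URL = "http://localhost:8047/query.json"  # assumed default web port

def build_query_request(sql):
    """Build one POST per SQL statement for Drill's REST endpoint."""
    payload = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    return Request(
        DRILL_URL,
        data=payload,
        # A JSON Content-Type is required; omitting it is a common cause
        # of HTTP 415 Unsupported Media Type responses from the server.
        headers={"Content-Type": "application/json"},
    )

# "More than one query" means issuing several requests in sequence.
reqs = [build_query_request(q)
        for q in ("SELECT * FROM sys.version", "SHOW DATABASES")]
```

Each request would then be sent with `urllib.request.urlopen` (or any HTTP client) and the JSON response decoded per call.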

Re: using the REST API

2015-07-16 Thread Sudheesh Katkam
See inline.

> On Jul 16, 2015, at 4:36 AM, Stefán Baxter  wrote:
> 
> Hi,
> 
> I have a few questions regarding the rest API.
> 
>   - Is it intended that the REST API (query.json) returns numeric
>   values as strings?

There is a ticket for this: https://issues.apache.org/jira/browse/DRILL-2373 


>   - count(*)  being an example of that
>   - calls for conversion on the browser side
>   - I find no obvious setting for this
> 
>   - Is there any other serialization/encoding available apart from JSON?
>   (like Protobuff)

Not currently.

>   - naming the REST endpoint after a serialization may be a hint here
> 
>   - Is it possible to use gzipped response to minimize the total delivery
>   time?

That could be an enhancement.

> 
> I'm using this interface instead of the JDBC driver and would like to do
> everything needed to speed it up.

AFAIK, there is not much you can do to speed the REST API up.
Just in case, here’s a link on how to use Drill using ODBC: 
https://www.mapr.com/blog/using-drill-programmatically-python-r-and-perl 


> Regards,
> -Stefan


Thank you,
Sudheesh
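Until DRILL-2373 is addressed, the string-typed numerics in the REST response can be coerced on the client instead of in the browser. A best-effort sketch:

```python
import json

def coerce_numerics(rows):
    """Convert string values that parse cleanly as numbers back to
    int/float; everything else is left untouched. A workaround sketch
    for the REST API returning numeric columns as strings."""
    def coerce(v):
        if isinstance(v, str):
            for cast in (int, float):
                try:
                    return cast(v)
                except ValueError:
                    pass
        return v
    return [{k: coerce(v) for k, v in row.items()} for row in rows]

rows = json.loads('[{"cnt": "42", "name": "foo", "ratio": "0.5"}]')
print(coerce_numerics(rows))  # → [{'cnt': 42, 'name': 'foo', 'ratio': 0.5}]
```

This is lossy for columns that happen to contain numeric-looking strings, so a real client would restrict the conversion to known columns.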

Re: Flatten Output Json

2015-07-16 Thread Sudheesh Katkam
Does this help?
http://drill.apache.org/docs/flatten/ 

Thank you,
Sudheesh

> On Jul 15, 2015, at 10:55 PM, Usman Ali  wrote:
> 
> Hi,
>  Drill sqlline displays output in a nice format. I am guessing it must
> be flattening the output JSON before printing it. Is there any function
> available in the Drill source code to flatten the response JSON?
> 
> Regards,
> Usman Ali
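For the client-side version of what Usman describes, nested JSON can be flattened into dotted column names before rendering. This is an illustrative sketch, unrelated to Drill's SQL FLATTEN operator linked above:

```python
def flatten_json(obj, prefix=""):
    """Recursively flatten nested dicts/lists into dotted keys,
    roughly what a tabular renderer does before printing columns."""
    flat = {}
    if isinstance(obj, dict):
        for k, v in obj.items():
            flat.update(flatten_json(v, f"{prefix}{k}."))
    elif isinstance(obj, list):
        for i, v in enumerate(obj):
            flat.update(flatten_json(v, f"{prefix}{i}."))
    else:
        flat[prefix[:-1]] = obj  # strip the trailing dot
    return flat

print(flatten_json({"a": {"b": 1}, "c": [2, 3]}))
# → {'a.b': 1, 'c.0': 2, 'c.1': 3}
```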



Re: Set Drill Response Format to CSV Through Rest APIs

2015-07-16 Thread Sudheesh Katkam
Currently we support only JSON through the REST API.

Thank you,
Sudheesh

> On Jul 15, 2015, at 9:26 PM, Usman Ali  wrote:
> 
> Hi,
> Is there any way to set the response format of Drill to CSV instead of
> JSON using the REST API? If yes, what other response formats are
> available in Drill?
> 
> Regards,
> Usman Ali
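Since the REST API only emits JSON, CSV has to be produced client-side after the fact. A small sketch of that conversion:

```python
import csv
import io
import json

def json_rows_to_csv(rows):
    """Convert a list of row dicts (as decoded from a REST response)
    to CSV text, using the first row's keys as the header."""
    if not rows:
        return ""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

rows = json.loads('[{"name": "drill", "version": "1.1"}]')
print(json_rows_to_csv(rows))
```

This assumes every row shares the first row's columns; ragged schemas would need the union of all keys instead.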



Re: Issues on querying hbase table on REST

2015-06-18 Thread Sudheesh Katkam
Hi,

Are you sending the correct Content-Type in the request header? Here’s a link 
that might help you: 
http://docs.brightcove.com/en/video-cloud/player-management/guides/postman.html 

 and the Drill REST API doc: 
https://docs.google.com/document/d/1mRsuWk4Dpt6ts-jQ6ke3bB30PIwanRiCPfGxRwZEQME/edit
 


Thank you,
Sudheesh

> On Jun 18, 2015, at 4:17 AM, Nayan Paul  wrote:
> 
> Hi, I have integrated Drill on a Cloudera cluster and have an HBase table that
> I have to access through REST from PHP.
> 
> I can query my HBase table from the Apache Drill query editor, but when I use
> Postman to test the JSON response from the request, I get a 415 error.
> 
> response from Postman
> 
> Error 415 Unsupported Media Type
> 
> HTTP ERROR 415
> Problem accessing /query.json. Reason:
> Unsupported Media Type
> Powered by Jetty://
> 
> 
> 
> 
> Regards,
> 
> ___
> 
> 
> Nayan Paul | Phone No: +91-9831814333 | email: nayan.j.p...@gmail.com



Re: JAVA API for Drill

2015-06-01 Thread Sudheesh Katkam
Adding to Hanifi’s comment: look at the QueryWrapper#run method and 
QueryWrapper$Listener:
https://github.com/hnfgns/incubator-drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/QueryWrapper.java
 


- Sudheesh

> On Jun 1, 2015, at 2:29 PM, Hanifi Gunes  wrote:
> 
> Have a look at QuerySubmitter.
> It does the boilerplate for posting queries on top of DrillClient. All that 
> remains is to attach a result listener to perform your custom logic.
> 
> -Hanifi
> 
> On Mon, Jun 1, 2015 at 2:19 PM, Norris Lee  wrote:
> 
>> Hi Nishith,
>> 
>> As far as I know, I don't think there is any documentation on that.
>> Hopefully the function names are relatively self-explanatory. If not, feel
>> free to ask on this list for clarification.
>> 
>> Norris
>> 
>> -Original Message-
>> From: Nishith Maheshwari [mailto:nsh...@gmail.com]
>> Sent: Monday, June 01, 2015 4:19 AM
>> To: user@drill.apache.org
>> Subject: Re: JAVA API for Drill
>> 
>> Thanks Norris
>> 
>> Is there any documentation regarding the usage of these libraries and
>> functions? As in which function does what.
>> 
>> Regards,
>> Nishith
>> 
>> On Wed, May 27, 2015 at 10:06 PM, Norris Lee  wrote:
>> 
>>> Hi Nishith,
>>> 
>>> Take a look at the DrillClient.java and .cpp/.hpp classes of the
>>> project for the  Java and C++ libraries respectively.
>>> 
>>> Norris
>>> 
>>> -Original Message-
>>> From: Nishith Maheshwari [mailto:nsh...@gmail.com]
>>> Sent: Wednesday, May 27, 2015 1:45 AM
>>> To: user@drill.apache.org
>>> Subject: Re: JAVA API for Drill
>>> 
>>> Thank you Martin and Rajkumar for your prompt responses.
>>> 
>> I am actually looking for an API that provides this
>> functionality. In the documentation it is mentioned at:
>>> https://drill.apache.org/docs/architecture-introduction -
>>> 
>>> *You can connect to Apache Drill through the following interfaces:*
>>> 
>>>   - *Drill shell*
>>>   - *Drill Web UI*
>>>   - *ODBC
>>>   <https://drill.apache.org/docs/odbc-jdbc-interfaces#using-odbc-to-access-apache-drill-from-bi-tools>*
>>>   - *JDBC *
>>>   - *C++ API*
>>> 
>>> 
>>> and in http://drill.apache.org/faq/ -
>>> *What clients are supported?*
>>> 
>>>   - *BI tools via the ODBC and JDBC drivers (eg, Tableau, Excel,
>>>   MicroStrategy, Spotfire, QlikView, Business Objects)*
>>>   - *Custom applications via the REST API*
>>>   - *Java and C applications via the dedicated Java and C libraries*
>>> 
>>> 
>>> It would be great if you/somebody can point me to the C++ api or the
>>> dedicated JAVA library or API as mentioned in the documentation.
>>> 
>>> Thanks and regards,
>>> Nishith Maheshwari
>>> 
>>> 
>>> 
>>> On Wed, May 27, 2015 at 12:44 PM, Rajkumar Singh 
>>> wrote:
>>> 
 Do you try drill-jdbc driver? I will suggest you to use java jdbc
 connectivity to query drill using the drill-jdbc driver.I have not
 tried this to query HBASE using drill but it should work if you have
 correctly configured the HBase Storage plugin with the DRILL.
 
 Thanks
 
 Rajkumar Singh
 
 
 
> On May 27, 2015, at 12:09 PM, Nishith Maheshwari
> 
 wrote:
> 
> Hi,
> I wanted to create a java application to connect and query over a
> HBase database using Drill, but was unable to find any
> documentation regarding this.
> Is there a JAVA api through which Drill can be accessed? I did see
> a
 small
> mention of C++ and JAVA api in the documentation but there was no
> other link or information regarding the same.
> 
> Regards,
> Nishith Maheshwari
 
 
>>> 
>> 



Re: Monitoring long / stuck CTAS

2015-05-29 Thread Sudheesh Katkam
See below:

> On May 27, 2015, at 12:17 PM, Matt  wrote:
> 
> Attempting to create a Parquet-backed table with a CTAS from a 44GB 
> tab-delimited file in HDFS. The process seemed to be running, as CPU and IO 
> were seen on all 4 nodes in this cluster, and .parquet files were being 
> created in the expected path.
> 
> However, in the last two hours or so, all nodes show near-zero CPU or IO, 
> and the Last Modified dates on the .parquet files have not changed. The same 
> time delay is shown in the Last Progress column in the active fragment profile.

Did you happen to notice the Last Update column in the profile? If so, was 
there a time delay in that too?

> 
> What approach can I take to determine what is happening (or not)?
> 


Re: Drill logical plan optimization

2015-05-28 Thread Sudheesh Katkam
Hi Rajkumar,

Here are some links:
http://drill.apache.org/docs/performance-tuning-introduction/ 
 (Performance 
Tuning Guide)
http://drill.apache.org/docs/query-plans/ 


Did you mean optimize physical plan?

Thanks,
Sudheesh

> On May 27, 2015, at 11:14 PM, Rajkumar Singh  wrote:
> 
> Hi
> 
> I am looking for some measures/parameters to look at to optimize the Drill 
> logical query plan if I want to resubmit it through the Drill UI. Could you 
> please point me to some docs that I can go through?
> 
> Rajkumar Singh
> MapR Technologies
> 
> 



Re: Announcing new committer: Hanifi Gunes

2015-04-16 Thread Sudheesh Katkam
Congratulations, Hanifi!

> On Apr 16, 2015, at 2:29 PM, Jacques Nadeau  wrote:
> 
> The Apache Drill PMC is very pleased to announce Hanifi Gunes as a new
> committer.
> 
> He has been providing great contributions over the past six months.
> Additionally, he has quickly become the go-to expert for value vectors.
> 
> Welcome Hanifi!