CSV(MS Excel created) - error while casting last column to Integer

2015-12-02 Thread Rahul Raj
Hi, When querying a CSV created with MS Excel, Drill 1.2 throws a number format exception while casting the last column as integer. No issues with any other integer columns. Once saved as a unix file, the query works fine. Seems to be a line separator issue. Is it possible to get a more
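
A minimal sketch of why only the last column would be affected, assuming the Excel export uses CRLF (`\r\n`) line endings and the reader splits records on `\n` alone. The sample data and the helper name `split_records` are made up for illustration:

```python
# Hypothetical sample: a header and two records with Windows (CRLF) line endings.
raw = "id,qty\r\n1,42\r\n2,17\r\n"

def split_records(text, line_delimiter):
    """Split text into rows of fields using the given line delimiter."""
    return [line.split(",") for line in text.split(line_delimiter) if line]

# Splitting on "\n" only leaves a stray "\r" on the last field of every row.
unix_rows = split_records(raw, "\n")
# Splitting on "\r\n" yields clean fields.
crlf_rows = split_records(raw, "\r\n")

print(repr(unix_rows[1][-1]))  # last field still carries the carriage return
print(repr(crlf_rows[1][-1]))  # clean value
```

A strict integer parser such as Java's `Integer.parseInt` rejects a value with a trailing carriage return, which would be consistent with the exception appearing on the last column only.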

JDBC metadata from storage plugin

2016-08-11 Thread Rahul Raj
Hi, Does Drill allow querying JDBC metadata from a storage plugin? I am trying to get the list of tables/views and their respective columns. If not, is it possible to develop REST APIs within Drill to achieve this? Regards, Rahul

Re: Extra Column are showing in output while querying files

2017-01-31 Thread Rahul Raj
Do you have a jdbc plugin enabled? https://issues.apache.org/jira/browse/DRILL-4903 Rahul On Jan 31, 2017 20:42, "Sanjiv Kumar" wrote: > Hello I am using Drill latest version (1.9) in window and in embedded mode. > > I have csv file in my local drive (D://). While

Drill contribution guidelines

2017-01-30 Thread Rahul Raj
The current guidelines do not mention how to create a pull request. What is the preferred approach - a fork or creating a branch in the same repo? Rahul

Kudu plugin

2017-01-25 Thread Rahul Raj
Any experiences with the Kudu storage plugin? I could see that there has been no activity on Kudu storage for almost a year. Getting an error - "out-of-order key" for a query select v,count(k) from kudu.test group by v where k is the primary key. This happens only when the aggregation is done on

Re: Kudu plugin

2017-01-26 Thread Rahul Raj
Created DRILL-5229. I will work on this. On Thu, Jan 26, 2017 at 1:01 PM, Chunhui Shi <c...@mapr.com> wrote: > Could you file a JIRA and work on this update? Thanks. > > ____ > From: Rahul Raj <rahul@option3consulting.com> > Sent: Wednes

Re: Kudu plugin

2017-01-27 Thread Rahul Raj
Can someone point me to a storage plugin design document? I guess Kudu plugin has no design information attached. On Fri, Jan 27, 2017 at 1:18 PM, Rahul Raj <rahul@option3consulting.com> wrote: > Created DRILL-5229. I will work on this. > > On Thu, Jan 26, 2017 at 1:01 PM,

Slow query on parquet imported from SQL Server while the external SQL server is down.

2016-11-25 Thread Rahul Raj
I have created a parquet file using CTAS from a MS SQL Server. The query on parquet is getting stuck in STARTING state for a long time before returning the results. We could see that drill was trying to connect to the MS SQL server from which the data was imported. The MSSQL server was down,

Implicit file columns

2016-12-12 Thread Rahul Raj
From Drill 1.7, a select star query returns some implicit file columns (like FQN, fileName, filePath, suffix). I could not find any documentation related to this. This feature was added with the commit https://github.com/apache/drill/pull/491. Can someone explain why it is required? Is there any

Re: Implicit file columns

2016-12-12 Thread Rahul Raj
f there is a way to disable it without changing the code. > > Kind regards > Arina > > On Mon, Dec 12, 2016 at 4:08 PM, Rahul Raj <rahul.raj@option3consulting. > com> > wrote: > > > From Drill 1.7, a select star query returns some implicit file > columns(like > >

Re: Incorrect column name with OVER clause on Drill 1.8

2016-12-03 Thread Rahul Raj
Ignore the '${}' in the table name ${purchases_by_item_date}, it gets substituted as a valid name. Rahul On Sun, Dec 4, 2016 at 12:36 PM, Rahul Raj <rahul@option3consulting.com> wrote: > The following query: > > SELECT > bill_date, > sum(sell_amt) over() as `cum_

Incorrect column name with OVER clause on Drill 1.8

2016-12-03 Thread Rahul Raj
The following query: SELECT bill_date, sum(sell_amt) over() as `cum_purchases_amt` FROM ${purchases_by_item_date} on a parquet file returns column name as '$1' instead of cum_purchases_amt. Any ways to override the name? Drill 1.6 also shows the same behaviour. Rahul

Re: Incorrect column name with OVER clause on Drill 1.8

2016-12-04 Thread Rahul Raj
---+* > *| ** a** | ** b ** |* > *+--+-+* > *| *2452508 * | *628122 > * |**+--+-----+* > > On Sat, Dec 3, 2016 at 11:16 PM, Rahul Raj <rahul.raj@option3consulting. > com> > wrote: > > > Ignore the '${}' in the table name ${

Re: Last Column showing blank in csv file

2016-12-05 Thread Rahul Raj
Notes on line delimiter configuration There must not be quotes around lineDelimiter - select * from table(dfs.`my_table`(type=>'text', lineDelimiter=>'\r\n')) *fieldDelimiter needs to be specified explicitly, else the columns are not separated properly* select columns[0] as `A`, CAST(columns[1]

Re: Incorrect column name with OVER clause on Drill 1.8

2016-12-04 Thread Rahul Raj
Update: The query results are picking up the aliases now. This happened without restarting the system. 1) What could have caused the situation? 2) What are those extra columns? Rahul On Sun, Dec 4, 2016 at 4:25 PM, Rahul Raj <rahul@option3consulting.com> wrote: > I was usin

Re: Slow query on parquet imported from SQL Server while the external SQL server is down.

2016-11-30 Thread Rahul Raj
at 12:12 AM, Abhishek Girish < > > abhishek.gir...@gmail.com > > > wrote: > > > > > Can you attempt to disable to jdbc plugin (configured with SQLServer) > and > > > try the query (on parquet) when SQL Server is offline? > > > > > >

Re: Error while applying interval on a postgresql query

2017-03-20 Thread Rahul Raj
> > | 1048 | > > +---+ > > 2 rows selected (0.21 seconds) > > > > Thanks, > > > > Boaz > > > > On 3/20/17, 8:10 AM, "Rahul Raj" <rahul@option3consulting.com> wrote: > > > > Hi, > > > >

Error while applying interval on a postgresql query

2017-03-20 Thread Rahul Raj
Hi, Drill 1.9 gives an error while applying the interval function on a postgresql table. The two queries below error out. Not sure about the other databases. select `id` from (select * from config_1.public.project_release) where CAST(DATE_ADD(`start_date`,interval '19800' second(5)) AS DATE) = DATE

Reading Drill generated Timestamp from spark

2017-04-01 Thread Rahul Raj
I'm unable to read a Drill-generated timestamp column inside a Spark program. Drill 1.10 has support for reading int96 as timestamp. Is it possible to generate the same from Drill? Is there any mechanism to read Drill's int64 from Spark? Rahul

Wrong results on repeated DATE_ADD

2017-03-09 Thread Rahul Raj
Hi, On Drill 1.9, DATE_ADD(DATE_ADD ...) results in the innermost value getting added up N times. It is seen on MINUTE/SECOND/HOUR interval values; it works fine on DAY intervals. See the results below; I have trimmed the sqlline results for brevity. SELECT DATE_ADD(TIME '12:23:34',INTERVAL '1'

Re: Non-ascii characters still fails on Drill 1.11

2017-08-11 Thread Rahul Raj
I have not upgraded the drill client jar to 1.11. Will upgrade and confirm. Regards, Rahul On Fri, Aug 11, 2017 at 10:32 AM, Rahul Raj <rahul@option3consulting.com > wrote: > Hi, > > https://issues.apache.org/jira/browse/DRILL-4039 was fixed in 1.11. It > failed on the fol

Re: Non-ascii characters still fails on Drill 1.11

2017-08-11 Thread Rahul Raj
Issue persists even with JDBC driver 1.11.0. On Fri, Aug 11, 2017 at 11:35 AM, Rahul Raj <rahul@option3consulting.com > wrote: > I have not upgraded the drill client jar to 1.11. Will upgrade and confirm. > > Regards, > Rahul > > On Fri, Aug 11, 2017 at 10:32

Re: Non-ascii characters still fails on Drill 1.11

2017-08-11 Thread Rahul Raj
> > On Fri, Aug 11, 2017 at 9:18 AM, Rahul Raj <rahul.raj@option3consulting. > com> > wrote: > > > Issue persist even with JDBC driver 1.11.0. > > > > On Fri, Aug 11, 2017 at 11:35 AM, Rahul Raj <rahul.raj@option3consulting. > > com > > >

Non-ascii characters still fails on Drill 1.11

2017-08-10 Thread Rahul Raj
Hi, https://issues.apache.org/jira/browse/DRILL-4039 was fixed in 1.11. It failed on the query below: SELECT `team_long_name` FROM dfs.user.`football/latest` WHERE `team_long_name` = 'Górnik Łęczna' LIMIT 500 org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR:

Error when timestamp IN clause contains more elements

2017-07-14 Thread Rahul Raj
I am getting the error below when there are more than 19 elements in the IN clause: DrillRuntimeException: Join only supports implicit casts between 1. Numeric data 2. Varchar, Varbinary data 3. Date, Timestamp data Left type: TIMESTAMP, Right type: BIGINT. Add explicit casts to avoid this error.
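
One way to follow the error's advice ("Add explicit casts") is to generate the IN list with every element cast to TIMESTAMP. The sketch below is only an illustration: the helper name `timestamp_in_clause`, the column name `created_at`, and the literal format are hypothetical, and it only builds the SQL text. Whether this avoids the error on a given Drill version is not verified here.

```python
from datetime import datetime

def timestamp_in_clause(column, values):
    """Build `column IN (...)` with each element explicitly cast to TIMESTAMP."""
    casts = [
        "CAST('{}' AS TIMESTAMP)".format(v.strftime("%Y-%m-%d %H:%M:%S"))
        for v in values
    ]
    return "`{}` IN ({})".format(column, ", ".join(casts))

# 25 hypothetical values, i.e. more than the 19 elements mentioned above.
values = [datetime(2017, 7, 1, hour, 0, 0) for hour in range(24)]
values.append(datetime(2017, 7, 2))

clause = timestamp_in_clause("created_at", values)
print(clause[:80])
```

Generating the clause in one place keeps the casts consistent however long the list grows.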

Re: Error when timestamp IN clause contains more elements

2017-07-17 Thread Rahul Raj
readability purposes) to do casts in such > cases. > > > > Alternatively (not recommended in IMO) is to raise the threshold for the > > property "in_subquery_threshold" in sys.options from 20 to a higher > value. > > > > -Original Mes

Kudu Drill deployment

2017-07-07 Thread Rahul Raj
In a recent discussion on the Kudu mailing list it was mentioned that Impala expects a Kudu tablet server to be co-located with an Impala daemon to reduce network transfer. Scanners transfer data through a fast localhost-scoped connection. How does this work with Drill? What are the deployment

Re: In-memory cache in Drill

2017-05-10 Thread Rahul Raj
The documentation says a temporary table does not outlive its session. What happens when Drill connections are wrapped in a connection pool? Should we drop them after each query in this case? Regards, Rahul On May 10, 2017 10:15 PM, "Michael Shtelma" wrote: > yes, for sure

Re: Drill Cluster without HDFS/MapR-FS?

2017-05-10 Thread Rahul Raj
Any experience of running Drill on GlusterFS or similar storage systems? How much performance would be lost because data locality is unavailable? Regards, Rahul On Wed, May 10, 2017 at 11:11 AM, Abhishek Girish wrote: > Do you wish to use Drill in distributed mode

Column alias are ignored when Storage Plugin is enabled

2017-06-08 Thread Rahul Raj
Drill ignores column aliases when a JDBC storage plugin is enabled. If I execute 'select destination as x from ...some.csv' the column name appears as 'destination' instead of 'x' while JDBC storage plugin is enabled. On disabling the storage plugin, drill returns the results with aliased name

Re: Error while applying interval on a postgresql query

2017-06-07 Thread Rahul Raj
E as DATE) org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: AssertionError: 1000: INTERVAL_DAY_TIME [Error Id: 4a2f277f-d7d9-4306-8bfc-8a58e30d3991 on vpc12.o3c.in:31010] Regards, Rahul On Tue, Mar 21, 2017 at 8:57 AM, Rahul Raj <rahul@option3consulting.com> wrote: > This happens on a JDBC Storage

Re: Error while applying interval on a postgresql query

2017-06-09 Thread Rahul Raj
| 1997-10-31 | > +-+-+ > 6 rows selected (0.299 seconds) > > Where the plan has the predicate’s filter above the scan. With the jdbc > plugin the predicate is probably pushed down, and there may be some bug > there related to intervals. > > Boaz > >

Re: Column alias are ignored when Storage Plugin is enabled

2017-06-09 Thread Rahul Raj
.org/jira/browse/DRILL-5538 > > > Please go ahead and file a bug. If it is related, they'll be linked and > resolved together. > > > ~ Kunal > > > From: Rahul Raj <rahul@option3consulting.com> > Sent: Thursday, June 8, 2017 12:12:47

Re: Drill Profile page takes too much time to load

2017-08-29 Thread Rahul Raj
> > Thanks, > Padma > > > On Aug 29, 2017, at 7:46 AM, Rahul Raj <rahul@option3consulting.com> > wrote: > > > > Hi, > > > > The drill profile list page(<>:8047/profiles) takes few minutes to > load > > in one of the installati

Drill Profile page takes too much time to load

2017-08-29 Thread Rahul Raj
Hi, The Drill profile list page (<>:8047/profiles) takes a few minutes to load in one of the installations. There was a considerable amount of processor (20%) and memory (15-20%) usage during this time. Immediately after displaying the results, values return to normal. Could this be because of the

Re: Drill Profile page takes too much time to load

2017-08-31 Thread Rahul Raj
> > On Aug 29, 2017, at 10:41 PM, Rahul Raj <rahul@option3consulting.com> > wrote: > > > > We had more than 200,000 profiles stored :) > > > > Regards, > > Rahul. > > > > On Wed, Aug 30, 2017 at 5:03 AM, Padma Penumarthy <ppenumar...

Queries getting CANCELED

2017-10-17 Thread Rahul Raj
I have a web app that generates CSV files using Drill. When the CSV size gets larger, the query state moves to CANCELED and results are always partial/truncated. The same happens with larger parquet files too and works fine with smaller data sets. Code snippet is similar to: try(Connection

Re: Queries getting CANCELED

2017-10-18 Thread Rahul Raj
ll are you on ? > > > Thanks, > > Khurram > > > From: Rahul Raj <rahul@option3consulting.com> > Sent: Tuesday, October 17, 2017 7:09:35 PM > To: user@drill.apache.org > Subject: Queries getting CANCELED > > I have a web app that generates CSV

Error while applying date interval(DRILL-5578)

2017-11-27 Thread Rahul Raj
Hi, I had reported an issue (DRILL-5578) on Drill failure while applying date functions on a JDBC table or on a view. Any updates on this issue? Are there any workarounds? Regards, Rahul

Drill session and jdbc connections

2017-12-13 Thread Rahul Raj
Hi, How is a drill session related to a drill jdbc connection instance? What happens in a pool of connections when one connection changes the store.format? I am seeing some mix-ups where a parquet row is written as an array of multiple records(rather than multiple columns) when another thread

Re: Drill session and jdbc connections

2017-12-13 Thread Rahul Raj
l, run the RESET command > to ensure the default state is set. > > https://drill.apache.org/docs/reset/ > > > > -Original Message- > From: Rahul Raj [mailto:rahul@option3consulting.com] > Sent: Wednesday, December 13, 2017 2:17 AM > To: user@drill.apache.org > Su
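
The reset-before-reuse idea can be sketched as a small context manager. This assumes a session option set on a pooled connection persists until it is reset, and it uses a DB-API-style cursor. `RecordingCursor` is a hypothetical stand-in that only records statements, and the table names are placeholders:

```python
from contextlib import contextmanager

@contextmanager
def session_option(cursor, name, value):
    """Set a Drill session option for the duration of a block, then reset it."""
    cursor.execute("ALTER SESSION SET `{}` = '{}'".format(name, value))
    try:
        yield cursor
    finally:
        # Reset before the connection goes back to the pool, so the next
        # borrower does not inherit this session's setting.
        cursor.execute("ALTER SESSION RESET `{}`".format(name))

class RecordingCursor:
    """Stand-in for a DB-API cursor (hypothetical); it only records statements."""
    def __init__(self):
        self.statements = []

    def execute(self, sql):
        self.statements.append(sql)

cursor = RecordingCursor()
with session_option(cursor, "store.format", "csv"):
    cursor.execute("CREATE TABLE dfs.tmp.`report` AS SELECT * FROM dfs.tmp.`source`")
print(cursor.statements)
```

The `finally` clause runs the reset even when the query inside the block raises, which matters most when connections are shared across threads through a pool.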

Re: Drill session and jdbc connections

2017-12-14 Thread Rahul Raj
Let me know if the code is fine. Regards, Rahul On Thu, Dec 14, 2017 at 1:53 PM, Rahul Raj <rahul@option3consulting.com> wrote: > Sorry, I confused. I am using a pool of connections, code snippet below: > > try(Connection conn = pool.getConnection()){ >

Re: Drill session and jdbc connections

2017-12-14 Thread Rahul Raj
<kkha...@mapr.com> wrote: > That will (IMO) not solve the problem, since different threads will be > setting and resetting the store format. My suggestion would be to use a > pool of connections and each thread work off one connection, and returning > it to the pool when done r

Re: Time series storage with parquet

2017-11-01 Thread Rahul Raj
nually > remove the > old files and move the new files. > It does impact the metadata caching mechanism. > You need to regenerate metadata cache. > > Thanks > Padma > > > > On Oct 31, 2017, at 5:08 AM, Rahul Raj <rahul@option3consulting.com> > wrote: >

Re: Error reading int96 fields

2017-12-06 Thread Rahul Raj
ooks like a bug. > Could you please open the jira ticket and provide the query and dataset to > reproduce the issue? > > Thanks > > Kind regards > Vitalii > > On Tue, Dec 5, 2017 at 2:01 PM, Rahul Raj <rahul@option3consulting.com > > > wrote: > >

Error reading int96 fields

2017-12-05 Thread Rahul Raj
I am getting the error - SYSTEM ERROR : ClassCastException: org.apache.drill.exec.vector.TimeStampVector cannot be cast to org.apache.drill.exec.vector.VariableWidthVector while trying to read a Spark INT96 datetime field on Drill 1.11 in spite of setting the property

Time series storage with parquet

2017-10-31 Thread Rahul Raj
Hi, I have a few questions on modeling a time series use case with parquet and Drill. I have seen the topic discussed at https://issues.apache.org/jira/browse/DRILL-3534. My requirements are: * Keep the parquet files partitioned by year and month * For the current month, the data needs to be

Re: Drill on Windows

2018-05-13 Thread Rahul Raj
> To: user@drill.apache.org > Subject: Re: Drill on Windows > > Hi Tim, > I would appreciate this as well. Thank you for offering. > -Chris > > On Fri, Apr 20, 2018 at 4:54 AM, Rahul Raj <rahul.raj@option3consulting. > com> > wrote: > > > Thanks for t

Re: Drill on Windows

2018-05-13 Thread Rahul Raj
There is a 'run' option provided with the script to achieve this, ignore my previous mail. # Use this when launching Drill from your own script that manages the # process, such as (roll-your-own) YARN, Mesos, supervisord, etc. - Rahul On Sun, May 13, 2018 at 10:53 PM, Rahul Raj <ra

Drill Jdbc filter pushdown

2018-02-22 Thread Rahul Raj
I am working on a fix for DRILL-5578 which is due to CALCITE-2188. The query fails when there is DATE/INTERVAL arithmetic in the WHERE clause - select * from actor where last_update - INTERVAL '20' SECOND > TIMESTAMP '2005-10-17 00:00:00' I have made the changes for ANSI and MySQL dialect of

JdbcStoragePlugin pool properties

2018-07-10 Thread Rahul Raj
Currently the JDBC connection pool properties (like size, max-active, min-active) are not configurable while creating a JDBC storage plugin. Drill creates a DBCP basic data source and so accepts the default values. Can this be added as a feature? I can submit a pull request for the change.

Re: Drill Blog on Medium.com

2018-03-12 Thread Rahul Raj
I would like to see an article on creating a new sample storage plugin with details on the different components involved, like the internal Drill memory representation, data types etc. I don't think the existing plugins are self-explanatory for a beginner. Regards, Rahul On Tue, Mar 6, 2018 at

Drill on Windows

2018-04-18 Thread Rahul Raj
Is there any reason why Drill does not run on Windows as standalone? I can only see a Windows batch file for sqlline. Will it not work if we get the shell scripts translated to Windows cmd/batch files? Regards, Rahul

Re: Drill on Windows

2018-04-20 Thread Rahul Raj
hich is not commonly done. >>> >>> Other more obvious factors, IMO, are the overhead in creating and >>> maintaining *nix shell scripts in Batch files. Linux (Bash) scripts are >>> much more powerful with a lot of capabilities to make use of a rich >>>

Source for drill-calcite

2018-03-28 Thread Rahul Raj
Is the Drill fork of Calcite maintained at https://github.com/mapr/incubator-calcite/? I assume that the required Calcite branch for Drill 1.13.0 is DrillCalcite1.15.0. I would like to test a newer patch from Calcite on Drill 1.13.0. Regards, Rahul

Re: Drill JDBC Plugin limit queries

2018-10-13 Thread Rahul Raj
s.apache.org/mod_mbox/drill-user/201808.mbox/browser > [2] https://drill.apache.org/docs/rdbms-storage-plugin/ > > On Sat, Oct 13, 2018 at 2:21 PM Rahul Raj wrote: > > > Regarding the heap out of error, it could be that the jdbc driver is > > prefetching

Drill JDBC Plugin limit queries

2018-10-12 Thread Rahul Raj
Hi, Drill does not push LIMIT queries down to external databases and I assume it could be more related to Calcite. This leads to out of memory situations while querying a large table to view a few records. Is there something that could be improved here? One solution would be to push filters down to

Re: Drill JDBC Plugin limit queries

2018-10-19 Thread Rahul Raj
/org/apache/drill/exec/store/jdbc/JdbcRecordReader.java#L187 > So it is definitely should be improved. > > *Note:* Changed mailing list to devs. > > On Sun, Oct 14, 2018 at 6:30 AM Rahul Raj wrote: > > > Vitalii, > > > > Created documentation ticket DRILL-6794 >

Re: Drill JDBC Plugin limit queries

2018-10-19 Thread Rahul Raj
ype(INTEGER actor_id, VARCHAR(45) first_name, VARCHAR(45) last_name, TIMESTAMP(3) last_update): rowcount = 100.0, cumulative cost = {100.0 rows, 100.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 164 Regards, Rahul On Fri, Oct 19, 2018 at 8:47 PM Rahul Raj wrote: > I will make the changes and

Re: Drill JDBC Plugin limit queries

2018-10-19 Thread Rahul Raj
ul On Fri, Oct 19, 2018 at 10:07 PM Rahul Raj wrote: > Vitalii, > > I made both the changes, it did not work and a full scan was issued as > shown in the plan below. > > 00-00Screen : rowType = RecordType(INTEGER actor_id, VARCHAR(45) > first_name, VARCHAR(45) last

Re: Drill JDBC Plugin limit queries

2018-10-13 Thread Rahul Raj
1808.mbox/%3C0d36e0e6e8dc1e77bbb67bbfde5f5296e290c075.camel%40omnicell.com%3E > > On Sat, Oct 13, 2018 at 3:56 PM Rahul Raj wrote: > > > Should I create tickets to track these issues or should I create a ticket > > to update the documentation? > > > > Rah

Re: Calcite 1.17

2018-09-24 Thread Rahul Raj
18, 2018 at 4:56 PM Rahul Raj wrote: > Arina, > > All the changes are present in the new branch. I will test on Drill master > and update you. > > Regards, > Rahul > > On Tue, Sep 18, 2018 at 4:43 PM Arina Yelchiyeva < > arina.yelchiy...@gmail.com> wrote: > >

Calcite 1.17

2018-09-18 Thread Rahul Raj
Hi, Are there any plans to merge Calcite 1.17 to Drill? Fix for DRILL-5578 is available on Calcite 1.17. Regards, Rahul

Re: Calcite 1.17

2018-09-18 Thread Rahul Raj
ve Calcite 1.17. > > > > Kind regards, > > Arina > > > > On Tue, Sep 18, 2018 at 1:25 PM Rahul Raj < > rahul@option3consulting.com> > > wrote: > > > >> Hi, > >> > >> Are there any plans to merge Calcite 1.17 to Drill? Fix fo

Re: Calcite 1.17

2018-09-18 Thread Rahul Raj
e/tree/DrillCalcite1.17.0 > Current Drill master is using 1.17.0-drill-r1. > > Kind regards, > Arina > > On Tue, Sep 18, 2018 at 1:49 PM Rahul Raj wrote: > > > Arina, > > > > I will do that. > > I went through the latest pull requests on DrillCalcite, but could not

Query Compilation error with 80+ CASE statements

2019-02-26 Thread Rahul Raj
Hi, I am getting a compilation error on Drill 1.15 when a query contains a large number of CASE statements. I have included the query below. The query works fine when a few CASE statements are removed. org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: CompileException: File

Re: Query Compilation error with 80+ CASE statements

2019-02-26 Thread Rahul Raj
RuntimeException("IllegalArgumentException : null values in non nullable fields"); } else { out = input; } } Any thoughts on this? Are there any naming conventions while developing a UDF? Regards, Rahul On Wed, Feb 27, 2019 at 12:14 PM Rahul Raj wrote: > Hi, >

Re: Query Compilation error with 80+ CASE statements

2019-02-27 Thread Rahul Raj
> Sent: Thursday, February 28, 2019 5:55 AM > To: user > Subject: Re: Query Compilation error with 80+ CASE statements > > Rahul, > > Can you please share plans for both queries (one with fewer which succeeds > and one which fails). Also the verbose error. > > On Tue, Fe

Drill does not release DB connections when a storage plugin is deleted

2019-02-13 Thread Rahul Raj
Drill does not release the database connection when we disable or delete a storage plugin. It can be seen with "lsof -i -p <> | grep <>" that the number of socket connections to the db host keeps increasing if we keep updating an existing storage plugin. JdbcStoragePlugin does not override

Creating/reading empty Parquet files

2019-07-04 Thread Rahul Raj
We have a use case involving a union across multiple parquet files. One of the source parquet files used in union was generated using an empty CTAS (query didn't select any record) and resulted in creating no output files. The union query failed in this case because of the absence of the source