Re: hive connection as generic jdbc

2018-03-11 Thread Kunal Khatua
in > the drill-override.conf file. After searching on Google I found that this > issue occurs when there is a version mismatch or if one class is present in > two or more JAR files. I do not have much experience with Java, so can you please > let me know any particular JAR which has to be remov

Re: [Drill 1.12.0] : RESOURCE ERROR: Not enough memory for internal partitioning and fallback mechanism for HashAgg to use unbounded memory is disabled

2018-03-11 Thread Kunal Khatua
Here is the background of your issue: https://drill.apache.org/docs/sort-based-and-hash-based-memory-constrained-operators/#spill-to-disk HashAgg introduced a spill-to-disk capability in 1.11.0 that allows Drill to run a query's HashAgg in a memory-constrained environment. The memory required
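
For anyone hitting the same error, a sketch of the two settings the linked page and the error message refer to; option names as documented for Drill 1.11/1.12, values illustrative:

  ALTER SESSION SET `planner.memory.max_query_memory_per_node` = 4294967296;  -- give HashAgg more memory to partition and spill (4 GB here, illustrative)
  ALTER SYSTEM SET `drill.exec.hashagg.fallback.enabled` = true;              -- last resort: pre-1.11 unbounded-memory behavior, at the risk of OOM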

Re: Drill-SQL Server Performance Issues

2018-03-09 Thread Kunal Khatua
1. Drill connects to JDBC sources using the JDBC Storage plugin and you should probably try that. That said, the JDBC storage plugin, AFAIK, does not push down filters to the database. But you could try this to confirm. 2. What you need to look at first is the query profile for
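
One way to confirm whether a filter is pushed down, sketched with a hypothetical plugin and table name (sqlserver.dbo.orders): inspect the physical plan and check whether the predicate appears inside the generated JDBC sub-query or as a separate Drill Filter operator.

  EXPLAIN PLAN FOR
  SELECT order_id, amount
  FROM sqlserver.dbo.orders
  WHERE order_date >= DATE '2018-01-01';   -- look for this predicate inside the Jdbc scan's SQL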

Re: unable to add storage plugin for rdbms

2018-03-08 Thread Kunal Khatua
This looks incorrect: drill.exec.sys.store.provider.local.path = "mysql-connector-java-5.1.45-bin.jar" Refer to https://drill.apache.org/docs/storage-plugin-registration/#storage-plugin-configuration-persistence and provide a path that will allow you to persist the storage plugins' info.

Re: hive connection as generic jdbc

2018-03-07 Thread Kunal Khatua
You should be able to connect to a Hive cluster via JDBC. However, the benefit of using Drill co-located on the same cluster is that Drill can directly access the data based on locality information from Hive and process across the distributed FS cluster. With JDBC, any filters you have will

Re: Way to "pivot"

2018-03-06 Thread Kunal Khatua
Not until now :) Can you file a JIRA so that we can track it? On Tue, Mar 6, 2018 at 11:40 AM, John Omernik wrote: > Perfect. That works for me because I have a limited number of values, I > could see that getting out of hand if the values were unknown. Has there > been any
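
For reference, a common manual-pivot workaround in Drill SQL when the set of values is known and small (not necessarily the exact query used in this thread); table and column names are hypothetical:

  SELECT id,
         MAX(CASE WHEN metric = 'height' THEN val END) AS height,
         MAX(CASE WHEN metric = 'weight' THEN val END) AS weight
  FROM dfs.tmp.`measurements`
  GROUP BY id;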

Re: Default value for parameter 'planner.width.max_per_node'

2018-02-21 Thread Kunal Khatua
Looks like DRILL-5547: Linking config options with system option manager (  https://github.com/apache/drill/commit/a51c98b8bf210bbe9d3f4018361d937252d1226d ) introduced a change in the computation, which is based on the number of cores.
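
To inspect the computed default or override it explicitly, something like this should work (the value 8 is just an example):

  SELECT * FROM sys.options WHERE name = 'planner.width.max_per_node';
  ALTER SYSTEM SET `planner.width.max_per_node` = 8;   -- overrides the core-count-based default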

RE: Fixed-width files

2018-02-20 Thread Kunal Khatua
| 1| 12 | 123 | +--+--+--+ 1 row selected (0.587 seconds) 0: jdbc:drill:schema=dfs> Thanks, Arjun ____ From: Kunal Khatua <kkha...@mapr.com> Sent: Wednesday, February 21, 2018 12:37 AM To: user@drill.apache.org Subject: RE: Fixe

RE: Fixed-width files

2018-02-20 Thread Kunal Khatua
>> On Monday, February 19, 2018, 12:52:42 PM PST, Kunal Khatua <kkha...@mapr.com>

RE: Fixed-width files

2018-02-19 Thread Kunal Khatua
As long as you have delimiters, you should be able to import it as a regular CSV file. Using views that define the fixed-width nature should help operators downstream work more efficiently. -Original Message- From: Flavio Pompermaier [mailto:pomperma...@okkam.it] Sent: Monday,
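
A minimal sketch of such a view over the delimited file, with hypothetical paths and types; the CASTs are what give downstream operators fixed, typed columns:

  CREATE VIEW dfs.tmp.`fixed_width_v` AS
  SELECT CAST(columns[0] AS INT)         AS col_a,
         CAST(columns[1] AS INT)         AS col_b,
         CAST(columns[2] AS VARCHAR(20)) AS col_c
  FROM dfs.`/data/fixed_width.csv`;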

Looking for user feedback on DRILL-5741

2018-02-16 Thread Kunal Khatua
Hi everyone We're working on simplifying the Drill memory configuration process, whereby users no longer need to get into the specifics of Heap and Direct memory allocations. Here is the original JIRA https://issues.apache.org/jira/browse/DRILL-5741 The idea is to simply provide

RE: Drill 1.10 Memory Leak with CTAS + partition by

2018-02-14 Thread Kunal Khatua
Numerous fixes have gone into Drill since then. Can you try with Drill 1.12 ? -Original Message- From: Siva Gudavalli [mailto:gudavalli.s...@yahoo.com.INVALID] Sent: Wednesday, February 14, 2018 6:23 PM To: user@drill.apache.org Subject: Drill 1.10 Memory Leak with CTAS + partition by

RE: Reading drill(1.10.0) created parquet table in hive(2.1.1) using external table

2018-02-14 Thread Kunal Khatua
eb 13, 2018 10:59 PM, Kunal Khatua kkha...@mapr.com wrote: Can you share what the error is? Without that, it is anybody's guess on what the issue is. -Original Message- From: Anup Tiwari [mailto:anup.tiw...@games24x7.com] Sent: Tuesday, February 13, 2018 6:19 AM To: user@drill

RE: Reading drill(1.10.0) created parquet table in hive(2.1.1) using external table

2018-02-13 Thread Kunal Khatua
Can you share what the error is? Without that, it is anybody's guess on what the issue is. -Original Message- From: Anup Tiwari [mailto:anup.tiw...@games24x7.com] Sent: Tuesday, February 13, 2018 6:19 AM To: user@drill.apache.org Subject: Reading drill(1.10.0) created parquet table in

RE: Documentation Update for Drill-4286

2018-02-12 Thread Kunal Khatua
This made it into Drill 1.12, but there is some additional work to be done. We are hoping to have it completed in time for 1.13 release, along with documentation. -Original Message- From: John Omernik [mailto:j...@omernik.com] Sent: Monday, February 12, 2018 12:44 PM To: user

RE: MapR Drill 1.12 Mismatch between Native and Library Versions

2018-02-08 Thread Kunal Khatua
It might have to do with the way you've installed Drill. If you've built and deployed Drill yourself, odds are that the client will be different. With the RPM installation, however, the installer creates symlinks so that the mapr-client libraries required by Drill point to the libraries available in

RE: Apache Phoenix integration

2018-02-06 Thread Kunal Khatua
ng a CDH5 profile for Drill without any problem. What do you think? Is there any possibility to have Drill and Drillix merged soon? Best, Flavio On Mon, Feb 5, 2018 at 9:10 PM, Kunal Khatua <kkha...@mapr.com> wrote: > Hi Flavio > > I'm wondering whether you tried modifying the pom.x

RE: Apache Phoenix integration

2018-02-05 Thread Kunal Khatua
https://issues.apache.org/jira/browse/PHOENIX-4523 On Fri, Feb 2, 2018 at 7:21 PM, Kunal Khatua <kkha...@mapr.com> wrote: > That's great,

RE: Creating a Tableau extracts with Drill 1.12 uses unlimited memory

2018-02-02 Thread Kunal Khatua
anning speed, metadata operations, etc. Also a good check to see if the data is healthy. You may consider looking at some pointers here: https://community.mapr.com/community/exchange/blog/2017/01/25/drill-best-practices-for-bi-and-analytical-tools --Andries On 1/28/18, 6:18 PM, "Ku

RE: Apache Phoenix integration

2018-02-02 Thread Kunal Khatua
e the data. >> Without writing a plugin for each storage system I'd like to leverage >> Apache Drill as a broker towards all of them. >> >> Right now I've tried to put the phoenix-core.jar into the >> jar/3rdparty folder but when I try to create the Phoenix storage I

RE: Apache Phoenix integration

2018-02-01 Thread Kunal Khatua
The JDBC storage plugin allows Drill to leverage any SQL system that has JDBC drivers, so it should work. That said, the JDBC storage plugin is a community-developed storage plugin, so it might not be fully tested. If you are looking to simply have the Phoenix JDBC driver bundled into the

RE: "no current connection" error

2018-01-30 Thread Kunal Khatua
That basically indicates that the embedded Drillbit failed to start up, but the SQLLine instance did (hence there is no current connection to a running embedded Drillbit). Check the logs to see why the Drillbit JVM isn't coming up. From: Bo Qiang [mailto:bo.qi...@nielsen.com] Sent: Tuesday,

RE: Fwd: Creating a Tableau extracts with Drill 1.12 uses unlimited memory

2018-01-28 Thread Kunal Khatua
er_node = 10479720202 Drillbit.log attached (I think I have the correct selection included). Thanks On Fri, Jan 26, 2018 at 2:41 PM, Kunal Khatua <kkha...@mapr.com<mailto:kkha...@mapr.com>> wrote: What is the system memory and what are the allocations for heap and direct

RE: Fwd: Creating a Tableau extracts with Drill 1.12 uses unlimited memory

2018-01-26 Thread Kunal Khatua
e view is simple: SELECT * FROM s3://myparquet.parquet (14GB) planner.memory.max_query_memory_per_node = 10479720202 Drillbit.log attached (I think I have the correct selection included). Thanks On Fri, Jan 26, 2018 at 2:41 PM, Kunal Khatua <kkha...@mapr.com<mailto:kkha...@mapr.com>

RE: Creating a Tableau extracts with Drill 1.12 uses unlimited memory

2018-01-25 Thread Kunal Khatua
What is the system memory and what are the allocations for heap and direct? The memory crash might be occurring due to insufficient heap. The limits parameter applies to the direct memory and not Heap. Can you share details in the logs from the crash? -Original Message- From: Timothy

RE: [Drill] Connect on newtwork file

2018-01-25 Thread Kunal Khatua
For a network file, you want to use something like an NFS mount point and query the file via that. -Original Message- From: Thiago Hernandes de souza [mailto:thiago.so...@cedrotech.com] Sent: Thursday, January 25, 2018 10:59 AM To: user@drill.apache.org Subject: [Drill] Connect on

RE: Queries getting stuck in RUNNING state occasionally

2018-01-25 Thread Kunal Khatua
Hi Lalit Your profile hints that it is stuck in the Major Fragment 06-xx-xx, which is fed data from 16-xx-xx via 11-Exchange. Looking at the operators’ overview and the similarity with other major fragments, only this one seems to be stuck at completing the sort. Could you provide the JStack

RE: Timeframe on Apache Drill 1.12 in MapR Package?

2018-01-03 Thread Kunal Khatua
MEP 4.1 will be carrying Drill-1.12 and is expected in early February 2018 -Original Message- From: John Omernik [mailto:j...@omernik.com] Sent: Wednesday, January 03, 2018 7:06 AM To: user Subject: Timeframe on Apache Drill 1.12 in MapR Package? Hey all, just

RE: Drill Push to Tableau, Error -

2017-12-17 Thread Kunal Khatua
. Hope that helps. Let us know how else you are using Drill. Thanks ~ Kunal -Original Message- From: Kunal Khatua [mailto:kkha...@mapr.com] Sent: Sunday, December 17, 2017 9:30 AM To: user@drill.apache.org Subject: FW: Drill Push to Tableau, Error - Forwarded from d

FW: Drill Push to Tableau, Error -

2017-12-17 Thread Kunal Khatua
Forwarded from d...@drill.apache.org From: Spinn, Brandi [mailto:brandi.sp...@siriusxm.com] Sent: Friday, December 15, 2017 1:46 PM To: d...@drill.apache.org Subject: Drill Push to Tableau, Error - Hello, We are currently running a project which is utilizing the Drill push to Tableau function

RE: Drill session and jdbc connections

2017-12-13 Thread Kunal Khatua
the store format. Thanks for your inputs, Regards, Rahul On Wed, Dec 13, 2017 at 10:59 PM, Kunal Khatua <kkha...@mapr.com> wrote: > A Drill session is isolated and bound to a connection. Your > 'getConnection()' method might be fetching connections from a pool, > where the settin

RE: Drill session and jdbc connections

2017-12-13 Thread Kunal Khatua
A Drill session is isolated and bound to a connection. Your 'getConnection()' method might be fetching connections from a pool, where the settings haven't been reset. If the connections are shared, you will continue to have this problem. If you are returning a connection back to the pool, run
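
If the pooled connection cannot simply be discarded, one option (assuming a Drill version that supports RESET, available since around 1.3) is to reset the session options before handing the connection back, e.g. for the store.format case discussed here:

  ALTER SESSION RESET `store.format`;   -- undo the session-level override
  -- or, more bluntly:
  ALTER SESSION RESET ALL;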

RE: [1.9.0] : UserException: SYSTEM ERROR: IllegalReferenceCountException: refCnt: 0 and then SYSTEM ERROR: IOException: Failed to shutdown streamer

2017-12-11 Thread Kunal Khatua
in trail mail also as you have mentioned :- *This is a system error and the message appears to hint that Drill shut down prematurely,* I have checked on all nodes and drill-bit is running properly. Note :- We are using Drill 1.10.0. Regards, *Anup Tiwari* On Thu, Dec 7, 2017 at 10:33 PM, Kunal

Re: [1.9.0] : UserException: SYSTEM ERROR: IllegalReferenceCountException: refCnt: 0 and then SYSTEM ERROR: IOException: Failed to shutdown streamer

2017-12-07 Thread Kunal Khatua
What is it that you were trying to do when you encountered this? This is a system error and the message appears to hint that Drill shut down prematurely and is unable to account for that Kunal From: Anup Tiwari Sent: Wednesday, December 6, 7:46 PM Subject: Re: [1.9.0] : UserException: SYSTEM

RE: sqlline parquet to tsv filesize imabalance causing slow sqoop export to MS sql server

2017-11-16 Thread Kunal Khatua
It might be that your parallelization is causing it to generate 4 files, where only <= 3 files are sufficient. Try experimenting with setting planner.width.max_per_query to a value of 3 ... that might help. https://drill.apache.org/docs/configuration-options-introduction/ -Original
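
The suggestion above translates to a session-level setting like this (3 is the value suggested for this particular case):

  ALTER SESSION SET `planner.width.max_per_query` = 3;   -- caps the query's parallelization, so fewer output files get written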

RE: Drill Capacity

2017-11-07 Thread Kunal Khatua
Hi Yun The new release might not address this issue as we don't have a repro for this. Any chance you can provide a sample anonymized data set? The JSON data doesn't have to be meaningful, but we need to be able to reproduce it to ensure that we are indeed addressing the issue you faced.

RE: Formatting results

2017-11-04 Thread Kunal Khatua
This should do it: https://drill.apache.org/docs/data-type-conversion/#to_char-syntax -Original Message- From: Brandon [mailto:etu...@gmail.com] Sent: Saturday, November 04, 2017 5:13 AM To: user@drill.apache.org Subject: Formatting results Is it possible to format query results, like
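
A quick illustration of TO_CHAR, along the lines of the linked docs (a Java-style pattern for numbers, a Joda-style pattern for dates):

  SELECT TO_CHAR(1256.789355, '#,###.##')                   AS fmt_number,
         TO_CHAR(CAST('2017-11-04' AS DATE), 'yyyy-MMM-dd') AS fmt_date
  FROM (VALUES(1));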

RE: Drill Capacity

2017-11-02 Thread Kunal Khatua
Hi Yun Andries' solution should address your problem. However, do understand that, unlike CSV files, a JSON file cannot be processed in parallel, because there is no clear record delimiter (CSV data usually has a new-line character to indicate the end of a record). So, the larger a file gets,

RE: Apache drill : Error while creating storage plugin for Oracle DB

2017-10-31 Thread Kunal Khatua
>> Can you try with CURL as well? >> >> curl -v -X POST -H "Content-Type: application/json" -d >> '{"name":"oracle1", "config": {"type": "jdbc", "enabled": true,"driver": >> "oracle

RE: Apache drill : Error while creating storage plugin for Oracle DB

2017-10-31 Thread Kunal Khatua
There are other logs that might be reporting the error. Look at the other logs in the Drill UI ... one of them that can carry more information would be drillbit.out -Original Message- From: Akshay Joshi [mailto:joshiakshay0...@gmail.com] Sent: Tuesday, October 31, 2017 11:06 AM To:

RE: Drill performance question

2017-10-30 Thread Kunal Khatua
I second Ted's suggestion! Since we haven't seen your profile's operator overview, we can't say for sure why the performance isn't good. Off the top of my head, these are the most likely things happening that make your performance so bad: 1. All the CSV files are being read and rows rejected

RE: Regarding where 1=0 clause

2017-10-24 Thread Kunal Khatua
Could you file a JIRA for this? It seems trivial enough to fix. -Original Message- From: Amit Garg [mailto:fromami...@gmail.com] Sent: Monday, October 23, 2017 11:21 PM To: user@drill.apache.org Subject: Regarding where 1=0 clause I am using Apache Drill JDBC storage plugin. When I

RE: Exception while reading parquet data

2017-10-15 Thread Kunal Khatua
.0.jar:1.11.0] > > at org.apache.drill.exec.store.parquet.columnreaders. > > PageReader.readPage(PageReader.java:216) > > ~[drill-java-exec-1.11.0.jar:1.11.0] > > at org.apache.drill.exec.store.parquet.columnreaders. > > PageReader.nextInternal(PageReader

RE: Exception while reading parquet data

2017-10-11 Thread Kunal Khatua
If this resolves the issue, could you share some additional details, such as the metadata of the Parquet files, the OS, etc.? Details describing the setup are also very helpful in identifying what could be the cause of the error. We had observed some similar DATA_READ errors in the early

RE: Access to Drill 1.9.0

2017-10-06 Thread Kunal Khatua
Just curious... any reason why you're looking to try Drill 1.9.0, considering that is nearly a year old ? -Original Message- From: Rob Wu [mailto:robw...@gmail.com] Sent: Friday, October 06, 2017 10:35 PM To: user@drill.apache.org Subject: Re: Access to Drill 1.9.0 Hi Chetan, You can

Re: need help to decrypt error code

2017-09-28 Thread Kunal Khatua
This appears to be the brief client-side error. What does the Drillbit's log show in the stack trace? From: Divya Gehlot Sent: Thursday, September 28, 2017 12:43:38 AM To: user@drill.apache.org Subject: need help to decrypt error code

RE: Error Messages that are difficult to parse.

2017-09-25 Thread Kunal Khatua
Not sure what is going on, but my hunch is that the outermost wrapping SQL is probably using the final projections to eliminate some of the columns early on, which "helps" avoid the NumberFormat exception. Perhaps adding back the other columns, one by one, should narrow down the source of the

RE: Querying Parquet files in 2.0 format

2017-09-21 Thread Kunal Khatua
Drill currently supports Parquet 1.8.1 https://github.com/apache/drill/blob/master/pom.xml#L37 AFAIK a lot of major projects are still on Parquet 1.x versions. Is there something specific in 2.0 that you need? Otherwise, Parquet 1.8.1 should suffice. You needn’t downgrade all the way down to

RE: User client timeout with results > 2M rows

2017-09-21 Thread Kunal Khatua
utor.java:357) > [netty-common-4.0.27.Final.jar:4.0.27.Final] > at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357) > [netty-transport-4.0.27.Final.jar:4.0.27.Final] > at > io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadE > ventExecutor.java:

RE: User client timeout with results > 2M rows

2017-09-20 Thread Kunal Khatua
> 2M rows Yes, it takes about 2-3 min for the timeout to appear; the query itself should finish in that time. The files are not that big for debugging. I have, but I couldn't find anything relevant or helpful in my situation so far. On Wed, 20 Sep 2017 at 20:41 Kunal Khatua <kkha...@mapr.com&

Re: User client timeout with results > 2M rows

2017-09-20 Thread Kunal Khatua
Do you know how long it takes for this timeout to occur? There might be some tuning needed to increase a timeout. Also, I think this (S3 specifically) has been seen before... So you might find a solution within the mailing list archives. Did you try looking there? From: Alan Höng Sent:

RE: Apache DRILL v1.11.0 handling Postgres citext columns with Inconsistency

2017-09-18 Thread Kunal Khatua
It's odd that adding just the term_count column is causing an error while the other 2 columns (created, updated) don't seem to be... and that it gets resolved on removing the cast. Can you provide the stack trace and error message? Also, what are the data types for the other columns? -Original

RE: Spark exception crashing application

2017-09-18 Thread Kunal Khatua
The exceptions/error (NoClassDefFoundError) appear to indicate that there is a possible mismatch between the library versions of Spark within Drill and the platform you are running on. Can you start by verifying the versions of the Spark libraries you have and then try to build Drill (edit

RE: Problems using Postgres datasource

2017-09-01 Thread Kunal Khatua
I'm not very familiar with the details of Postgres, but I do see people occasionally asking about it. Have you checked the mailing list archives? You might find your answers there. -Original Message- From: Gonzalo Ortiz Jaureguizar [mailto:golthir...@gmail.com] Sent: Friday,

RE: Drill Profile page takes too much time to load

2017-08-29 Thread Kunal Khatua
This might be because that installation is hosting a lot of Drill profiles. Do you know how many profiles you have residing in the underlying persistent store? When a query is executed, the Foreman Drillbit (i.e. the node from where the query is submitted) writes the final profile to

RE: Drill selects column with the same name of a different table

2017-08-21 Thread Kunal Khatua
Could you share the profile ( *.sys.drill file or the http://:8047/profiles/.json ) ? This might be a bug with the JDBC Storage plugin. A quick way to validate this would be to have similar data as 2 text/parquet tables and have Drill read from that. If we don't see an issue, then it is

RE: Unable to SELECT from parquet file with Hadoop 2.7.4

2017-08-18 Thread Kunal Khatua
Kunal -Original Message- From: Kunal Khatua [mailto:kkha...@mapr.com] Sent: Friday, August 18, 2017 1:07 PM To: user@drill.apache.org Subject: RE: Unable to SELECT from parquet file with Hadoop 2.7.4 Glad that worked ! Not sure why the APIs in 2.7.4+ are not backward compatible

RE: Unable to SELECT from parquet file with Hadoop 2.7.4

2017-08-18 Thread Kunal Khatua
with Drill 1.11 (with -Dhadoop.version=2.7.4) Hadoop 2.8.0 with Drill 1.11 (default pom.xml) Hadoop 3.0.0-alpha4 with Drill 1.11 (default pom.xml) Kind regards, Michele On Fri, Aug 18, 2017 at 8:06 PM, Kunal Khatua <kkha...@mapr.com> wrote: > Interesting. I'm presuming this works if th

RE: Unable to SELECT from parquet file with Hadoop 2.7.4

2017-08-18 Thread Kunal Khatua
Interesting. I'm presuming this works if the parquet file is in a directory, right? Was Drill built with Hadoop 2.7.4 dependencies or did you use the default 2.7.1 that is there in the POM.XML ? A workaround for now would be to query on an enclosing directory, until someone looks at the issue

RE: Merge and save parquet files in Drill

2017-08-18 Thread Kunal Khatua
If you are creating a lot of small files within each partition, that is because different writers each write a file for each partition. One workaround would be to reduce the number of writer fragments. We can achieve that by setting the parameter to a lower value, so that you generate larger files.
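
The preview cuts off before the option is named; purely as an illustration, lowering per-node parallelism before the CTAS is one way to get fewer, larger files (the option choice, values, paths, and the partition column are assumptions):

  ALTER SESSION SET `planner.width.max_per_node` = 2;   -- fewer writer fragments per node
  CREATE TABLE dfs.tmp.`merged` PARTITION BY (event_date) AS
  SELECT * FROM dfs.`/data/small_parquet_files`;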

RE: set up Apache Drill on Windows server in distributed mode

2017-08-10 Thread Kunal Khatua
rill:zk=NameNode,DataNode1,DataNode2 (The filename, directory name, or volume label syntax is incorrect) apache drill 1.11.0 "got drill?" sqlline> Note : Removed the actual IP for security purpose. Appreciate the help . Thanks, Divya On 11 August 2017 at 08:20, Kunal Khat

RE: set up Apache Drill on Windows server in distributed mode

2017-08-10 Thread Kunal Khatua
Most people have used Apache Drill on Windows primarily in Embedded mode because no one appears to have tried running more than 1 Drillbit. That said, you should be able to run Apache Drill in a distributed mode as well, since it is Java-based and would not need to rely on anything more than a

RE: delimiter in column values

2017-08-07 Thread Kunal Khatua
. There are table functions in Drill that provide it with additional inputs on how to manage the preparation of the table. Can you please share the link? Thanks, Divya On 3 August 2017 at 21:06, Kunal Khatua <kkha...@mapr.com> wrote: > A couple of things... > > 1. Your deli
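
For anyone searching later, the table-function syntax being referred to lets you override the text format options inline; a sketch with a hypothetical file and parameters mirroring the text format plugin's fields:

  SELECT *
  FROM table(dfs.`/data/sample.csv`(type => 'text',
                                    fieldDelimiter => ',',
                                    quote => '"',
                                    extractHeader => true));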

RE: pass parameters to Drill query in file

2017-08-07 Thread Kunal Khatua
, Divya On 3 August 2017 at 12:30, Kunal Khatua <kkha...@mapr.com> wrote: > If you just want to run the query with the parameters like "alter > session", you can write those session-altering SQL as the initial set > of queries before the actual query in the same file. &g

RE: unable to run drill sql in shell script

2017-08-07 Thread Kunal Khatua
You seem to be missing the jars in the classpath. Notice the line before the error... "Calculating Drill classpath" What you probably need to do is to navigate to your installation's bin directory and launch from there. -Original Message- From: Divya Gehlot

RE: Drill JDBC connection on windows in embedded node

2017-08-07 Thread Kunal Khatua
+1 to Andries' comment. When you start Drill in Embedded mode, you are bringing up a single standalone Drillbit without any Zookeeper that would otherwise allow discovery of other Drillbits in the cluster. Starting in embedded mode is done via SQLLine because in a practical usage, a single

RE: delimiter in column values

2017-08-02 Thread Kunal Khatua
your numbers will start as > VarChar. You can convert to a numeric type in the query. > > Example using your data: > > Column1,Column2,Column3,Column4,Column5 > colonedata1,coltwodata1,-35.924476,138.5987123, > colonedata2,coltwodata2,-27.4372536,153.0304583,137 > > Note that

RE: pass parameters to Drill query in file

2017-08-02 Thread Kunal Khatua
If you just want to run the query with the parameters like "alter session", you can write those session-altering SQL statements as the initial set of queries before the actual query in the same file. Then run something like this: /opt/drill/apache-drill-1.11.0/bin/sqlline -u "jdbc:drill:zk=local"
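
So the file handed to sqlline would simply look something like this (the options and query are placeholders):

  -- params_and_query.sql: session options first, then the real query
  ALTER SESSION SET `store.format` = 'parquet';
  ALTER SESSION SET `planner.memory.max_query_memory_per_node` = 4294967296;
  SELECT * FROM dfs.tmp.`my_table` LIMIT 10;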

RE: delimiter in column values

2017-08-01 Thread Kunal Khatua
You could just try to have the headers in a single line too... emulating the structure that the rest of the data follows. -Original Message- From: Kunal Khatua [mailto:kkha...@mapr.com] Sent: Tuesday, August 01, 2017 9:38 PM To: user@drill.apache.org Subject: RE: delimiter in column

RE: delimiter in column values

2017-08-01 Thread Kunal Khatua
ta4" "coltwodata4" "-33.8724176" "151.2067579" "" "colonedata5" "coltwodata5" "" "" "" "This col6 data" "coltwodata6" "-33.869732" "151.203" "This col7 da

RE: delimiter in column values

2017-08-01 Thread Kunal Khatua
I think you need quotes around the single word datasets as well, because the quotes act as String delimiters and help in indicating the start and end of a String. Is there a reason why the single word strings cannot be in quotes as well? -Original Message- From: Divya Gehlot

RE: Drill performance tuning parquet

2017-07-28 Thread Kunal Khatua
> c4.4xlarge - 4 20 20 20 20 > > > Dan Holmes | Revenue Analytics, Inc. > Direct: 770.859.1255 > www.revenueanalytics.com > > -Original Message- > From: Kunal Khatua [mailto:kkha...@mapr.com] > Sent: Friday, July 28, 2017 2:38 AM > To: user@dri

RE: Drill performance tuning parquet

2017-07-28 Thread Kunal Khatua
set (which means you will end up trading off the various parameters) and then scale out. (Scale out is not cheap). Thanks, Saurabh On Thu, Jul 27, 2017 at 1:37 PM, Kunal Khatua <kkha...@mapr.com> wrote: > You haven't specified what kind of query are you running. > > The As

RE: Drill performance tuning parquet

2017-07-27 Thread Kunal Khatua
You haven't specified what kind of query you are running. The Async Parquet Reader tuning should be more than sufficient in your usecase, since you seem to be only processing 3 files. The feature introduces a small fixed pool of threads that are responsible for the actual fetching of bytes
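
The async-reader knobs being referred to are session/system options; a hedged sketch (option names as I recall them from Drill 1.9+, values purely illustrative):

  ALTER SESSION SET `store.parquet.reader.pagereader.async` = true;           -- enable the async page reader
  ALTER SESSION SET `store.parquet.reader.pagereader.buffersize` = 4194304;   -- per-column-chunk read-ahead buffer, in bytes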

RE: drill error connecting to Hbase

2017-07-26 Thread Kunal Khatua
connecting to Hbase It is CDH 5.8.2. I believe these are reliable versions, aren't they? Thanks, Shai -Original Message- From: Kunal Khatua [mailto:kkha...@mapr.com] Sent: Monday, July 24, 2017 8:50 AM To: user@drill.apache.org Subject: RE: drill error connecting to Hbase This means

RE: drill error connecting to Hbase

2017-07-23 Thread Kunal Khatua
This means that the connectivity with ZK appears to be working. What are the HBase, ZK and Hadoop versions that you are working with? I presume that the student table is otherwise accessible. -Original Message- From: Shai Shapira [mailto:shai.shap...@amdocs.com] Sent: Sunday, July 23,

RE: Rest API - Need to Improve

2017-07-12 Thread Kunal Khatua
Should we look at some well-established products that have a good set of such APIs for guidance? That will ensure that we at least identify the most relevant APIs. -Original Message- From: John Omernik [mailto:j...@omernik.com] Sent: Wednesday, July 12, 2017 6:12 AM To: user

RE: Apache Drill C++ Client Binary Versioning Information

2017-06-23 Thread Kunal Khatua
I think you missed sharing the Apache JIRA :) From: Robert Wu [mailto:r...@magnitude.com] Sent: Thursday, June 22, 2017 5:19 PM To: user@drill.apache.org Subject: RE: Apache Drill C++ Client Binary Versioning Information Oh sorry, looks like attachment is not going through. I've captured the

RE: Apache Drill C++ Client Binary Versioning Information

2017-06-16 Thread Kunal Khatua
I don't think the mailing lists allows attachments to go through. Can you share via some free online utility like https://imgur.com/upload From: Robert Wu [mailto:r...@magnitude.com] Sent: Tuesday, June 13, 2017 12:07 PM To: user@drill.apache.org Subject: Apache Drill C++ Client Binary

Re: SIGSEGV error - StubRoutines::jlong_disjoint_arraycopy

2017-06-15 Thread Kunal Khatua
error - StubRoutines::jlong_disjoint_arraycopy Yeah. It only crashes on the larger JSON files. Reworking my python script to use hdfs.tmp instead of dfs.tmp now.. -Original Message- From: Kunal Khatua [mailto:kkha...@mapr.com] Sent: Thursday, June 15, 2017 10:52 AM To: user@drill.apache.org S

RE: SIGSEGV error - StubRoutines::jlong_disjoint_arraycopy

2017-06-15 Thread Kunal Khatua
Nope.. not seen this before. Can you share more details of the log messages, etc? The problem might have to do with the JSON files being very large... because the segmentation fault that triggered the JVM (Drillbit) crash hints at that during the write of the Parquet files. I take it you

RE: Apache Drill Question

2017-06-15 Thread Kunal Khatua
Not familiar with SSHFS or GlusterFS specs, but it should, in theory, work out of the box. You can start off by having Drill's underlying storage plugins talk to a local FS. I'm presuming SSHFS / GlusterFS can expose the files through a local NFS-like mount. However, if your three nodes

RE: JDBC help

2017-06-14 Thread Kunal Khatua
You need to use the org.apache.drill.jdbc.Driver class. The drill-jdbc-all dependency should be enough. Are you trying to connect to an HDFS cluster's zookeeper? -Original Message- From: Aspen Hsu [mailto:a...@xactlycorp.com] Sent: Wednesday, June 14, 2017 12:07 PM To:

RE: Using Apache Drill with AirBnB SuperSet

2017-06-14 Thread Kunal Khatua
The Superset project looks pretty neat. Didn't realize that there is also a Python Driver for Drill. I'd think that would be useful too. -Original Message- From: John Omernik [mailto:j...@omernik.com] Sent: Wednesday, June 14, 2017 3:45 AM To: user Subject:

Re: Increasing store.parquet.block-size

2017-06-09 Thread Kunal Khatua
ptimal, if I have to read the file later? On 09-Jun-2017 11:28 PM, "Kunal Khatua" <kkha...@mapr.com> wrote: > > If you're storing this in S3... you might want to selectively read the > files as well. > > > I'm only speculating, but if you want to download the

Re: Increasing store.parquet.block-size

2017-06-09 Thread Kunal Khatua
n HDFS. The size of the individual .csv source files can be quite huge (around 10GB). So, is there a way to overcome this and create one parquet file or do I have to go ahead with multiple parquet files? On 09-Jun-2017 11:04 PM, "Kunal Khatua" <kkha...@mapr.com> wrote: > Shuporno > &g

Re: Increasing store.parquet.block-size

2017-06-09 Thread Kunal Khatua
Shuporno There are some interesting problems when using Parquet files > 2GB on HDFS. If I'm not mistaken, the HDFS APIs that allow you to read offsets (oddly enough) return an int value. A large Parquet block size also means you'll end up having the file span across multiple HDFS blocks, and
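
For reference, the block size is a session/system option set in bytes; the value and paths below are illustrative:

  ALTER SESSION SET `store.parquet.block-size` = 536870912;   -- 512 MB; keep it at or below the HDFS block size
  CREATE TABLE dfs.tmp.`csv_as_parquet` AS
  SELECT * FROM dfs.`/data/big_input.csv`;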

Re: Column alias are ignored when Storage Plugin is enabled

2017-06-08 Thread Kunal Khatua
It could be related to these as well: https://issues.apache.org/jira/browse/DRILL-5537 https://issues.apache.org/jira/browse/DRILL-5538 Please go ahead and file a bug. If it is related, they'll be linked and resolved together. ~ Kunal From: Rahul Raj

Re: [External] Re: UNORDERED_RECEIVER taking 70% of query time

2017-06-04 Thread Kunal Khatua
is there any way by which we can try reducing it? Regards, Jasbir singh -Original Message- From: Kunal Khatua [mailto:kkha...@mapr.com] Sent: Saturday, June 03, 2017 12:00 AM To: user@drill.apache.org; d...@drill.apache.org Cc: Kothari, Maneesh <maneesh.koth...@accenture.com&

Re: [External] Re: UNORDERED_RECEIVER taking 70% of query time

2017-06-02 Thread Kunal Khatua
Hi Jasbir I don't think the Apache mailing lists allow you to send attachments, except maybe text files. (The txt file made it through). In your Operator Profile, you'll see two columns... %Fragment Time and %QueryTime Hovering your mouse over those table headers should show you a

Re: Parquet filter pushdown and string fields that use dictionary encoding

2017-05-31 Thread Kunal Khatua
lds that use dictionary encoding Thank you Kunal. Can you please explain to me why min/max values would be relevant for dictionary encoded fields? (I think I may be completely misunderstanding how they work) Regards, -Stefán On Wed, May 31, 2017 at 5:55 PM, Kunal Khatua <kkha...@mapr.com&

Re: Parquet filter pushdown and string fields that use dictionary encoding

2017-05-31 Thread Kunal Khatua
Even though filter pushdown is supported in Drill, it is limited to pushing down numeric values, including dates. We do not support pushdown of varchar because of this bug in the parquet library: https://issues.apache.org/jira/browse/PARQUET-686 The issue of

Re: error in mac OS X driver installer instructions

2017-05-19 Thread Kunal Khatua
Looks like you're right about the typo. We'll have it corrected. Thanks, Jonathan! ~ Kunal From: Jonathan Snyder Sent: Friday, May 19, 2017 1:30:17 PM To: user@drill.apache.org Subject: error in mac OS X

Re: Run Multiple Queries as a Single query Apache Drill

2017-05-15 Thread Kunal Khatua
You could try to set the value through the OPTIONS tab on the top right. This will be a permanent 'system' level setting and not a temporary 'session'-level setting. You can have your API also do this using the 'alter system set ...' to achieve the same end goal.
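
So, instead of a per-session setting, the API can issue the system-level form; the option name below is only a placeholder:

  ALTER SYSTEM SET `planner.slice_target` = 100000;   -- persists across sessions until changed or reset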

Re: Drill embedded mode

2017-05-15 Thread Kunal Khatua
Is there a reason you want to use Drill in Embedded mode and not as a standalone server? You can always use a screen session, start up Drill in embedded mode and then detach that session. From: Selvarajan Thangavel

Re: Not Able to suscribe

2017-05-15 Thread Kunal Khatua
Did this not work? https://drill.apache.org/mailinglists/ From: Arvind Sharma Sent: Monday, May 15, 2017 12:00:33 AM To: user@drill.apache.org Subject: Not Able to suscribe

Re: In-memory cache in Drill

2017-05-10 Thread Kunal Khatua
sed as a storage for the temporary > tables. > Sincerely, > Michael Shtelma > > > On Wed, May 10, 2017 at 6:30 PM, Kunal Khatua <kkha...@mapr.com> wrote: > > Drill does not cache data in memory because it introduces the risk of > dealing with stale data when working w

Re: In-memory cache in Drill

2017-05-10 Thread Kunal Khatua
e also used as a storage for the temporary tables. Sincerely, Michael Shtelma On Wed, May 10, 2017 at 6:30 PM, Kunal Khatua <kkha...@mapr.com> wrote: > Drill does not cache data in memory because it introduces the risk of dealing > with stale data when working with data at a large scale.

Re: In-memory cache in Drill

2017-05-10 Thread Kunal Khatua
Drill does not cache data in memory because it introduces the risk of dealing with stale data when working with data at a large scale. If you want to avoid hitting the actual storage repeatedly, one option is to use the 'create temp table ' feature
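
A minimal sketch of that temp-table approach (table name, path and predicate are hypothetical); the table lives only for the lifetime of the session:

  CREATE TEMPORARY TABLE recent_orders AS
  SELECT * FROM dfs.`/raw/orders` WHERE order_date >= DATE '2018-01-01';

  SELECT COUNT(*) FROM recent_orders;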

Re: Drill query are stuck in ENQUEUED mode

2017-05-04 Thread Kunal Khatua
ux box Please do let me know how to resolve the query stuck issue as it is hampering the product's performance. Regards, Jasbir Singh -Original Message- From: Kunal Khatua [mailto:kkha...@mapr.com] Sent: Wednesday, May 03, 2017 10:43 PM To: user@drill.apache.org Cc: Kothari, Maneesh <man
