Re: Querying multiple s3 buckets

2020-11-22 Thread Nitin Pawar
> to setup an s3 connection to a single bucket, however, I actually want to > run a query on all files in all buckets. Is this possible? > > Regards, > Sebastian > -- Nitin Pawar

Re: What is the most memory-efficient technique for selecting several million records from a CSV file

2020-10-23 Thread Nitin Pawar
Please convert the CSV to Parquet first, and while doing so make sure you cast each column to the correct datatype. Once you have it in Parquet, your queries should be a bit faster. On Fri, Oct 23, 2020, 11:57 AM Gareth Western wrote: > I have a very large CSV file (nearly 13 million records) stored in
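
A minimal sketch of the conversion suggested above, written as a Drill CTAS statement. The workspace `dfs.tmp`, the file path, the column positions, and the column names and types are all hypothetical placeholders rather than details from the thread. It also assumes the CSV is read without header extraction and has no header row, so fields arrive in the `columns` array.

```sql
-- Write CTAS output as Parquet (this is Drill's default store.format).
ALTER SESSION SET `store.format` = 'parquet';

-- Hypothetical path and schema: adjust positions, names and types to the real file.
CREATE TABLE dfs.tmp.`records_parquet` AS
SELECT
  CAST(columns[0] AS INT)          AS record_id,
  CAST(columns[1] AS VARCHAR(100)) AS full_name,
  CAST(columns[2] AS DATE)         AS created_on,
  CAST(columns[3] AS DOUBLE)       AS amount
FROM dfs.`/data/records.csv`;
```

Subsequent queries would then select from dfs.tmp.`records_parquet`, reading typed columns instead of re-parsing text on every scan.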

Re: Drill + parquet

2020-02-04 Thread Nitin Pawar
orage>..`xyz.parquet`; > > fails with - > > org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: > > RemoteException:/path/xyz.parquet (is not a directory) > > > > Please let me know, if I am doing something wrong here. > > > > Thank you! > > - Vishal > > > -- > Nitin Pawar > > > -- Nitin Pawar

Re: Drill + parquet

2020-02-04 Thread Nitin Pawar
apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: > NoSuchElementException > > (3) select * from hdfs_storage>..`xyz.parquet`; > fails with - > org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: > RemoteException:/path/xyz.parquet (is not a directory) > > Ple

Re: Question about foreman restart

2020-01-07 Thread Nitin Pawar
s heap memory that is affected, > then you can increase the heap memory setting to see what effect that has > on Drillbit lifetime. > > Thanks, > - Paul > > [1] http://drill.apache.org/docs/configuring-drill-memory/ > > > > > > > On Tuesday, January 7, 2020, 08:45:

Re: Question about foreman restart

2020-01-07 Thread Nitin Pawar
. Thanks, Nitin On Wed, Jan 8, 2020 at 12:24 AM Abhishek Girish wrote: > Thanks Nitin. > > As mentioned on Slack, Drill would not resubmit the queries. If any > drillbit being used in query execution goes down, the query in question is > cancelled. > > On Tue, Jan 7, 2020 at

Re: Question about foreman restart

2020-01-07 Thread Nitin Pawar
I have created DRILL-7517 <https://issues.apache.org/jira/browse/DRILL-7517> for the Drill shutting down issue. Drill setup: max memory given: 56GB, heap: 12GB, direct memory: 40GB. Thanks, Nitin On Tue, Jan 7, 2020 at 10:15 PM Nitin Pawar wrote: > Hello Team > We have recently upgra

Question about foreman restart

2020-01-07 Thread Nitin Pawar
.. would the queries with this node as foreman be resubmitted automatically?* Also, we have 64GB RAM machines. Can someone recommend memory settings for this environment? -- Nitin Pawar

Re: Clarification regarding Apache drill setup

2019-08-16 Thread Nitin Pawar
> I totally understand how busy you can be but if you get a chance, please > help me to get a clarity on these items. It will be really helpful > > Thanks again! > Manu Mukundan > Bigdata Architect, > Prevalent AI, > manu.mukun...@prevalent.ai > > > -- Nitin Pawar

Re: Blocker on drill upgrade path

2019-04-22 Thread Nitin Pawar
roposed by Aman is the best solution for this > problem, since for example if you have several aggregate functions in the > project for the same columns, it will cause problems with such naming. > > Kind regards, > Volodymyr Vysotskyi > > > On Sat, Apr 20, 2019 at 8:44 A

Re: Blocker on drill upgrade path

2019-04-19 Thread Nitin Pawar
ly different alias name ? The following should work: > select max(last_name) *max_last_name* from cp.`employee.json` group by > last_name limit 5; > > On Fri, Apr 19, 2019 at 2:24 PM Nitin Pawar > wrote: > > > sorry my bad. i meant the query which was failing was with al

Re: Blocker on drill upgrade path

2019-04-19 Thread Nitin Pawar
datorImpl.validateNamespace():977 > > org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery():953 > > org.apache.calcite.sql.SqlSelect.validate():216 > > > > org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression():928 > > org.

Re: Blocker on drill upgrade path

2019-04-19 Thread Nitin Pawar
> > On 4/18/2019 10:58:48 PM, Nitin Pawar wrote: > Hi, > > We are trying to upgrade drill from 1.13 to 1.15 > following query works in drill 1.13 but not in 1.15 > > select max(last_name) from cp.`employee.json` group by last_name limit 5 > > can you let us k

Blocker on drill upgrade path

2019-04-18 Thread Nitin Pawar
Hi, We are trying to upgrade Drill from 1.13 to 1.15. The following query works in Drill 1.13 but not in 1.15: select max(last_name) from cp.`employee.json` group by last_name limit 5 Can you let us know if this backward compatibility issue will be fixed? -- Nitin Pawar
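
A reply in this thread suggests giving the aggregate an explicit alias that differs from the grouped column name. A sketch of that workaround follows; the alias `max_last_name` is the one proposed in the reply, and whether it covers every failing case on 1.15 is not confirmed here.

```sql
-- Alias the aggregate so its output name differs from the GROUP BY column.
SELECT MAX(last_name) AS max_last_name
FROM cp.`employee.json`
GROUP BY last_name
LIMIT 5;
```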

Re: Help for statistic functions

2018-12-06 Thread Nitin Pawar
REQUIRED, FLOAT8-OPTIONAL)]. Full expression: --UNKNOWN > EXPRESSION--. > > > > > -- > Nitin Pawar > -- Nitin Pawar

Help for statistic functions

2018-12-05 Thread Nitin Pawar
--. -- Nitin Pawar

Re: Long running query succeeds but UI times out?

2018-09-15 Thread Nitin Pawar
rring? > Can > > I save the results of the query somewhere since it's succeeding in the > > background? > > > > Drill demolishes our current solution with its performance and we really > > want to use it but this bug is making it tricky to sell. > > > > Thanks, > > James > > > -- Nitin Pawar

Re: Drill error

2018-06-28 Thread Nitin Pawar
lopers to solve this issue. > > [1] https://issues.apache.org/jira/projects/DRILL > Thanks. > > Kind regards > Vitalii > > > On Thu, Jun 28, 2018 at 10:50 AM Nitin Pawar > wrote: > > > I was able to fix this issue by doing concat(string1, ' ', string2) >

Re: Drill error

2018-06-28 Thread Nitin Pawar
I was able to fix this issue by doing concat(string1, ' ', string2) instead of concat(string1, string2). Not sure how adding a separator helps, but it solved the problem. On Thu, Jun 28, 2018 at 11:55 AM, Nitin Pawar wrote: > Could this cause an issue if one of the field in concat function

Re: Drill error

2018-06-28 Thread Nitin Pawar
Could this cause an issue if one of the fields in the concat function has large text? On Thu, Jun 28, 2018 at 11:10 AM, Nitin Pawar wrote: > Hi Khurram, > > This is a parquet table. > all the columns in the table are string columns (even date column is > stored as string) > >

Re: Drill error

2018-06-27 Thread Nitin Pawar
> Can you please share the description of the table (i.e. column types) ? > Is this a parquet table or JSON ? > Also please share the version of Drill and the drillbit.log > > Thanks, > Khurram > > On Wed, Jun 27, 2018 at 9:45 AM, Nitin Pawar > wrote: > >

Re: Drill error

2018-06-27 Thread Nitin Pawar
Here are the details. Query: select Account Account, Name Name, CONCAT(DateString, string2) Merged_String from dfs.tmp.`/nitin/` t1 There is no custom UDF in this query. On Wed, Jun 27, 2018 at 2:22 PM, Nitin Pawar wrote: > Hi Vitalii, > > Thanks for the description. > I wil

Re: Drill error

2018-06-27 Thread Nitin Pawar
se describe your case. What kind of query did > you perform, any UDF's, which data source? > Also logs can help. > > Thanks. > > Kind regards > Vitalii > > > On Tue, Jun 26, 2018 at 1:27 PM Nitin Pawar > wrote: > > > Hi, > > > > Can s

Drill error

2018-06-26 Thread Nitin Pawar
: IllegalStateException: Tried to remove unmanaged buffer. Fragment 0:0 [Error Id: bcd510f6-75ee-49a7-b723-7b35d8575623 on ip-10-0-103-63.ec2.internal:31010] -- Nitin Pawar
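
Replies in this thread report that the error went away once a separator was added to the CONCAT call. A sketch of the adjusted query, reusing the column names and path quoted in the thread; why the separator helps was not established there, so treat this as a reported workaround rather than a fix.

```sql
-- Three-argument CONCAT with a space separator, as reported in the thread.
SELECT
  Account AS Account,
  Name AS Name,
  CONCAT(DateString, ' ', string2) AS Merged_String
FROM dfs.tmp.`/nitin/` t1;
```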

Re: Memory Leak in drill 1.13.0

2018-06-08 Thread Nitin Pawar
Hi, any help on this? This is happening in our production environment, so are there any configurations to avoid this? Thanks, Nitin On Tue, Jun 5, 2018 at 8:21 AM, Nitin Pawar wrote: > > > Hi we are seeing memory leak in apache drill 1.13.0 version > > Acc

Memory Leak in drill 1.13.0

2018-06-04 Thread Nitin Pawar
. Memory leaked: (2097152) Allocator(op:1:0:7:ParquetRowGroupScan) 100/0/64856064/100 (res/actual/peak/limit) Fragment 1:0 [Error Id: 1b712605-ca3d-45cb-9c61-224b190ab4b2 on ip-10-0-101-247.ec2.internal:31010] -- Nitin Pawar

Re: Drill Summit/Conference Proposal

2017-06-19 Thread Nitin Pawar
> > > >> I've never been but what about OsCon? > > >> > > > > > > Great option. It is bigger and better attended than ApacheCon (lately). > > And > > > they allow specialized tracks. > > > > > -- Nitin Pawar

Re: Writing to s3 using Drill

2017-05-26 Thread Nitin Pawar
ror: SYSTEM ERROR: IllegalArgumentException: URI has an authority > component* > *Fragment 0:0* > > Query that I am trying to run: > *create table s3.tmp.`abcd` as select 1 from (values(1));* > > However, this query runs when I use dfs.tmp instead of s3.tmp > > On Fri, May 26,

Re: Writing to s3 using Drill

2017-05-26 Thread Nitin Pawar
"defaultInputFormat": null > >> }, > >> "tmp": { > >> "location": "/", > >> "writable": *true*, > >> "defaultInputFormat": "parquet" > >> } > >> } > >> > >> > >> I have removed the info about the formats to keep the mail small. > >> Also, I am using Drill on *Windows 10* > >> > >> On Mon, May 22, 2017 at 3:57 PM, Shuporno Choudhury < > >> shuporno.choudh...@manthan.com> wrote: > >> > >>> Hi, > >>> > >>> Is it possible to write to a folder in an s3 bucket using the *s3.tmp* > >>> workspace? > >>> Whenever I try, it gives me the following error: > >>> > >>> *Error: VALIDATION ERROR: Schema [s3.tmp] is not valid with respect to > >>> either root schema or current default schema.* > >>> *Current default schema: s3.root* > >>> > >>> Also, s3.tmp doesn't appear while using the command "*show schemas*" > >>> though the tmp workspace exists in the web console > >>> > >>> I am using Drill Version 1.10; embedded mode on my local system. > >>> > >>> However, I have no problem reading from an s3 bucket, the problem is > >>> only writing to a s3 bucket. > >>> -- > >>> Regards, > >>> Shuporno Choudhury > >>> > >> > >> > >> > >> -- > >> Regards, > >> Shuporno Choudhury > >> > > > > > > > > -- > > Regards, > > Shuporno Choudhury > > > > > > -- > Regards, > Shuporno Choudhury > -- Nitin Pawar

Re: Minimise query plan time for dfs plugin for local file system on tsv file

2017-03-03 Thread Nitin Pawar
seconds. I ran > the explain plan query to validate this. > The query execution time is 2 secs. > total time taken is 32secs > > I wanted to understand how can i minimise the query plan time. Suggestions > ? > Is the time taken described above expected ? > Attached is result from explain plan query > > Regards, > Projjwal > > -- Nitin Pawar

Re: Query on performance using Drill and Amazon s3.

2017-02-21 Thread Nitin Pawar
e network for data transfer is the major time taking > component compared with the query execution time, I think that the entire > data is first transferred to drill cluster and then the query is executed > on the drill cluster ? > > Regards, > Projjwal > > On Mon, F

RE: Query on performance using Drill and Amazon s3.

2017-02-20 Thread Nitin Pawar
displayed? > > > > Regards > > Chetan > > > > -Original Message- > From: Nitin Pawar [mailto:nitinpawar...@gmail.com] > Sent: Monday, February 20, 2017 6:19 PM > To: user@drill.apache.org > Subject: Re: Query on performance using Drill and Amazon s3. >

Re: Query on performance using Drill and Amazon s3.

2017-02-20 Thread Nitin Pawar
expected behaviour ? > I am looking for any quick tuning that can improve the performance or any > other suggestions. > > Attaching is the JSON profile for this query. > > Regards, > Projjwal > -- Nitin Pawar

Re: Drill UDF input - pass variable list of strings

2017-02-09 Thread Nitin Pawar
pecify an input param as a > ComplexHolder. I’m not sure if this would work or not, but also take a > look at the implementation of KVGEN(). > I hope this helps, > - C > > > On Feb 9, 2017, at 12:57, Nitin Pawar <nitinpawar...@gmail.com> wrote: > > > >

Re: Drill UDF input - pass variable list of strings

2017-02-09 Thread Nitin Pawar
<sdu...@gainsight.com> > wrote: > > Hi, > > > > I am trying to write a UDF which will check whether a list of strings is > > contained in another list. > > > > Is there a way to pass a list of values to UDF where the list size is > > variable? > > > > Thanks in advance! > > - > > Regards, > > Sandeep > -- Nitin Pawar

Re: Storage Plugin for accessing Hive ORC Table from Drill

2017-01-19 Thread Nitin Pawar
Jan 15, 2017 at 2:21 PM, Anup Tiwari <anup.tiw...@games24x7.com> > wrote: > > > Hi Team, > > > > Can someone tell me how to configure custom storage plugin in Drill for > > accessing hive ORC tables? > > > > Thanks in advance!! > > > > Regards, > > *Anup Tiwari* > > > -- Nitin Pawar

Re: Column Name change in output while using over()

2017-01-18 Thread Nitin Pawar
12-05-2016 12:00:00 AM > 1004 30 11-03-2016 12:00:00 AM > 1005 95 10-14-2016 12:00:00 AM > 1006 15 10-05-2016 12:00:00 AM > > > What is the problem with over()? What am I doing wrong in this query? Why is my column name not showing? > > > Thanks & Regards. > Sanjiv Kumar. > -- Nitin Pawar

Re: How to get multiple row value as a single column

2017-01-04 Thread Nitin Pawar
I think you are looking for https://issues.apache.org/jira/browse/DRILL-1330 On Jan 4, 2017 4:45 PM, "Sanjiv Kumar" wrote: > Hello > I need help. Suppose I have one table having categoryName, > categoryID, customerName. > EXAMPLE:- > categoryName categoryID

Re: Window function

2016-11-25 Thread Nitin Pawar
Adding the dev list for comments. On Wed, Nov 23, 2016 at 7:04 PM, Nitin Pawar <nitinpawar...@gmail.com> wrote: > Hi, > > according to DRILL-3596 <https://issues.apache.org/jira/browse/DRILL-3596>, > the lead and lag functions are limited to an offset of 1. > > according to d

Window function

2016-11-23 Thread Nitin Pawar
equal to 1. Use case: I have daily data for a month. Every day I want to compute a delta against the same day of the previous week, i.e. compare Monday with Monday and Tuesday with Tuesday, so basically do a lag(col, 7). -- Nitin Pawar
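
One way to approximate lag(col, 7) while the offset is restricted to 1 is a self-join against the same table shifted by seven days. This is only a sketch under assumptions: the table dfs.tmp.`daily_metrics` and the columns `day_date` (assumed to be a DATE with one row per day) and `metric` are hypothetical, and it relies on the DATE_ADD(date, integer) form, which adds days. It has not been run against the Drill version discussed in the thread.

```sql
-- Shift each row forward by 7 days, then join it to the row for that date.
SELECT
  cur_wk.day_date,
  cur_wk.metric                  AS this_week,
  prev_wk.metric                 AS last_week,
  cur_wk.metric - prev_wk.metric AS delta
FROM dfs.tmp.`daily_metrics` cur_wk
JOIN (
  SELECT DATE_ADD(day_date, 7) AS matching_date, metric
  FROM dfs.tmp.`daily_metrics`
) prev_wk
  ON cur_wk.day_date = prev_wk.matching_date;
```

Days without a row seven days earlier drop out of the inner join; a LEFT JOIN would keep them with a null delta instead.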

Re: [Drill 1.6] : Number format exception due to Empty String

2016-10-15 Thread Nitin Pawar
; in trail mail.. > > On 15-Oct-2016 11:35 AM, "Nitin Pawar" <nitinpawar...@gmail.com> wrote: > > is there an option where you can upgrade to 1.8 and test it? > > > On Sat, Oct 15, 2016 at 10:23 AM, Anup Tiwari <anup.tiw...@games24x7.com> > wrote: > >

Re: [Drill 1.6] : Number format exception due to Empty String

2016-10-15 Thread Nitin Pawar
Is there an option where you can upgrade to 1.8 and test it? On Sat, Oct 15, 2016 at 10:23 AM, Anup Tiwari <anup.tiw...@games24x7.com> wrote: > No.. on a parquet table.. > > Regards, > *Anup Tiwari* > > On Fri, Oct 14, 2016 at 6:23 PM, Nitin Pawar <nitin

Re: [Drill 1.6] : Number format exception due to Empty String

2016-10-14 Thread Nitin Pawar
mChannel$EpollStreamUnsafe. > epollInReady(AbstractEpollStreamChannel.java:618) > at > io.netty.channel.epoll.EpollEventLoop.processReady( > EpollEventLoop.java:329) > at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:250) > at > io.netty.util.concurrent.SingleThreadEventExecutor$2. > run(SingleThreadEventExecutor.java:111) > at java.lang.Thread.run(Thread.java:745) > > > Also when i am trying to exclude empty string i.e. *col_name <> ''* then it > is excluding null values as well. > > Regards, > *Anup Tiwari* > -- Nitin Pawar