> to set up an s3 connection to a single bucket; however, I actually want to
> run a query on all files in all buckets. Is this possible?
>
> Regards,
> Sebastian
>
--
Nitin Pawar
Please convert the CSV to Parquet first, and while doing so make sure you cast
each column to the correct datatype.
Once you have it in Parquet, your queries should be a bit faster.
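A CTAS statement can do both steps (conversion and casting) at once. A minimal sketch, assuming a headerless CSV with hypothetical columns id, amount, and created under dfs.tmp:

```sql
-- Write Parquet output and cast each column while converting.
-- File and column names are placeholders, not from the original thread.
ALTER SESSION SET `store.format` = 'parquet';

CREATE TABLE dfs.tmp.`sales_parquet` AS
SELECT CAST(columns[0] AS INT)                  AS id,
       CAST(columns[1] AS DOUBLE)               AS amount,
       TO_TIMESTAMP(columns[2], 'yyyy-MM-dd')   AS created
FROM dfs.tmp.`sales.csv`;
```

Drill exposes headerless CSV fields through the `columns` array, which is why the casts reference `columns[0]` and so on.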
On Fri, Oct 23, 2020, 11:57 AM Gareth Western
wrote:
> I have a very large CSV file (nearly 13 million records) stored in
orage>..`xyz.parquet`;
> > fails with -
> > org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR:
> > RemoteException:/path/xyz.parquet (is not a directory)
> >
> > Please let me know, if I am doing something wrong here.
> >
> > Thank you!
> > - Vishal
>
>
> --
> Nitin Pawar
>
>
>
--
Nitin Pawar
apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR:
> NoSuchElementException
>
> (3) select * from hdfs_storage>..`xyz.parquet`;
> fails with -
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR:
> RemoteException:/path/xyz.parquet (is not a directory)
>
> Ple
s heap memory that is affected,
> then you can increase the heap memory setting to see what affect that has
> on Drillbit lifetime.
>
> Thanks,
> - Paul
>
> [1] http://drill.apache.org/docs/configuring-drill-memory/
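For reference, heap is raised in conf/drill-env.sh as described in [1]. A minimal sketch with illustrative sizes only, not a recommendation:

```shell
# conf/drill-env.sh -- example values only; size these to your workload
export DRILL_HEAP="8G"                 # Java heap for the Drillbit process
export DRILL_MAX_DIRECT_MEMORY="10G"   # direct memory used during query execution
```

Restart the Drillbit after changing these values for them to take effect.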
>
>
>
>
>
>
> On Tuesday, January 7, 2020, 08:45:
.
Thanks,
Nitin
On Wed, Jan 8, 2020 at 12:24 AM Abhishek Girish wrote:
> Thanks Nitin.
>
> As mentioned on Slack, Drill would not resubmit the queries. If any
> drillbit being used in query execution goes down, the query in question is
> cancelled.
>
> On Tue, Jan 7, 2020 at
I have created DRILL-7517 <https://issues.apache.org/jira/browse/DRILL-7517>
for this Drill shutting-down issue.
Drill setup:
Max memory given: 56GB
Heap: 12GB
Direct memory: 40GB
Thanks,
Nitin
On Tue, Jan 7, 2020 at 10:15 PM Nitin Pawar wrote:
> Hello Team
> We have recently upgra
.. would the queries with this
node as foreman be resubmitted automatically ?*
Also, we have 64GB RAM machines. Can someone recommend memory settings for
this environment?
--
Nitin Pawar
; I totally understand how busy you can be but if you get a chance, please
> help me get clarity on these items. It would be really helpful.
>
> Thanks again!
> Manu Mukundan
> Bigdata Architect,
> Prevalent AI,
> manu.mukun...@prevalent.ai
>
>
>
--
Nitin Pawar
roposed by Aman is the best solution for this
> problem, since for example if you have several aggregate functions in the
> project for the same columns, it will cause problems with such naming.
>
> Kind regards,
> Volodymyr Vysotskyi
>
>
> On Sat, Apr 20, 2019 at 8:44 A
ly different alias name ? The following should work:
> select max(last_name) *max_last_name* from cp.`employee.json` group by
> last_name limit 5;
>
> On Fri, Apr 19, 2019 at 2:24 PM Nitin Pawar
> wrote:
>
> > sorry my bad. i meant the query which was failing was with al
datorImpl.validateNamespace():977
>
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery():953
>
> org.apache.calcite.sql.SqlSelect.validate():216
>
>
>
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression():928
>
> org.
>
> On 4/18/2019 10:58:48 PM, Nitin Pawar wrote:
> Hi,
>
> We are trying to upgrade Drill from 1.13 to 1.15.
> The following query works in Drill 1.13 but not in 1.15:
>
> select max(last_name) from cp.`employee.json` group by last_name limit 5
>
> can you let us k
Hi,
We are trying to upgrade Drill from 1.13 to 1.15.
The following query works in Drill 1.13 but not in 1.15:
select max(last_name) from cp.`employee.json` group by last_name limit 5
Can you let us know if this backward-compatibility issue will be fixed?
--
Nitin Pawar
REQUIRED, FLOAT8-OPTIONAL)]. Full expression: --UNKNOWN
> EXPRESSION--.
>
>
>
>
> --
> Nitin Pawar
>
--
Nitin Pawar
--.
--
Nitin Pawar
rring?
> Can
> > I save the results of the query somewhere since it's succeeding in the
> > background?
> >
> > Drill demolishes our current solution with its performance and we really
> > want to use it but this bug is making it tricky to sell.
> >
> > Thanks,
> > James
> >
>
--
Nitin Pawar
lopers to solve this issue.
>
> [1] https://issues.apache.org/jira/projects/DRILL
> Thanks.
>
> Kind regards
> Vitalii
>
>
> On Thu, Jun 28, 2018 at 10:50 AM Nitin Pawar
> wrote:
>
> > I was able to fix this issue by doing concat(string1, ' ', string2)
>
I was able to fix this issue by doing concat(string1, ' ', string2) instead
of concat(string1, string2)
Not sure how adding a separator helps, but it solved the problem.
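A sketch of the two forms side by side, using the placeholder column names from this thread:

```sql
-- Form that failed (the thread suggests large text in one field triggered it):
-- SELECT CONCAT(string1, string2) FROM dfs.tmp.`/nitin/`;

-- Form that worked, with an explicit separator:
SELECT CONCAT(string1, ' ', string2) AS merged_string
FROM dfs.tmp.`/nitin/`;
```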
On Thu, Jun 28, 2018 at 11:55 AM, Nitin Pawar
wrote:
> Could this cause an issue if one of the fields in the concat function
Could this cause an issue if one of the fields in the concat function has
large text?
On Thu, Jun 28, 2018 at 11:10 AM, Nitin Pawar
wrote:
> Hi Khurram,
>
> This is a parquet table.
> all the columns in the table are string columns (even date column is
> stored as string)
>
>
> Can you please share the description of the table (i.e. column types) ?
> Is this a parquet table or JSON ?
> Also please share the version of Drill and the drillbit.log
>
> Thanks,
> Khurram
>
> On Wed, Jun 27, 2018 at 9:45 AM, Nitin Pawar
> wrote:
>
>
Here are the details.
Query:
select Account Account, Name Name, CONCAT(DateString, string2)
Merged_String from dfs.tmp.`/nitin/` t1
There is no custom UDF in this query.
On Wed, Jun 27, 2018 at 2:22 PM, Nitin Pawar
wrote:
> Hi Vitalii,
>
> Thanks for the description.
> I wil
se describe your case. What kind of query did
> you perform, any UDF's, which data source?
> Also logs can help.
>
> Thanks.
>
> Kind regards
> Vitalii
>
>
> On Tue, Jun 26, 2018 at 1:27 PM Nitin Pawar
> wrote:
>
> > Hi,
> >
> > Can s
: IllegalStateException: Tried to remove unmanaged
buffer.
Fragment 0:0
[Error Id: bcd510f6-75ee-49a7-b723-7b35d8575623 on
ip-10-0-103-63.ec2.internal:31010]
--
Nitin Pawar
Hi, any help on this?
This is happening in our production environment, so are there any
configurations to avoid this?
Thanks,
Nitin
On Tue, Jun 5, 2018 at 8:21 AM, Nitin Pawar wrote:
>
>
> Hi, we are seeing a memory leak in Apache Drill version 1.13.0
>
> Acc
.
Memory leaked: (2097152)
Allocator(op:1:0:7:ParquetRowGroupScan) 100/0/64856064/100
(res/actual/peak/limit)
Fragment 1:0
[Error Id: 1b712605-ca3d-45cb-9c61-224b190ab4b2 on
ip-10-0-101-247.ec2.internal:31010]
--
Nitin Pawar
>
> > >> I've never been but what about OsCon?
> > >>
> > >
> > > Great option. It is bigger and better attended than ApacheCon (lately).
> > And
> > > they allow specialized tracks.
> >
> >
>
--
Nitin Pawar
ror: SYSTEM ERROR: IllegalArgumentException: URI has an authority
> component*
> *Fragment 0:0*
>
> Query that I am trying to run:
> *create table s3.tmp.`abcd` as select 1 from (values(1));*
>
> However, this query runs when I use dfs.tmp instead of s3.tmp
>
> On Fri, May 26,
"defaultInputFormat": null
> >> },
> >> "tmp": {
> >> "location": "/",
> >> "writable": *true*,
> >> "defaultInputFormat": "parquet"
> >> }
> >> }
> >>
> >>
> >> I have removed the info about the formats to keep the mail small.
> >> Also, I am using Drill on *Windows 10*
> >>
> >> On Mon, May 22, 2017 at 3:57 PM, Shuporno Choudhury <
> >> shuporno.choudh...@manthan.com> wrote:
> >>
> >>> Hi,
> >>>
> >>> Is it possible to write to a folder in an s3 bucket using the *s3.tmp*
> >>> workspace?
> >>> Whenever I try, it gives me the following error:
> >>>
> >>> *Error: VALIDATION ERROR: Schema [s3.tmp] is not valid with respect to
> >>> either root schema or current default schema.*
> >>> *Current default schema: s3.root*
> >>>
> >>> Also, s3.tmp doesn't appear while using the command "*show schemas*"
> >>> though the tmp workspace exists in the web console
> >>>
> >>> I am using Drill Version 1.10; embedded mode on my local system.
> >>>
> >>> However, I have no problem reading from an s3 bucket; the problem is
> >>> only with writing to an s3 bucket.
> >>> --
> >>> Regards,
> >>> Shuporno Choudhury
> >>>
> >>
> >>
> >>
> >> --
> >> Regards,
> >> Shuporno Choudhury
> >>
> >
> >
> >
> > --
> > Regards,
> > Shuporno Choudhury
> >
>
>
>
> --
> Regards,
> Shuporno Choudhury
>
--
Nitin Pawar
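The s3.tmp errors in the thread above usually come down to how the workspace is defined in the S3 storage plugin. A sketch of a writable workspace under that plugin; the bucket name and location are placeholders:

```json
{
  "type": "file",
  "connection": "s3a://my-bucket",
  "workspaces": {
    "tmp": {
      "location": "/tmp",
      "writable": true,
      "defaultInputFormat": "parquet"
    }
  }
}
```

A workspace only shows up in `show schemas` (and accepts CTAS) when it is defined in the enabled plugin config with `"writable": true` and a location that exists in the bucket.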
seconds. I ran
> the explain plan query to validate this.
> The query execution time is 2 secs;
> the total time taken is 32 secs.
>
> I wanted to understand how I can minimise the query planning time. Any
> suggestions? Is the time taken described above expected?
> Attached is the result from the explain plan query.
>
> Regards,
> Projjwal
>
>
--
Nitin Pawar
e network for data transfer is the major time taking
> component compared with the query execution time. I think that the entire
> data is first transferred to the Drill cluster and then the query is
> executed on the Drill cluster?
>
> Regards,
> Projjwal
>
> On Mon, F
displayed?
>
>
>
> Regards
>
> Chetan
>
>
>
> -----Original Message-----
> From: Nitin Pawar [mailto:nitinpawar...@gmail.com]
> Sent: Monday, February 20, 2017 6:19 PM
> To: user@drill.apache.org
> Subject: Re: Query on performance using Drill and Amazon s3.
>
expected behaviour ?
> I am looking for any quick tuning that can improve the performance or any
> other suggestions.
>
> Attached is the JSON profile for this query.
>
> Regards,
> Projjwal
>
--
Nitin Pawar
pecify an input param as a
> ComplexHolder. I’m not sure if this would work or not, but also take a
> look at the implementation of KVGEN().
> I hope this helps,
> - C
>
> > On Feb 9, 2017, at 12:57, Nitin Pawar <nitinpawar...@gmail.com> wrote:
> >
> >
<sdu...@gainsight.com>
> wrote:
> > Hi,
> >
> > I am trying to write a UDF which will check whether a list of strings is
> > contained in another list.
> >
> > Is there a way to pass a list of values to UDF where the list size is
> > variable?
> >
> > Thanks in advance!
> > -
> > Regards,
> > Sandeep
>
--
Nitin Pawar
Jan 15, 2017 at 2:21 PM, Anup Tiwari <anup.tiw...@games24x7.com>
> wrote:
>
> > Hi Team,
> >
> > Can someone tell me how to configure a custom storage plugin in Drill for
> > accessing hive ORC tables?
> >
> > Thanks in advance!!
> >
> > Regards,
> > *Anup Tiwari*
> >
>
--
Nitin Pawar
12-05-2016 12:00:00 AM
> 1004 30 11-03-2016 12:00:00 AM
> 1005 95 10-14-2016 12:00:00 AM
> 1006 15 10-05-2016 12:00:00 AM
>
>
> What is the problem with over()? What am I doing wrong in this query?
> Why is my column name not showing?
>
>
> Thanks & Regards.
> Sanjiv Kumar.
>
--
Nitin Pawar
I think you are looking for
https://issues.apache.org/jira/browse/DRILL-1330
On Jan 4, 2017 4:45 PM, "Sanjiv Kumar" wrote:
> Hello,
> I need help. Suppose I have one table having categoryName,
> categoryID, customerName.
> EXAMPLE:-
> categoryName categoryID
Adding the dev list for comments.
On Wed, Nov 23, 2016 at 7:04 PM, Nitin Pawar <nitinpawar...@gmail.com>
wrote:
> Hi,
>
> according to DRILL-3596 <https://issues.apache.org/jira/browse/DRILL-3596>,
> the lead and lag functions are limited to using an offset of 1.
>
> according to d
equal
to 1
Use case:
I have daily data for a month.
Every day I want to do a delta with the same day last week, i.e. compare
Monday with Monday and Tuesday with Tuesday, so basically do a lag(col, 7).
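While the offset is limited to 1, a self-join can emulate lag(col, 7). A sketch assuming a hypothetical table with a DATE column `day` and a value column `val`:

```sql
-- Compare each day with the same weekday one week earlier.
-- Table and column names are placeholders, not from the original thread.
SELECT t.`day`,
       t.val,
       prev.val          AS val_last_week,
       t.val - prev.val  AS delta
FROM dfs.tmp.`daily` t
LEFT JOIN dfs.tmp.`daily` prev
  ON prev.`day` = t.`day` - INTERVAL '7' DAY
```

The LEFT JOIN keeps the first week of the month in the output even though it has no prior-week row to compare against.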
--
Nitin Pawar
; in trail mail..
>
> On 15-Oct-2016 11:35 AM, "Nitin Pawar" <nitinpawar...@gmail.com> wrote:
>
> is there an option where you can upgrade to 1.8 and test it?
>
>
> On Sat, Oct 15, 2016 at 10:23 AM, Anup Tiwari <anup.tiw...@games24x7.com>
> wrote:
>
>
is there an option where you can upgrade to 1.8 and test it?
On Sat, Oct 15, 2016 at 10:23 AM, Anup Tiwari <anup.tiw...@games24x7.com>
wrote:
> No.. on a parquet table..
>
> Regards,
> *Anup Tiwari*
>
> On Fri, Oct 14, 2016 at 6:23 PM, Nitin Pawar <nitin
mChannel$EpollStreamUnsafe.
> epollInReady(AbstractEpollStreamChannel.java:618)
> at
> io.netty.channel.epoll.EpollEventLoop.processReady(
> EpollEventLoop.java:329)
> at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:250)
> at
> io.netty.util.concurrent.SingleThreadEventExecutor$2.
> run(SingleThreadEventExecutor.java:111)
> at java.lang.Thread.run(Thread.java:745)
>
>
> Also, when I am trying to exclude empty strings, i.e. *col_name <> ''*, it
> is excluding null values as well.
>
> Regards,
> *Anup Tiwari*
>
--
Nitin Pawar