Re: [ANNOUNCE] New PMC Chair of Apache Drill

2019-08-27 Thread Chunhui Shi
Congrats Charles! And thanks Arina for your contributions!

Chunhui

On Mon, Aug 26, 2019 at 10:43 AM weijie tong 
wrote:

> Congratulations Charles.
>
> On Sat, Aug 24, 2019 at 11:33 AM Robert Hou  wrote:
>
> > Congratulations Charles, and thanks for your contributions to Drill!
> >
> > Thank you Arina for all you have done as PMC Chair this past year.
> >
> > --Robert
> >
> > On Fri, Aug 23, 2019 at 4:16 PM Khurram Faraaz 
> > wrote:
> >
> > > Congratulations Charles, and thank you Arina.
> > >
> > > Regards,
> > > Khurram
> > >
> > > On Fri, Aug 23, 2019 at 2:54 PM Niels Basjes  wrote:
> > >
> > > > Congratulations Charles.
> > > >
> > > > Niels Basjes
> > > >
> > > > On Thu, Aug 22, 2019, 09:28 Arina Ielchiieva wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > > It has been an honor to serve as Drill Chair during the past year, but it's
> > > > > > high time for the new one...
> > > > >
> > > > > > I am very pleased to announce that the Drill PMC has voted to elect Charles
> > > > > > Givre as the new PMC chair of Apache Drill. He has also been approved
> > > > > > unanimously by the Apache Board in the last board meeting.
> > > > >
> > > > > Congratulations, Charles!
> > > > >
> > > > > Kind regards,
> > > > > Arina
> > > > >
> > > >
> > >
> >
>


Re: [ANNOUNCE] New Committer: Hanumath Rao Maduri

2018-11-01 Thread Chunhui Shi
Congratulations Hanu!
--
From:Arina Ielchiieva 
Send Time:2018 Nov 1 (Thu) 06:05
To:dev ; user 
Subject:[ANNOUNCE] New Committer: Hanumath Rao Maduri

The Project Management Committee (PMC) for Apache Drill has invited Hanumath
Rao Maduri to become a committer, and we are pleased to announce that he
has accepted.

Hanumath became a contributor in 2017, making changes mostly in the Drill
planning side, including lateral / unnest support. He is also one of the
contributors of index based planning and execution support.

Welcome Hanumath, and thank you for your contributions!

- Arina
(on behalf of Drill PMC)


Re: [HANGOUT] [new link] Topics for October 02 2018

2018-10-13 Thread Chunhui Shi
Hi Aman, are you going to send out the slides in another email?

Regards,
Chunhui
--
From:Aman Sinha 
Send Time:2018 Oct 12 (Fri) 10:59
To:user ; dev 
Subject:Re: [HANGOUT] [new link] Topics for October 02 2018

Attached is a PDF version of the slides.  Unfortunately, I don't have a 
recording. 

thanks,
Aman


On Thu, Oct 11, 2018 at 9:39 AM Pritesh Maker  wrote:
Divya -  anyone is welcome to join the hangout! Aman will be sharing the
 slides shortly. We use Google Hangouts which doesn't have the option to
 record the session.

 On Thu, Oct 11, 2018 at 1:06 AM Divya Gehlot 
 wrote:

 > Can we have the recordings of the talk for the benefit of the other drill
 > users in the community or it is a closed affair ?
 >
 >
 > Thanks,
 > Divya
 >
 > On Sat, 29 Sep 2018 at 05:13, Karthikeyan Manivannan  >
 > wrote:
 >
 > > Hi,
 > >
 > > We will have a Drill Hangout on October 2 2018 at 10 AM Pacific Time.
 > > Please suggest topics by replying to this thread.
 > >
 > > We now have a ==new Hangout link== that supports 25 participants
 > >
 > > http://meet.google.com/yki-iqdf-tai
 > >
 > > Please note that in this Hangout, non-MapR participants will have to wait
 > > to be let into the call by a MapR participant. Sorry for the
 > inconvenience.
 > >
 > > Thanks.
 > >
 > > Karthik
 > >
 >


Re: [ANNOUNCE] New Committer: Chunhui Shi

2018-09-28 Thread Chunhui Shi
Thank you Arina, the PMC, and all my driller friends! I deeply appreciate the
opportunity to be part of this growing global community of awesome developers.

Best regards,
Chunhui 


--
From:Arina Ielchiieva 
Send Time:2018 Sep 28 (Fri) 02:17
To:dev ; user 
Subject:[ANNOUNCE] New Committer: Chunhui Shi

The Project Management Committee (PMC) for Apache Drill has invited Chunhui
Shi to become a committer, and we are pleased to announce that he has
accepted.

Chunhui Shi has been a contributor since 2016, making changes in various
Drill areas. He has shown profound knowledge of the Drill planning side during
his work to support lateral join. He is also one of the contributors to the
upcoming feature to support index based planning and execution.

Welcome Chunhui, and thank you for your contributions!

- Arina
(on behalf of Drill PMC)



Re: [IDEAS] Drill start up quotes

2018-09-13 Thread Chunhui Shi
Some more quotes:

We drill to know we're not alone
Good friends, good books, and a drill cluster: this is the ideal life
Outside of a dog, Drill is man's best friend


--
Sender:Arina Yelchiyeva 
Sent at:2018 Sep 11 (Tue) 10:27
To:user 
Cc:dev 
Subject:Re: [IDEAS] Drill start up quotes

Some quotes ideas:

drill never goes out of style
everything is easier with drill

Kunal,
regarding config, sounds reasonable, I'll do that.

Kind regards,
Arina


On Tue, Sep 11, 2018 at 12:17 AM Benedikt Koehler 
wrote:

> You told me to drill sergeant! (Forrest Gump)
>
> Benedikt
> @furukama
>
>
> Kunal Khatua  schrieb am Mo. 10. Sep. 2018 um 21:01:
>
> > +1 on the suggestion.
> >
> > I would also suggest that we change the backend implementation of the
> > quotes to refer to a properties file (within the classpath) rather than
> > have it hard coded within the SqlLine package.  This will ensure that new
> > quotes can be added with every release without the need to touch the
> > SqlLine fork for Drill.
> >
> > ~ Kunal
> > On 9/10/2018 7:06:59 AM, Arina Ielchiieva  wrote:
> > Hi all,
> >
> > we are close to the SqlLine 1.5.0 upgrade, which now has a mechanism to
> > preserve Drill customizations. This one does not include multiline support, but
> > the next release might.
> > You all know that one of the Drill customizations is quotes at startup. I
> > was thinking we might want to freshen up the list a little bit.
> >
> > Here is the current list:
> >
> > start your sql engine
> > this isn't your grandfather's sql
> > a little sql for your nosql
> > json ain't no thang
> > drill baby drill
> > just drill it
> > say hello to my little drill
> > what ever the mind of man can conceive and believe, drill can query
> > the only truly happy people are children, the creative minority and drill
> > users
> > a drill is a terrible thing to waste
> > got drill?
> > a drill in the hand is better than two in the bush
> >
> > If anybody has new serious / funny / philosophical / creative quotes
> > ideas, please share and we can consider adding them to the existing list.
> >
> > Kind regards,
> > Arina
> >
> --
>
> --
> Dr. Benedikt Köhler
> Kreuzweg 4 • 82131 Stockdorf
> Mobil: +49 170 333 0161 • Telefon: +49 89 857 45 84
> Mail: bened...@eigenarbeit.org
>


bi-weekly Hangout at April 17th 10:00am PST

2018-04-16 Thread Chunhui Shi
We will have our routine hangout tomorrow.


Please raise any topic you want to discuss before the meeting or at the 
beginning of the meeting.


https://hangouts.google.com/hangouts/_/event/ci4rdiju8bv04a64efj5fedd0lc


Best,

Chunhui


Re: Drill 1.12 with avatica remote driver not working

2018-01-31 Thread Chunhui Shi
What version of Drill are you using? If you use the latest Drill built from the
master branch, which is now moving to Calcite 1.15 built from MapR's clone, that
might work for you.

From: Stelian Groza 
Sent: Wednesday, January 31, 2018 5:26:22 AM
To: user@drill.apache.org
Subject: Drill 1.12 with avatica remote driver not working

Hi;

I am trying to configure a JDBC storage plugin using the Avatica remote driver as
follows:
{
  "type": "jdbc",
  "driver": "org.apache.calcite.avatica.remote.Driver",
  "url": "jdbc:avatica:remote:url=http://localhost:8082/druid/v2/sql/avatica/",
  "username": "root",
  "password": "mypassword",
  "enabled": true
}

On localhost I have installed Druid (http://druid.io), which listens on port 8082
for SQL requests.
However, the SQL connection can't be established because Drill first sends an
Avatica synchConnection request instead of an Avatica openConnection.
If I use a Java client with Avatica version 1.10.0, everything works
fine.
I also noticed that Druid ships calcite-avatica-1.4.0-r23, which might
not be in sync with 1.10.0.

Is there anything that I could do in order to fix this setup?

Thanks




Re: Records dropping when converting text file to parquet format

2017-10-27 Thread Chunhui Shi
Are any fields null in these dropped records? I think you might need to
file a JIRA and attach the related logs, profiles, and any other observations you
have, e.g. the dropped records, to the JIRA. This sounds like a bug.


From: Frank DeLuccia 
Sent: Friday, October 27, 2017 1:53:18 PM
To: user@drill.apache.org
Subject: Records dropping when converting text file to parquet format

Hello,

I’ve been evaluating drill for a project I’m working on and noticed when I did 
a pipe-delimited CTAS to parquet format, about 1800 records dropped from the 
parquet file that was created.  I have no idea why and there doesn’t seem like 
there’s much information on the web about it.

Have you ever heard of such a thing?

Thanks!

SPS Commerce Infinite Retail Power

Frank DeLuccia
Principal Software Engineer

P: 973-616-6131
fdeluc...@spscommerce.com





Hangout minutes for Oct/3 2017

2017-10-05 Thread Chunhui Shi
Attendees: Sorabh, Sindhu, Padma, Arina, Vitalii, Volodymyr, Vova, Pritesh, 
Aman, Vlad, Boaz


We discussed the 1.12.0 release timeline and might want to set the release
time to early November. Arina offered to act as release manager for this
release and will come up with a timeline proposal. Thanks Arina!


We also talked about some possible features to include in 1.12.0, e.g. the
Kafka storage plugin, and about the progress and obstacles in that work.


No other topic was raised.


Thank you everyone,


Chunhui


Drill hangout is going to be on regular time at 10:00am Pacific Time today

2017-10-03 Thread Chunhui Shi
Please use this link to join the hangout:


 https://plus.google.com/hangouts/_/event/ci4rdiju8bv04a64efj5fedd0lc


Thanks,


Chunhui


Re: error starting sqline in distributed mode on windows server 2012

2017-08-08 Thread Chunhui Shi
You may want to check Drill's ZooKeeper configuration (usually in
drill-override.conf) for the 'zk.connect' option, to see whether the connection
parameters are set to the same values as the ones you used in sqlline.
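
A minimal sketch of that comparison, assuming the sqlline URL has the form
jdbc:drill:zk=<host:port,...>[/<path>] and that zk.connect holds a
comma-separated host:port list. The helper names and the host names in the
usage lines are hypothetical and are not part of Drill:

```python
def zk_hosts_from_jdbc_url(url):
    """Extract the sorted ZooKeeper host:port list from a Drill JDBC URL."""
    prefix = "jdbc:drill:zk="
    if not url.startswith(prefix):
        raise ValueError("not a ZooKeeper-based Drill JDBC URL: " + url)
    rest = url[len(prefix):]
    # Drop trailing ";key=value" properties, then any "/<path>" suffix.
    quorum = rest.split(";", 1)[0].split("/", 1)[0]
    return sorted(h.strip() for h in quorum.split(",") if h.strip())


def zk_hosts_from_connect(zk_connect):
    """Extract the sorted host:port list from a zk.connect value."""
    quorum = zk_connect.split("/", 1)[0]
    return sorted(h.strip() for h in quorum.split(",") if h.strip())


def same_quorum(jdbc_url, zk_connect):
    """True when both settings name the same set of ZooKeeper endpoints."""
    return zk_hosts_from_jdbc_url(jdbc_url) == zk_hosts_from_connect(zk_connect)


# Hypothetical values: copy the real ones from sqlline and drill-override.conf.
print(same_quorum("jdbc:drill:zk=zk1:2181,zk2:2181,zk3:2181",
                  "zk3:2181,zk1:2181,zk2:2181"))
```

An empty host such as ":2181" (as in the redacted output below) is compared
literally, so the check is only meaningful with the real host names filled in.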



From: Divya Gehlot 
Sent: Tuesday, August 8, 2017 1:34:20 AM
To: user@drill.apache.org
Subject: error starting sqline in distributed mode on windows server 2012

Hi,

C:\apache-drill-1.11.0\bin>sqlline.bat -u
> "jdbc:drill:zk=:2181,:2181,:2181"
> DRILL_ARGS - " -u jdbc:drill:zk=:2181,:2181,:2181"
> HADOOP_HOME not detected...
> HBASE_HOME not detected...
> Calculating Drill classpath...
> No active Drillbit endpoint found from ZooKeeper. Check connection
> parameters?
> apache drill 1.11.0
> "got drill?"
> 0: jdbc:drill:zk=:2181,> SELECT * FROM sys.drillbits;
> No current connection
> 0: jdbc:drill:zk=:2181,>



P.S. - removed the real IP addresses for security purposes

Appreciate the help !

Thanks,
Divya


Re: Drill hangout today

2017-06-29 Thread Chunhui Shi
Hangout minutes:


Attendees: Jyothsna, Jinfeng, Pritesh, Boaz, Paul, Arina, John, Rob, etc

Arina, as the release manager for 1.11.0, asked some questions about the
logistics of a new Drill release.

Jinfeng described the workflow, such as getting a PGP key and preparing
candidates.

Arina also wants the team home page to list the current PMC, PMC chair, and
committers, and filed a JIRA for it.


Then John gave an interesting demo of how he set up a Mesos-managed Drill
cluster and accessed it through a Jupyter notebook to run SQL queries in Drill
and plot the results directly in the notebook. Since the demo was so successful
and Paul had already talked with John about Mesos managing a Drill cluster,
there may be some follow-up work to come.


John, could you share the links on how you set up the Mesos-managed cluster, and
how you run the Jupyter notebook to make Spark and Drill communicate with each
other via Spark's dataframe?  I believe the broader community will be very
interested in this work too.


Best,


Chunhui




Re: Drill hangout today

2017-06-27 Thread Chunhui Shi
As usual, the links are:


Hangout link - 
https://plus.google.com/hangouts/_/event/ci4rdiju8bv04a64efj5fedd0lc

Minutes will be posted at 
https://docs.google.com/document/d/1o2GvZUtJvKzN013JdM715ZBzhseT0VyZ9WgmLMeeUUk/edit?ts=5744c15c#heading=h.z8q6drmaybbj

Signup rotation for leading the hangouts is at 
https://docs.google.com/spreadsheets/d/1bEQKk16Kktb1XeZwKD8xCuhaO8FtNfF1Cr2rcTv1a6M/edit#gid=0


From: Chunhui Shi <c...@mapr.com>
Sent: Tuesday, June 27, 2017 9:53:37 AM
To: d...@drill.apache.org; user@drill.apache.org
Subject: Drill hangout today

We are going to have a hangout today at 10:00am Pacific Time. Please feel free 
to raise topics of interests.

Chunhui


Re: querying from multiple directories in S3

2017-05-10 Thread Chunhui Shi
I think what Charles meant was "WHERE (dir2 = 15 AND dir3 < 20) OR (dir2 = 14
AND dir3 > 4)", and of course you need to add dir0 and dir1 for year and month.


And what do you mean by "scan all the files on every query"? Scanning all the
files for one day of data: I thought that was your purpose?
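
A rough sketch of how such a predicate could be generated for any local day,
assuming dir0..dir3 map to year/month/day/hour as described above, that the
directory values compare as numbers the way the thread writes them, and that
the time zone is a fixed whole-hour offset from UTC. The function name
utc_dir_predicate is hypothetical:

```python
from datetime import date, datetime, timedelta, timezone


def utc_dir_predicate(local_day, utc_offset_hours):
    """Build a WHERE-clause fragment covering one local day in UTC directories."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    # Local midnight expressed in UTC, and the end of that local day.
    start = datetime(local_day.year, local_day.month, local_day.day,
                     tzinfo=tz).astimezone(timezone.utc)
    end = start + timedelta(hours=24)

    clauses = []
    cur = start
    while cur < end:
        # Stop at the next UTC midnight or at the end of the local day.
        next_midnight = datetime(cur.year, cur.month, cur.day,
                                 tzinfo=timezone.utc) + timedelta(days=1)
        chunk_end = min(end, next_midnight)
        last_hour = (chunk_end - timedelta(hours=1)).hour
        clauses.append(
            f"(dir0 = {cur.year} AND dir1 = {cur.month} AND dir2 = {cur.day} "
            f"AND dir3 BETWEEN {cur.hour} AND {last_hour})"
        )
        cur = chunk_end
    return " OR ".join(clauses)


# Local day 2017-05-14 at UTC-4 spans UTC hours 4..23 of the 14th
# and hours 0..3 of the 15th.
print(utc_dir_predicate(date(2017, 5, 14), -4))
```

The two day-level clauses are joined with OR, which is the point of the
correction above: a single row can never satisfy both day conditions at once.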


From: Wesley Chow 
Sent: Wednesday, May 10, 2017 9:04:12 AM
To: user@drill.apache.org
Subject: Re: querying from multiple directories in S3

I don't think so, because doesn't AND commute, which would mean dir2 = 15
AND dir2=14 would always be false?

Even if there is some comparison that works, isn't there still an issue
that the S3 file source has to scan all the files on every query?

Wes

On Wed, May 10, 2017 at 8:15 AM, Charles Givre  wrote:

> Hi Wes,
> Are you putting the dirX fields in the WHERE clause?
> I.e., couldn't you do something like:
>
> SELECT  
> FROM s3.data
> WHERE (dir2 = 15 AND dir3 < 20) AND (dir2 = 14 AND dir3 > 4)
>
> In theory this could work for UTC -4.  It’s ugly… but I think it would
> work.
> — C
>
>
>
> > On May 9, 2017, at 10:06, Wesley Chow  wrote:
> >
> > What is the recommended way to issue a query against a large number of
> > tables in S3? At the moment I'm aliasing the table as a giant UNION ALL,
> > but is there a better way to do this?
> >
> > Our data is stored as a time hierarchy, like /MM/DD/HH/MM in UTC, but
> > unfortunately I can't simply run the query recursively on an entire day
> of
> > data. I usually need a day of data in a non-UTC time zone. Is there some
> > elegant way to grab that data using the dir0, dir1 magic columns?
> >
> > Thanks,
> > Wes
>
>


Re: Error trying to query JSON array/MongoDB

2017-04-12 Thread Chunhui Shi
The MongoDB driver library currently used in Drill is 3.2. I don't think Mongo
guarantees forward compatibility, i.e. that a 3.2 library can talk to a 3.4
server. That said, could you try connecting to MongoDB 3.2? If the same problem
persists, then we should debug it.




From: gus 
Sent: Wednesday, April 12, 2017 2:37:27 PM
To: user@drill.apache.org
Subject: Error trying to query JSON array/MongoDB

Hello! I'm using Apache Drill 1.10.0 to query MongoDB-3.4 (linux).
I need to compare one value inside the json array with another collection value.

This is the query:

select fb.v1._ as codigofb, trf.v3 AS topo, fb.v20.a as titulofb,
trf.v20.a AS titulotrf from `filmes` fb JOIN
`trf20170405` trf ON trf.v1._ = fb.v1._;


It prints 100 results and then it gives me this error[1].

Each collection have ~100 MB. And the same error appears when I try to
limit to 100.

This is the example of the document from trf:
https://share.riseup.net/#-vKctuQvhOBQStl6RJ5iRg

Any tips?

cheers!,
gus


[1] error msg:

Error: SYSTEM ERROR: IllegalStateException: You tried to start when you
are using a ValueWriter of type SingleMapWriter.

Fragment 0:0

[Error Id: 0f36b8e6-8f44-4696-a1c3-610a28815d20 on debian:31010]

  (java.lang.IllegalStateException) You tried to start when you are
using a ValueWriter of type SingleMapWriter.

org.apache.drill.exec.vector.complex.impl.AbstractFieldWriter.startList():108
org.apache.drill.exec.vector.complex.impl.SingleMapWriter.startList():98
org.apache.drill.exec.vector.complex.impl.MapOrListWriterImpl.start():68
org.apache.drill.exec.store.bson.BsonRecordReader.writeToListOrMap():83
org.apache.drill.exec.store.bson.BsonRecordReader.writeToListOrMap():112
org.apache.drill.exec.store.bson.BsonRecordReader.write():75
org.apache.drill.exec.store.mongo.MongoRecordReader.next():186
org.apache.drill.exec.physical.impl.ScanBatch.next():179
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51

org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():135
org.apache.drill.exec.record.AbstractRecordBatch.next():162
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
org.apache.drill.exec.physical.impl.join.HashJoinBatch.buildSchema():175
org.apache.drill.exec.record.AbstractRecordBatch.next():142
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51

org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():135
org.apache.drill.exec.record.AbstractRecordBatch.next():162
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51

org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():135
org.apache.drill.exec.record.AbstractRecordBatch.next():162
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51

org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():135
org.apache.drill.exec.record.AbstractRecordBatch.next():162
org.apache.drill.exec.physical.impl.BaseRootExec.next():104
org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():81
org.apache.drill.exec.physical.impl.BaseRootExec.next():94
org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():232
org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():226
java.security.AccessController.doPrivileged():-2
javax.security.auth.Subject.doAs():415
org.apache.hadoop.security.UserGroupInformation.doAs():1657
org.apache.drill.exec.work.fragment.FragmentExecutor.run():226
org.apache.drill.common.SelfCleaningRunnable.run():38
java.util.concurrent.ThreadPoolExecutor.runWorker():1145
java.util.concurrent.ThreadPoolExecutor$Worker.run():615
java.lang.Thread.run():745 (state=,code=0)


3/21 Hangout starts now

2017-03-21 Thread Chunhui Shi
Hi,

I don't have a topic for now. If you have anything you want to raise for
discussion, please reply to this email or join the hangout at
https://hangouts.google.com/hangouts/_/event/ci4rdiju8bv04a64efj5fedd0lc


Thanks,

Chunhui


Re: Kudu plugin

2017-01-25 Thread Chunhui Shi
Could you file a JIRA and work on this update? Thanks.


From: Rahul Raj 
Sent: Wednesday, January 25, 2017 10:03:37 PM
To: user@drill.apache.org
Subject: Kudu plugin

Any experiences with Kudu storage plugin? I could see that there has been
no activity on kudu storage almost for a year.

Getting an error -" out-of-order key" for a query select v,count(k) from
kudu.test group by v where k is the primary key. This happens only when the
aggregation is done on primary key. Should drill move to the latest kudu
client to investigate this further?

Current drill kudu connector uses org.kududb:kudu-client:0.6.0 from
cloudera repository, where the latest released library
org.apache.kudu:kudu-client:1.1.0 is hosted on maven central. There are a
few breaking changes with the new library:

   1. TIMESTAMP renamed to UNIXTIME_MICROS
   2. In KuduRecordReader#setup -
   KuduScannerBuilder#lowerBoundPartitionKeyRaw renamed to lowerBoundRaw
   andKuduScannerBuilder#exclusiveUpperBoundPartitionKeyRaw renamed
   exclusiveUpperBoundRaw. Both methods are deprecated.
   3. In KuduRecordWriterImpl#updateSchema - client.createTable(name,
   kuduSchema) requires CreateTableOperatios as the third argument


Any thoughts on this upgrade?

Regards,
Rahul



Re: Storage Plugin for accessing Hive ORC Table from Drill

2017-01-23 Thread Chunhui Shi
I guess you are using Hive 2.0 as the metastore server while Drill only has the
1.2 libraries.


In Hive 2.0 and above, the delta format can have more than one '_' as a
separator, while in 1.2 it has only one '_'.


I think Drill should eventually be updated to use Hive's 2.0/2.1 libraries.
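
To illustrate the kind of mismatch described here, below is a small sketch of a
tolerant parser. It assumes the delta format refers to directory names shaped
like delta_<min>_<max>, with an optional third numeric field in newer Hive
versions. The exact layout, the sample names, and the function parse_delta_dir
are assumptions for illustration and are not taken from the Hive or Drill code:

```python
def parse_delta_dir(name):
    """Parse 'delta_<min>_<max>' with an optional trailing '_<id>' field."""
    parts = name.split("_")
    if parts[0] != "delta" or len(parts) not in (3, 4):
        raise ValueError("unexpected delta directory name: " + name)
    # Validate explicitly: every field after the prefix must be all digits.
    if not all(p.isdigit() for p in parts[1:]):
        raise ValueError("non-numeric field in delta directory name: " + name)
    min_txn, max_txn = int(parts[1]), int(parts[2])
    extra_id = int(parts[3]) if len(parts) == 4 else None
    return min_txn, max_txn, extra_id


# Hypothetical names in the older and newer styles.
print(parse_delta_dir("delta_0000004_0000004"))
print(parse_delta_dir("delta_0000004_0000004_0000"))
```

A parser that expects exactly one separator after the prefix would hand a
fragment that still contains '_' to the number conversion, which is consistent
with the NumberFormatException for "004_" in the log below.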


From: Anup Tiwari 
Sent: Friday, January 20, 2017 10:07:50 PM
To: user@drill.apache.org; d...@drill.apache.org
Subject: Re: Storage Plugin for accessing Hive ORC Table from Drill

@Andries, We are using Hive 2.1.1 with Drill 1.9.0.

@Zelaine, Could this be a problem in your Hive metastore?--> As i mentioned
earlier, i am able to read hive parquet tables in Drill through hive
storage plugin. So can you tell me a bit more like which type of
configuration i am missing in metastore?

Regards,
*Anup Tiwari*

On Sat, Jan 21, 2017 at 4:56 AM, Zelaine Fong  wrote:

> The stack trace shows the following:
>
> Caused by: org.apache.drill.common.exceptions.DrillRuntimeException:
> java.io.IOException: Failed to get numRows from HiveTable
>
> The Drill optimizer is trying to read rowcount information from Hive.
> Could this be a problem in your Hive metastore?
>
> Has anyone else seen this before?
>
> -- Zelaine
>
> On 1/20/17, 7:35 AM, "Andries Engelbrecht"  wrote:
>
> What version of Hive are you using?
>
>
> --Andries
>
> 
> From: Anup Tiwari 
> Sent: Friday, January 20, 2017 3:00:43 AM
> To: user@drill.apache.org; d...@drill.apache.org
> Subject: Re: Storage Plugin for accessing Hive ORC Table from Drill
>
> Hi,
>
> Please find below Create Table Statement and subsequent Drill Error :-
>
> *Table Structure :*
>
> CREATE TABLE `logindetails_all`(
>   `sid` char(40),
>   `channel_id` tinyint,
>   `c_t` bigint,
>   `l_t` bigint)
> PARTITIONED BY (
>   `login_date` char(10))
> CLUSTERED BY (
>   channel_id)
> INTO 9 BUCKETS
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
> LOCATION
>   'hdfs://hostname1:9000/usr/hive/warehouse/logindetails_all'
> TBLPROPERTIES (
>   'compactorthreshold.hive.compactor.delta.num.threshold'='6',
>   'compactorthreshold.hive.compactor.delta.pct.threshold'='0.5',
>   'transactional'='true',
>   'transient_lastDdlTime'='1484313383');
> ;
>
> *Drill Error :*
>
> *Query* : select * from hive.logindetails_all limit 1;
>
> *Error :*
> 2017-01-20 16:21:12,625 [277e145e-c6bc-3372-01d0-6c5b75b92d73:foreman]
> INFO  o.a.drill.exec.work.foreman.Foreman - Query text for query id
> 277e145e-c6bc-3372-01d0-6c5b75b92d73: select * from
> hive.logindetails_all
> limit 1
> 2017-01-20 16:21:12,831 [277e145e-c6bc-3372-01d0-6c5b75b92d73:foreman]
> ERROR o.a.drill.exec.work.foreman.Foreman - SYSTEM ERROR:
> NumberFormatException: For input string: "004_"
>
>
> [Error Id: 53fa92e1-477e-45d2-b6f7-6eab9ef1da35 on
> prod-hadoop-101.bom-prod.aws.games24x7.com:31010]
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR:
> NumberFormatException: For input string: "004_"
>
>
> [Error Id: 53fa92e1-477e-45d2-b6f7-6eab9ef1da35 on
> prod-hadoop-101.bom-prod.aws.games24x7.com:31010]
> at
> org.apache.drill.common.exceptions.UserException$
> Builder.build(UserException.java:543)
> ~[drill-common-1.9.0.jar:1.9.0]
> at
> org.apache.drill.exec.work.foreman.Foreman$ForemanResult.
> close(Foreman.java:825)
> [drill-java-exec-1.9.0.jar:1.9.0]
> at
> org.apache.drill.exec.work.foreman.Foreman.moveToState(
> Foreman.java:935)
> [drill-java-exec-1.9.0.jar:1.9.0]
> at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.
> java:281)
> [drill-java-exec-1.9.0.jar:1.9.0]
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(
> ThreadPoolExecutor.java:1142)
> [na:1.8.0_72]
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(
> ThreadPoolExecutor.java:617)
> [na:1.8.0_72]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_72]
> Caused by: org.apache.drill.exec.work.foreman.ForemanException:
> Unexpected
> exception during fragment initialization: Internal error: Error while
> applying rule DrillPushProjIntoScan, args
> [rel#4220197:LogicalProject.NONE.ANY([]).[](input=rel#
> 4220196:Subset#0.ENUMERABLE.ANY([]).[],sid=$0,channel_id=$
> 1,c_t=$2,l_t=$3,login_date=$4),
> rel#4220181:EnumerableTableScan.ENUMERABLE.ANY([]).[](table=[hive,
> logindetails_all])]
> ... 4 common frames omitted
> Caused by: java.lang.AssertionError: Internal error: Error while
> applying
> rule 

Re: Exception : IndexOutOfBoundsException: index: 0, length: 264 - ... querying mongodb

2016-12-07 Thread Chunhui Shi
The length of a UTF-8 encoded byte array is not guaranteed to be the same as
String.length().  A fix should be in BsonRecordReader.writeString().
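
A minimal illustration of the mismatch, written in Python rather than Java.
Python's len() counts code points, which matches Java's String.length() for the
characters used here; the helper name utf8_lengths is hypothetical:

```python
def utf8_lengths(text):
    """Return (character count, UTF-8 encoded byte count) for text."""
    return len(text), len(text.encode("utf-8"))


# ASCII text: one byte per character, so both counts agree.
print(utf8_lengths("hello"))   # (5, 5)

# Arabic text: each of these letters takes two bytes in UTF-8.
print(utf8_lengths("مرحبا"))   # (5, 10)
```

A buffer sized from the character count is therefore too small for non-ASCII
text such as the Arabic tweets in this collection; sizing it from the encoded
byte count avoids the out-of-bounds write.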

On Wed, Dec 7, 2016 at 3:11 AM, yousuf  wrote:

>
> Hi
>
> I'm currently exploring Apache Drill, running in cluster mode. My
> data source is MongoDB. My data source table contains 5 million documents. I
> can't execute a simple query
>
> |select body from mongo.twitter.tweets limit 10;|
>
> *Throwing exception*
>
> |Query Failed: An Error Occurred
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR:
> IndexOutOfBoundsException: index: 0, length: 264 (expected: range(0, 256))
>
> Fragment 1:2
>
> [Error Id: 8903127a-e9e9-407e-8afc-2092b4c03cf0 on test01.css.org:31010]
>
> (java.lang.IndexOutOfBoundsException) index: 0, length: 264 (expected: range(0, 256))
> io.netty.buffer.AbstractByteBuf.checkIndex():1134
> io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes():272
> io.netty.buffer.WrappedByteBuf.setBytes():390
> io.netty.buffer.UnsafeDirectLittleEndian.setBytes():30
> io.netty.buffer.DrillBuf.setBytes():753
> io.netty.buffer.AbstractByteBuf.setBytes():510
> org.apache.drill.exec.store.bson.BsonRecordReader.writeString():265
> org.apache.drill.exec.store.bson.BsonRecordReader.writeToListOrMap():167
> org.apache.drill.exec.store.bson.BsonRecordReader.write():75
> org.apache.drill.exec.store.mongo.MongoRecordReader.next():186
> org.apache.drill.exec.physical.impl.ScanBatch.next():178
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
> org.apache.drill.exec.physical.impl.limit.LimitRecordBatch.innerNext():115
> org.apache.drill.exec.record.AbstractRecordBatch.next():162
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
> org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext():94
> org.apache.drill.exec.record.AbstractRecordBatch.next():162
> org.apache.drill.exec.physical.impl.BaseRootExec.next():104
> org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext():92
> org.apache.drill.exec.physical.impl.BaseRootExec.next():94
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():232
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():226
> java.security.AccessController.doPrivileged():-2
> javax.security.auth.Subject.doAs():422
> org.apache.hadoop.security.UserGroupInformation.doAs():1657
> org.apache.drill.exec.work.fragment.FragmentExecutor.run():226
> org.apache.drill.common.SelfCleaningRunnable.run():38
> java.util.concurrent.ThreadPoolExecutor.runWorker():1142
> java.util.concurrent.ThreadPoolExecutor$Worker.run():617
> java.lang.Thread.run():745|
>
> *Working query which is fetching results:*
>
> |select body from mongo.twitter.tweets where tweet_id ='tag:search.twitter.com,2005:xx';|
>
> Sample document in source
>
> |{"_id":ObjectId("58402ad5757d7fede822e641"),
> "rule_list":["x","(contains:x (contains:y OR contains:y1)) OR (contains:v contains:b) OR (contains:v (contains:r OR contains:t))"],
> "actor_friends_count":79,
> "klout_score":19,
> "actor_favorites_count":0,
> "actor_preferred_username":"xxx",
> "sentiment":"neg",
> "tweet_id":"tag:search.twitter.com,2005:x",
> "object_actor_followers_count":1286,
> "actor_posted_time":"2016-07-16T14:08:25.000Z",
> "actor_id":"id:twitter.com:",
> "actor_display_name":"x",
> "retweet_count":6,
> "hashtag_list":["myhashtag"],
> "body":"my tweet body",
> "actor_followers_count":25,
> "actor_status_count":243,
> "verb":"share",
> "posted_time":"2016-08-01T07:49:00.000Z",
> "object_actor_status_count":206,
> "lang":"ar",
> "object_actor_preferred_username":"xx",
> "original_tweet_id":"tag:search.twitter.com,2005:xx",
> "gender":"male",
> "object_actor_id":"id:twitter.com:xxx",
> "favorites_count":0,
> "object_posted_time":"2016-06-20T04:12:02.000Z",
> "object_actor_friends_count":2516,
> "generator_display_name":"Twitter for iPhone",
> "object_actor_display_name":"sdfsf",
> "actor_listed_count":0}|
>
> Any help is appreciated!
>
> Yousuf
>
>


Hangout tomorrow

2016-11-28 Thread Chunhui Shi
Hi all,

We are going to have our bi-weekly hangout tomorrow (11/29/16, 10 AM
PST). Please
add your suggestions or topics to this thread.  We will be asking for
topics at the beginning of the hangout.

Hangout link:
https://hangouts.google.com/hangouts/_/event/ci4rdiju8bv04a64efj5fedd0lc

Regards,
Chunhui


Re: Hangout will start in minutes

2016-08-25 Thread Chunhui Shi
Minutes for Hangout (Aug. 23rd 2016)

1, 1.8.0 release

Jinfeng, as release manager of 1.8.0, mentioned the performance regression
issues.

1, count * from Parquet and count * from JSON table regression: Padma &
Arina were looking at it and Arina will be working on a fix.


2, Metadata cache performance (Aman)

Aman needs some tests to run. Hopefully this one is getting closed.


Besides that, about DRILL-4850, Aman suggested deferring it for this release.
Per Jinxing, this seems to be a random issue: recordBatch not always coming in
late causing schema change, generating a new type (Shawn?), exchange
expects a BigInt.


People also discussed the proposal for improving unit tests. Paul
suggested we may need some way to run only the relevant parts of the unit
tests.

Part suggested to document Jason’s operator tests change so we know how to
leverage it

Then we also discussed build improvements, e.g. splitting java-exec into smaller
pieces, removing stdout output from unit tests, running maven in parallel, timing
unit tests, drillbit restart, etc.

Unit tests fail often on Windows, Mac, and Linux.


Chunhui



On Tue, Aug 23, 2016 at 10:04 AM, Chunhui Shi <c...@maprtech.com> wrote:

> The link of hangout is:
>
> https://plus.google.com/hangouts/_/event/ci4rdiju8bv04a64efj5fedd0lc
>


Hangout will start in minutes

2016-08-23 Thread Chunhui Shi
The link of hangout is:

https://plus.google.com/hangouts/_/event/ci4rdiju8bv04a64efj5fedd0lc


Re: [ANNOUNCE] New PMC Chair of Apache Drill

2016-05-25 Thread Chunhui Shi
Big congratulations to Parth!
Thanks Jacques for founding Drill project and way to go drillers!

Chunhui

On Wed, May 25, 2016 at 11:45 AM, John Omernik  wrote:

> Congratz Parth, and thank you Jacques!
>
> On Wed, May 25, 2016 at 1:25 PM, Xiao Meng  wrote:
>
> > Big congratulations, Parth!
> >
> > And thank you, Jacques, for the leadership and the tremendous
> contributions
> > to the community.
> >
> > Best,
> >
> > Xiao
> >
> > On Wed, May 25, 2016 at 8:35 AM, Jacques Nadeau 
> > wrote:
> >
> > > I'm pleased to announce that the Drill PMC has voted to elect Parth
> > Chandra
> > > as the new PMC chair of Apache Drill. Please join me in congratulating
> > > Parth!
> > >
> > > thanks,
> > > Jacques
> > >
> > > --
> > > Jacques Nadeau
> > > CTO and Co-Founder, Dremio
> > >
> >
>


Re: Drill Issues

2016-05-18 Thread Chunhui Shi
Hi Sanjiv,

3) As for authentication on Windows, I think you already got a proper
reply: you have to implement your own authenticator. Depending on your
Windows environment, you can either call the Windows API to check the
user and password, or connect to Active Directory through LDAP or Kerberos
to authenticate. I think there are some examples on the mailing lists.

2) For this one, I think you did not provide the requested information: the
plan of the problematic query. You can get the plan by running "explain
plan for " in sqlline.

1) For 1 and 2, I think you can file a JIRA for each of the issues, then
post the plan and attach drillbit.log to the JIRA.

On Tue, May 17, 2016 at 11:41 PM, Khurram Faraaz 
wrote:

> Hello Sanjiv,
>
> I work with the Drill team. I don't have your previous email (I joined the
> user group recently). I can take a look at your join query problem and your
> multiple columns having same name and same type issue.
>
> Please share/resend your earlier email, and I can take a look.
> Two things (1) please share the explain plan for your join query problem,
> if you can get one. (2) please share information from drillbit.log related
> to your same name same type many columns issue.
>
> Also I assume your Drillbits are setup on a Linux/UNIX environment.
>
> Thanks,
> Khurram
>
> On Wed, May 18, 2016 at 11:37 AM, Sanjiv Kumar 
> wrote:
>
> > Hello
> >   I am facing some problem while using drill. I have also posted
> > earlier my problem one by one, but didn't get any proper solution for
> that.
> > This time again i am posting my problem.
> >
> > 1) Join Query Problem (for details check in April 16  with *subject*:-
> > Regarding
> > Join Query Problem)
> >
> > 2) Multiple Column having same name & same data type problem (for details
> > check in April 16 *Subject*:- Multiple Column having same name & same
> data
> > type problem)
> >
> > 3) Authentication in Window Operating System (for details check in April
> 16
> > *SUBJECT*:- Regarding Drill Authentication)
> >
> >
> > I hope  this time i get some proper solution from Drill.
> >
> >
> >
> >  ..
> >   Thanks & Regards
> >   *Sanjiv Kumar*
> >
>