Re: Apache Drill on Kubernetes

2018-07-26 Thread Saurabh Mahapatra
Hey Arjun, Is the need for Kubernetes a top-down requirement in your architecture? John is right when it comes to running Drill inside a container. But there was some talk of addressing the other problem, which is whether K8s can be a resource manager for multiple Drill clusters...an

Re: Help with Apache Drill - S3 compatible storage connectivity

2018-06-26 Thread Saurabh Mahapatra
I mean seriously. Dummy id? Sent from my iPhone > On Jun 26, 2018, at 11:30 AM, dummy id wrote: > > Can i get an update on this please? > >> On Fri, Jun 15, 2018 at 11:36 AM, dummy id wrote: >> >> *Team, * >> >> >> >> *I am not sure who can help me out with this, so just adding both

Re: is there any way to download the data through Drill Web UI?

2018-05-16 Thread Saurabh Mahapatra
nk BI tools should have a way to export the data to csv. Tableau does. > > > Thanks. > > > --Robert > > > From: Saurabh Mahapatra > Sent: Tuesday, May 15, 2018 6:56 PM > To: Divya Gehlot > Cc: user@drill.apache.org > Subject:

Re: is there any way to download the data through Drill Web UI?

2018-05-15 Thread Saurabh Mahapatra
eam(BI > analysts) who are not big data evangelist and they prefer the feasible way to > download the data , ad they query for their analysis > > Thanks, > Divya > >> On 14 May 2018 at 22:54, Saurabh Mahapatra <saurabhmahapatr...@gmail.com> >> wrote: >

Re: is there any way to download the data through Drill Web UI?

2018-05-14 Thread Saurabh Mahapatra
Hi Divya, Are you looking for a button in the UI that downloads the table into a (say) CSV file? Thanks, Saurabh Sent from my iPhone > On May 14, 2018, at 7:21 AM, Vitalii Diravka > wrote: > > Hi Divya, > > Consider using of CTAS for

Re: Apache Drill + Google Storage via GCS Connector and Dataproc

2018-05-07 Thread Saurabh Mahapatra
Hey Joe, There are two options I see if you are looking for a quick turnaround on this. 1. Use affordable consulting services to get this work done and this would include support beyond your current problems. This would include making sure that the production system works as intended and you

Re: [Drill 1.12.0] : Suggestions on Downgrade to 1.11.0 & com.mysql.jdbc.exceptions.jdbc4.CommunicationsException

2018-03-16 Thread Saurabh Mahapatra
Anyone have any suggestion on this? Makes me wonder if anything changed? On Fri, Mar 16, 2018 at 1:15 AM, Anup Tiwari wrote: > Hi All, > We checked our MySQL max number of connections which is set to 200 and i > think > this might be due to exceeding max number of

Re: How to get mongo data into saiku using apache drill

2018-03-16 Thread Saurabh Mahapatra
There is a video that was released recently: https://www.youtube.com/watch?v=91uufmvnigY I believe that Tom Barber from Spicule is on this mailing list and should be able to help. Thanks, Saurabh On Fri, Mar 16, 2018 at 2:09 PM, Kunal Khatua wrote: > For security reasons,

Re: Storage plugin for GraphQL

2018-03-16 Thread Saurabh Mahapatra
It's actually quite interesting, providing an alternative to REST: http://graphql.org/ On Fri, Mar 16, 2018 at 4:02 PM, Kunal Khatua wrote: > What is the data format? > > If you have a JDBC driver for that, you should be able to query it. > > On 2/24/2018 9:01:43 PM,

Participate in the Apache Drill Poll on Twitter

2018-03-15 Thread Saurabh Mahapatra
Participate in the Apache Drill Poll and have your voice heard through ONE vote: https://lnkd.in/gfWWXGd

Re: Way to "pivot"

2018-03-06 Thread Saurabh Mahapatra
Looks like SQL Server supports it, not sure if this is in the SQL standard: https://stackoverflow.com/questions/15931607/convert-rows-to-columns-using-pivot-in-sql-server On Tue, Mar 6, 2018 at 11:47 AM, Kunal Khatua wrote: > Not until now :) > > Can you file a JIRA

Re: Drill Blog on Medium.com

2018-03-05 Thread Saurabh Mahapatra
https://medium.com/@ApacheDrill/ If you have any articles you want to post, please send them to this list or email me directly. Thanks, Saurabh On Mon, Mar 5, 2018 at 4:06 PM, Charles Givre <cgi...@gmail.com> wrote: > +1 > > > On Mar 5, 2018, at 15:01, Saurabh Mahapatra &

Drill Blog on Medium.com

2018-03-05 Thread Saurabh Mahapatra
Hi all, I would like to propose a Drill blog at medium.com. As a blogging platform, it is very easy to use, maintain and has excellent reach across all kinds of users. What I really like about this platform is the speed with which we can get posts out. Many of you have reached out to me asking

Spicule Consulting Demo of Apache Drill and Saiku Analytics Enterprise

2018-02-26 Thread Saurabh Mahapatra
As I have said and will say again, Drill adoption continues to make a resurgence. And I am really happy to see consulting companies and open source tools work with us. Really nice to see this video from Spicule Consulting who are experts in BI and analytics. Check their twitter here:

Apache Drill 1.12 Released on MapR Platform

2018-02-08 Thread Saurabh Mahapatra
Hi all, In full disclosure, I work as a product manager at MapR. As some of you may know, we at MapR promote the adoption of Apache Drill by promoting it on our platform. Just wanted to share this blog post with the community as your work is reaching a wider group of users in the

Which Hadoop File Format Should I Use?

2018-02-07 Thread Saurabh Mahapatra
Originally shared with me by Kunal Khatua, and a good read: https://www.jowanza.com/blog/which-hadoop-file-format-should-i-use The Carbondata project looks quite promising. Any thoughts on what file format you prefer? Thanks, Saurabh

Drill Questions

2018-02-06 Thread Saurabh Mahapatra
Posting this for John Humphreys who posted this in the MapR community but I think this may benefit all users: https://community.mapr.com/thread/22719-re-how-can-i-partition-data-in-drill 1. If I had Spark re-partition a data frame based on a column, and then saved the data frame to

Re: Apache Drill with Azure Data Lake Store

2018-02-04 Thread Saurabh Mahapatra
Hi Kamal, My understanding was that the file system running on top of Azure data store was still HDFS? Is that true? If that be the case, the DFS plugin should work. It is worth a test. Thanks, Saurabh Sent from my iPhone > On Feb 3, 2018, at 6:02 PM, Kamal Baig wrote:

Re: Apache Phoenix integration

2018-02-02 Thread Saurabh Mahapatra
ler.java:141) >>> org.eclipse.jetty.server.handler.HandlerWrapper.handle( >>> HandlerWrapper.java:97) >>> org.eclipse.jetty.server.Server.handle(Server.java:462) >>> org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:279) >>> org.

Re: Does Drill support Full-text search as in Elasticsearch

2017-12-06 Thread Saurabh Mahapatra
Hey Ayush, Just to add to what Aman said: I think the request that I have heard is a SQL interface to Elasticsearch, especially when you are joining results with another data source like an RDBMS. Is that what you are trying to accomplish? But the guts of the technology, i.e. indexing, search,

Apache Arrow Integration

2017-11-16 Thread Saurabh Mahapatra
Hi all, I wanted to get some thoughts on leveraging Apache Arrow for improving Drill speed. I believe this was discussed in the Drill hackathon in September. So what was decided? Any thoughts are more than welcome. Am I right when I say that leveraging an in-memory representation like Arrow is

Re: Can Apache Drill perform streaming queries?

2017-11-09 Thread Saurabh Mahapatra
Isn't there the new Kafka plugin? What does that exactly do? Best, Saurabh Sent from my iPhone > On Nov 9, 2017, at 5:15 AM, kant kodali wrote: > > Hi Tug, > > It's Parquet data on HDFS and the data to HDFS is constantly written by > spark while consuming from Kafka. >

Apache Drill Cookbook

2017-11-03 Thread Saurabh Mahapatra
Hi all, I was curious if there is anyone in the community writing a book on Apache Drill. I think the most popular query engines have O'Reilly books on them and I think we should have our own. Given the interest and adoption of Drill in recent months, I think this is an absolute must. Any

Re: Drill performance question

2017-10-30 Thread Saurabh Mahapatra
o:ted.dunn...@gmail.com] > Sent: Monday, October 30, 2017 9:34 AM > To: user <user@drill.apache.org> > Subject: Re: Drill performance question > > Also, on a practical note, Parquet will likely crush CSV on performance. > Columnar. Compressed. Binary. All that. > >

Re: looking for dotnet drivers to access the drill

2017-10-30 Thread Saurabh Mahapatra
Hi Ram, I am curious if you are trying to access Drill from a .NET application. Is this a C# or Java app? Not sure about your specifics but you most probably can use the ODBC driver. See this: https://msdn.microsoft.com/en-us/library/ms228366(v=vs.90).aspx I do not know enough about what you

Re: Drill performance question

2017-10-30 Thread Saurabh Mahapatra
Hi Charles, Can you share some query patterns on this data? More specifically, the number of columns you are retrieving out of the total, the filter on the time dimension itself (ranges and granularities), and how much is ad hoc and how much is not. Best, Saurabh On Mon, Oct 30, 2017 at 9:27 AM,

Re: YARN support for Drill

2017-10-26 Thread Saurabh Mahapatra
This is absolutely great news for the community!! Thanks for letting us know. Will this be available in the 1.12 release due in November/December? Best, Saurabh On Thu, Oct 26, 2017 at 10:23 AM, Paul Rogers wrote: > Hi All, > > We just issued a PR #1011 [1] to run Drill as

S3 Plugin: Question on Object Metadata

2017-10-24 Thread Saurabh Mahapatra
Hi there, I would like Drill to process specific S3 objects within a bucket that match certain object metadata criteria. I intend to add custom metadata to my S3 objects: http://docs.aws.amazon.com/AmazonS3/latest/user-guide/add-object-metadata.html Does the S3 plugin support this? If

Re: Benchmark numbers using Drill

2017-10-24 Thread Saurabh Mahapatra
wrote: > Yes a very good info which helps a lots of ppl like me who is using Drill > as one of their production environment > can't we share this information as recommendation to Drill users on the > Apache Drill KB ? > > On 20 October 2017 at 01:58, Saurabh Mahapatra < > saurabhm

Re: Benchmark numbers using Drill

2017-10-19 Thread Saurabh Mahapatra
I do not think you will get such information about benchmarks from customers on production workloads. But from the customers I have worked with who have taken Drill to production, here is some information that may be of use to you: 1. The trend universally has been to use beefier machines for

Re: User client timeout with results > 2M rows

2017-09-21 Thread Saurabh Mahapatra
Brings us back to the feedback that John Omernik has given time and time again that we need to improve the error/warning messages. Wasn't this discussed at a session on the user experience at the Drill hackathon this week? Best, Saurabh On Thu, Sep 21, 2017 at 10:36 AM, Kunal Khatua

Re: Does Drill Use Apache Struts

2017-09-08 Thread Saurabh Mahapatra
Thanks John, all. I think this discussion thread is important. As a community member, I learn so much by reading these threads. Since you work in cyber security research, are there specific things we should think about from a security standpoint for Drill? I know that we have a REST API and

New Oracle Data Visualization Integration with Apache Drill

2017-09-05 Thread Saurabh Mahapatra
From the blog: https://blogs.oracle.com/analyticscloud/how-to-explore-data-sets-simultaneously-with-oracle-data-visualization?platform=hootsuite Best, Saurabh

Apache Drill vs Amazon Athena – A Comparison on Data Partitioning

2017-09-05 Thread Saurabh Mahapatra
Drill is getting more traction with comparisons:) Saw this blog post from Treselle Engineering. You should be able to see this on Twitter @ApacheDrill http://www.treselle.com/blog/apache-drill-vs-amazon-athena-a-comparison-on-data-partitioning/ Best, Saurabh

Re: Avro - Let's talk Avro again

2017-08-18 Thread Saurabh Mahapatra
Thank you for this candid feedback, Stefan. The fact that you even decided to write an email offering this feedback despite moving away from Drill just suggests to me that you are still a supporter. We need all the help that we can get from every member in this community to make Drill provide

Blog for Apache Drill

2017-08-08 Thread Saurabh Mahapatra
Hi all, Any ideas on how we can make this happen? I was thinking of keeping a dedicated page on Medium and having it integrated into the website. Ideas please? I would like to start collecting some topics that we can talk about. Thanks, Saurabh

Re: Drill performance tuning parquet

2017-08-01 Thread Saurabh Mahapatra
> > I don't think I am CPU bound based on the stats that EC2 gave me. I > > am using this example to both learn how to scale drill and also how to > > understand it. > > > > We are considering using it for our text file processing as an > > exploratory too

Re: Drill and Tableau

2017-07-31 Thread Saurabh Mahapatra
then create a nested directory structure > Something like as shown below > /path/to/directory > /country1 >/datafiles.parquet > /country2 > datafiles.parquet > > Thanks, > Divya > > > > On 26 July 2017 at 03:21, Sau

Re: Elastic Search Plugins

2017-07-28 Thread Saurabh Mahapatra
I concur. This would be very valuable for several users that I know who have asked for this. Can we have this as an agenda item for the next hangout? Thanks, Saurabh On Fri, Jul 28, 2017 at 11:49 AM, Charles Givre wrote: > I completely concur on that!! It seems as if the PR

Document Plugins and Integrations for Drill

2017-07-28 Thread Saurabh Mahapatra
Hi all, I am beginning to see that a variety of plugins and integrations have been built for Drill, and I keep discovering them every day. Here are two that I came across today: Excel plugin: https://github.com/bizreach/drill-excel-plugin/ Ambari integration:

Re: How much more information can error messages provide?

2017-07-27 Thread Saurabh Mahapatra
I completely agree with John. I think we need to make error/warning messages more friendly moving forward with any new features that ship. Please share the JIRA that you create. But a holistic approach scares me. How would we prioritize the ones that would impact most users? Any thoughts on that?

Re: Drill performance tuning parquet

2017-07-27 Thread Saurabh Mahapatra
Hi Dan, Here are some thoughts from my end. So this is just one query and you have the numbers. But how about a representative collection? Do you have the use cases? Now, I know from experience that if you can predict the pattern of the queries to about 60%, that would be great. The rest could

Re: append data to already existing table saved in parquet format

2017-07-26 Thread Saurabh Mahapatra
for now. > > - Paul > >> On Jul 26, 2017, at 8:46 AM, Saurabh Mahapatra < > saurabhmahapatr...@gmail.com> wrote: >> >> Does Drill provide that kind of functionality? Theoretically yes. CTAS >> should work. But your cluster has to be sized. But I would n

Re: append data to already existing table saved in parquet format

2017-07-26 Thread Saurabh Mahapatra
I always recommend against using CTAS as a shortcut for a large ETL-type workload. You will need to size your Drill cluster accordingly. Consider using Hive or Spark instead. What are the source file formats? For every hour, what is the size and the number of rows for that data? Are you doing any

Re: Drill and Tableau

2017-07-25 Thread Saurabh Mahapatra
Hi Divya, There is no such thing as a naive question. Please feel free to post any questions you have. There is someone in the community who will help you out. This is my opinion: There are a variety of BI tools in the market that offer excellent visualization and interaction with data capabilities.