Drill Questions

2018-02-06 Thread Saurabh Mahapatra
Posting this on behalf of John Humphreys, who asked it in the MapR community; I think it may benefit all users: https://community.mapr.com/thread/22719-re-how-can-i-partition-data-in-drill 1. If I had Spark re-partition a data frame based on a column, and then saved the data frame to

RE: Apache Phoenix integration

2018-02-06 Thread Kunal Khatua
The downside of integrating first is that you might spend a lot of time fire-fighting 'bugs' that are already resolved in later versions. Ideally, you want to go the other way around, from the ground up: bring up the platform (CDH in this case) to more recent versions, and then work on

Re: Google Hangouts: Lateral Join High Level Design Presentation

2018-02-06 Thread Aman Sinha
It looks like Volodymyr also had a topic: Decimal types support. He is starting with that. I am not sure if there is going to be sufficient time to cover 2 topics today... On Tue, Feb 6, 2018 at 10:00 AM, Timothy Farkas wrote: > Google Hangout Reminder.

Re: Google Hangouts: Lateral Join High Level Design Presentation

2018-02-06 Thread Timothy Farkas
Google Hangout Reminder. https://hangouts.google.com/hangouts/_/event/ci4rdiju8bv04a64efj5fedd0lc From: Timothy Farkas Sent: Monday, February 5, 2018 12:35:22 PM To: d...@drill.apache.org; user@drill.apache.org Subject: Google Hangouts:

RE: Decimal Support Target Date & Workarounds?

2018-02-06 Thread john.humphreys
Perfect, thank you for the document and thanks so much for your time. -Original Message- From: Vova Vysotskyi [mailto:vvo...@gmail.com] Sent: Tuesday, February 06, 2018 10:10 AM To: user@drill.apache.org Subject: Re: Decimal Support Target Date & Workarounds? There are also problems

Re: Decimal Support Target Date & Workarounds?

2018-02-06 Thread Vova Vysotskyi
There are also problems with aggregate functions. Some of them, for example, return the result with double type instead of decimal. Queries on decimal columns may fail with exceptions, return the wrong result or run slower than similar queries with other types. For more details please see

RE: Decimal Support Target Date & Workarounds?

2018-02-06 Thread john.humphreys
Oh cool, thank you so much for the information. Any chance you can elaborate on the nature of the problems? Assuming you stayed away from decimal UDFs, would doing regular selects and aggregates (like max) on decimal columns cause failures? Also, are we talking about drill-bits actually

Re: Unit testing a simple function

2018-02-06 Thread Niels Basjes
Hi, I've been fiddling around and I have something that seems to work on my machine ... I did a few things: 1) Split the project into two modules: the function and the tests. I did this because of the special packaging requirements that force the source of the function to be included.

Re: Apache Phoenix integration

2018-02-06 Thread Flavio Pompermaier
While it should not be a big problem to provide a cdh5 profile to Apache Drill, my current contribution was just a quick and dirty way to integrate Drill with Phoenix. I think it will be much better to avoid a fork of Apache Drill (i.e. Drillix) and try to merge the 2 things. Then I can work on

Re: Decimal Support Target Date & Workarounds?

2018-02-06 Thread Vova Vysotskyi
Hi John, Enabling decimal support does not make Drill unstable as long as decimal types are not processed. Problems may appear only when you try to use decimal columns or some problematic decimal UDFs. Currently, we are actively working on completing decimal support and I suppose it will be available in

Re: PCAP files with Apache Drill and Sergeant R

2018-02-06 Thread Arjun kr
Hi Houssem, You should be able to query it using the DFS plugin and the S3 storage plugin (I have not tried it with the S3 plugin, though). You can enable the pcap format in the storage plugin definition as given below. "formats": { , "pcap": { "type": "pcap" } } Also, it would be best to use Drill

Re: Apache drill validation error, table not found

2018-02-06 Thread Arjun kr
Is the failure due to the ORC file format not being supported by the DFS/S3 plugin? This error may occur if you are querying an unsupported format or if the format is not defined in the corresponding storage plugin definition. Below is a sample execution for a junk format 'thenga' not defined in storage