Re: Multiple fragments in apache drill

Kwizera hugues Teddy Fri, 15 Feb 2019 07:25:09 -0800

Hi Kunal Khatua,

Thanks for your helpful response.


Kind regards, Hugues

On Wed, Feb 13, 2019 at 8:22 PM Kunal Khatua <[email protected]> wrote:

> Hi Hugues
>
> The number of fragments is determined by the number of sources (i.e.
> whether the data can be read in parallel) and the number of estimated rows.
> CSV and Parquet files are easy to read in parallel, but JSON files are
> not, because Drill does not know how many JSON documents exist in the file
> and where their offsets are.
>
> The number of estimated rows tells Drill whether to parallelize a major
> fragment of operators. You can try reducing the planner.slice_target
> option at the session or system level via the web UI (/options page).
>
> ~ Kunal
>
> On 2/13/2019 7:14:34 AM, Kwizera hugues Teddy <[email protected]> wrote:
> Hello Drill team,
>
> I'm executing a query on an Apache Drill cluster; however, it produces only 1
> minor fragment. I have tried various queries, such as a union of 2 queries
> and aggregations, running them on millions of records, but Drill still
> creates only 1 fragment. Is there any configuration change I can make to
> get multiple fragments, so that they can be executed on each drillbit
> individually? How can I confirm whether the query is being executed on
> 1 drillbit instance or on multiple instances?
>
> - We are trying to compare Impala vs Drill, but for the moment Impala is
> faster than Drill.
>
> - Environment :
>
> Drill on YARN, with 6 drillbits.
>
>
> Regards Hugues Teddy
>
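
To make the role of planner.slice_target in the reply above more concrete, here is a rough sketch in Python. It is a toy model, not Drill's actual planner code: the function name estimate_width, the rounding rule, and the assumed default of 100000 rows are illustrative assumptions. The caps loosely correspond to the planner.width.max_per_node and planner.width.max_per_query options. It only shows the general idea that a small row estimate relative to the slice target keeps a major fragment at a single minor fragment.

```python
import math


def estimate_width(estimated_rows, slice_target=100_000,
                   drillbits=1, max_per_node=1, max_per_query=1000):
    """Toy estimate of how many minor fragments a major fragment might get.

    Hypothetical simplification: one minor fragment per `slice_target`
    estimated rows, capped by the cluster-wide and per-query width limits.
    """
    wanted = max(1, math.ceil(estimated_rows / slice_target))
    return min(wanted, drillbits * max_per_node, max_per_query)


if __name__ == "__main__":
    # 6 drillbits as in the question; 4 fragments per node is an assumed value.
    for rows, target in [(50_000, 100_000), (1_000_000, 100_000), (1_000_000, 1_000)]:
        width = estimate_width(rows, target, drillbits=6, max_per_node=4)
        print(f"rows={rows} slice_target={target} -> width={width}")
```

In this model, lowering the slice target is what moves a query from one minor fragment to several. In Drill itself the option can also be changed from SQL, for example with ALTER SESSION SET `planner.slice_target` = 1000, and the query profile in the web UI lists the minor fragments and the drillbits that ran them.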
