Hi Kunal Khatua,

Thanks for your helpful response.
Kind Hugues

On Wed, Feb 13, 2019 at 8:22 PM Kunal Khatua <[email protected]> wrote:

> Hi Hugues
>
> The number of fragments is determined by the number of sources (i.e.
> whether the data can be read in parallel) and the number of estimated rows.
> CSV and Parquet files are easy to read in parallel, but JSON files are
> not, because Drill does not know how many JSON documents exist in the file
> or where their offsets are.
>
> The number of estimated rows tells Drill whether to parallelize a major
> fragment of operators. You can try reducing this property in your
> session/system via the UI [/options page]:
> planner.slice_target
>
> ~ Kunal
>
> On 2/13/2019 7:14:34 AM, Kwizera hugues Teddy <[email protected]> wrote:
> Hello Team Drill,
>
> I'm executing a query on an Apache Drill cluster; however, it is making only 1
> minor segment. I have tried various queries, such as a union of 2 queries,
> aggregation, etc., and executed them on millions of records, but it is
> still making 1 fragment only. Is there any configuration change that I can
> make to get multiple segments, so that these can be executed on each
> drillbit individually? How can I confirm whether the query is being
> executed on 1 drillbit instance or on multiple instances?
>
> - We are trying to compare Impala vs Drill, but for the moment Impala is
> faster than Drill.
>
> - Environment:
>
> Drill on YARN, with 6 drillbits.
>
> Regards
> Hugues Teddy
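For anyone following the thread, here is a rough sketch of how the planner.slice_target suggestion above could be applied from a script instead of the UI options page. It goes through Drill's REST endpoint (/query.json). The host and port (localhost:8047), the value 1000, and the helper names are my own assumptions for illustration, not something from Kunal's reply. It uses ALTER SYSTEM because I am not sure a session-level setting would carry over between separate REST calls; note that a system-level change affects every user of the cluster.

```python
import json
import urllib.request

# Assumption: a drillbit web server is reachable here (8047 is Drill's default HTTP port).
DRILL_URL = "http://localhost:8047/query.json"


def slice_target_statement(rows: int, scope: str = "SYSTEM") -> str:
    """Build the ALTER statement that sets planner.slice_target to `rows`."""
    if scope not in ("SYSTEM", "SESSION"):
        raise ValueError("scope must be SYSTEM or SESSION")
    if rows < 1:
        raise ValueError("rows must be a positive integer")
    return f"ALTER {scope} SET `planner.slice_target` = {rows}"


def rest_payload(sql: str) -> bytes:
    """Encode a SQL statement as the JSON body accepted by /query.json."""
    return json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")


def run(sql: str, url: str = DRILL_URL) -> dict:
    """POST one statement to Drill and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=rest_payload(sql),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # 1000 is only an illustrative value; choose one below your estimated row count.
    print(run(slice_target_statement(1000)))
    # Read the option back to confirm that the change was applied.
    print(run("SELECT * FROM sys.options WHERE name = 'planner.slice_target'"))
```

After lowering the value, re-running the query and opening its profile in the web UI should show whether more than one minor fragment was created and which drillbit hosts ran them.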
