Hi Juergen, can you share the query you tried to run?
Thanks

On Thu, Jul 23, 2015 at 9:10 AM, Juergen Kneissl <[email protected]> wrote:

> Hi everybody,
>
> I installed and configured a small cluster with two machines (gnu/linux)
> with the following setup:
>
> zookeeper in version 3.4.6, drill in version 1.1.0 and also using
> hadoop (version 2.7.1) hdfs as dist. filesystem.
>
> So, I am playing around a bit, but what I am still not understanding is
> why my drill Foreman bit1 (or whoever that is in the situation) is not
> "really" parallelizing my request. (or do I expect something from the
> architecture that is not intended?)
>
> I select and aggregate on a 1.4 GB gzipped csv file, and I thought at
> least part of the query would be processed on the other drillbit.
> (bit 2)
>
> For instance, in the profiles I see that Major Fragment 01 was divided
> into four Minor Fragments (of which two were forwarded to bit 2)
>
> If I check on the drillbit.log file of the bit2 (in the above
> configuration) a debug message tells me that the incoming record count
> is 0?
>
> The question is: What am I doing wrong in my configuration? Has it
> something to do with using a csv file?
>
> The query is also set in a way that it is clear the whole file has to be
> read in memory. That does not concern me that much, now I just wanted to
> check how the Foreman does the "Parallelization"
>
> Best Regards & Thanks for any hint
>
> Juergen

--
Abdelhakim Deneche
Software Engineer
<http://www.mapr.com/>
