Running out of heap can also make a Drillbit become unresponsive;
eventually it will die after printing the following message in its
drillbit.out:

Unable to handle out of memory condition in FragmentExecutor

You may want to check your drillbits' drillbit.out for such a message.
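
If your build exposes the sys.memory system table (recent 1.x releases do),
querying it from any node that still responds is a quick way to spot the
drillbit under pressure. A minimal sketch, assuming the usual column names
(adjust if your version differs):

  SELECT hostname,
         heap_current, heap_max,      -- JVM heap in use vs. configured limit
         direct_current, direct_max   -- direct memory in use vs. DRILL_MAX_DIRECT_MEMORY
  FROM sys.memory;

A drillbit whose heap_current sits near heap_max is the likely one to
restart; direct_current near direct_max points at the direct memory limit
instead.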

On Mon, Jun 13, 2016 at 4:27 PM, John Omernik <[email protected]> wrote:

> I'd like to talk about that on the hangout. Drill should do better at
> failing with a clean OOM error rather than having a bit go unresponsive.
> Can just that bit be restarted to return to a copacetic state? As an admin,
> if this is the case, how do I find this bit?
>
> Other than adding RAM, are there any query tuning settings that could help
> prevent the unresponsive bit? (I see this as two issues: the memory
> settings for the 1024m block size CTAS, and how we can prevent a bit
> from going unresponsive.)
> On Jun 13, 2016 6:19 PM, "Parth Chandra" <[email protected]> wrote:
>
> The only time I've seen a drillbit get unresponsive is when you run out of
> Direct memory. Did you see any 'Out of Memory Error' in your logs? If you
> see those, then you need to increase the direct memory setting for the JVM
> (DRILL_MAX_DIRECT_MEMORY in drill-env.sh).
>
>
>
>
> On Mon, Jun 13, 2016 at 4:10 PM, John Omernik <[email protected]> wrote:
>
> > The 512m block size worked. My issue with the 1024m block size was on the
> > write side, using a CTAS... that's where my nodes got into a bad state,
> > thus I am wondering what setting on Drill would be the right one to help
> > with node memory pressure on a CTAS using a 1024m block size.
> > On Jun 13, 2016 6:06 PM, "Parth Chandra" <[email protected]> wrote:
> >
> > In general, you want to make the Parquet block size and the HDFS block size
> > the same. A Parquet block size that is larger than the HDFS block size can
> > split a Parquet block (i.e. row_group) across nodes, and that will
> > severely affect performance as data reads will no longer be local. 512 MB
> > is a pretty good setting.
> >
> > Note that you need to ensure the Parquet block size is set in the source
> > file, which may have been produced outside of Drill. So you will need to
> > make the change in the application used to write the Parquet file.
> >
> > If you're using Drill to write the source file as well then, of course, the
> > block size setting will be used by the writer.
> >
> > If you're using the new reader, then there is really no knob you have to
> > tweak. Is parquet-tools able to read the file(s)?
> >
> >
> >
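
For reference, the writer-side knob Parth is describing is a Drill session
(or system) option; a minimal sketch for the 512 MB case, where 536870912
bytes should match whatever your MFS/HDFS block size is set to:

  -- write 512 MB row groups so they line up with the file system block size
  ALTER SESSION SET `store.parquet.block_size` = 536870912;

This only affects files Drill writes; files produced by another application
have to be rewritten with that application's own block size setting.
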
> > On Mon, Jun 13, 2016 at 1:59 PM, John Omernik <[email protected]> wrote:
> >
> > > I am doing some performance testing, and per the Impala documentation, I am
> > > trying to use a block size of 1024m in both Drill and MapR FS. When I set
> > > the MFS block size to 512 and the Drill block size to its default, I saw
> > > some performance improvements, and I wanted to try 1024 to see how it
> > > worked. However, my query hung and I got into that "bad state" where the
> > > nodes are not responding right and I have to restart my whole cluster (it
> > > really bothers me that a query can make the cluster unresponsive).
> > >
> > > That said, what memory settings can I tweak to help the query work? This is
> > > quite a bit of data: a CTAS from Parquet to Parquet, 100-130G of data per
> > > day (I am doing a day at a time), 103 columns. I have to use the
> > > "use_new_reader" option due to my other issues, but other than that I am
> > > just setting the block size on MFS and then updating the block size in
> > > Drill, and it's dying. Since this is a simple CTAS (no sort), which settings
> > > could be beneficial for what is happening here?
> > >
> > > Thanks
> > >
> > > John
> > >
> >
>
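
Putting the thread together, a hedged sketch of what the session setup for
the 1024m CTAS might look like; the option names are real Drill options, but
the width value is only illustrative and the table name and source path are
placeholders:

  ALTER SESSION SET `store.parquet.use_new_reader` = true;    -- the reader workaround John mentions
  ALTER SESSION SET `store.parquet.block_size` = 1073741824;  -- 1024m Parquet row groups

  -- each writer fragment buffers roughly one row group before flushing, so
  -- fewer fragments per node means fewer 1 GB buffers held at once
  ALTER SESSION SET `planner.width.max_per_node` = 4;         -- illustrative value

  -- placeholder table name and source path
  CREATE TABLE dfs.tmp.`events_1day` AS
  SELECT * FROM dfs.`/data/source/2016-06-01`;

Whether lowering the per-node width leaves enough headroom for 1 GB row
groups on this cluster is exactly the open question in the thread.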



-- 

Abdelhakim Deneche

Software Engineer

  <http://www.mapr.com/>

