Re: Heap memory - leak

Paul Rogers Fri, 07 Jun 2019 16:50:35 -0700

Hi Dweep,

This mailing list does not support attachments. Consider filing a JIRA ticket 
and attaching your images there: [1]

You mention you've assigned Drill 14 GB of heap. You also mention that your 
task ran out of heap. As it turns out, Drill also uses direct memory to store 
intermediate data. I wonder if the error condition is actually about direct 
memory. How much direct memory have you given to Drill?
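
To make the heap versus direct memory distinction concrete, here is a minimal Java sketch that reads both numbers from a JVM using the standard java.lang.management beans. The class and method names (MemoryReport, heapUsage, directBytesUsed) are made up for illustration and are not part of Drill. It reports on the JVM it runs in, and it only sees direct buffers that the JDK itself tracks, so memory taken by a custom allocator may be undercounted.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper, not a Drill class: reports heap and direct-buffer
// usage for the JVM in which it runs.
public class MemoryReport {

    /** Heap usage of the current JVM, as reported by the MemoryMXBean. */
    static MemoryUsage heapUsage() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    }

    /**
     * Bytes held by the JVM's "direct" buffer pool, or -1 if the JVM
     * does not report such a pool. Only JDK-tracked direct buffers are
     * counted here.
     */
    static long directBytesUsed() {
        for (BufferPoolMXBean pool :
                ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1L;
    }

    public static void main(String[] args) {
        MemoryUsage heap = heapUsage();
        // getMax() may be -1 when the maximum heap size is undefined.
        System.out.printf("heap used=%d bytes, max=%d bytes%n",
                heap.getUsed(), heap.getMax());
        System.out.printf("direct buffers used=%d bytes%n", directBytesUsed());
    }
}
```

If the heap figure stays well below its maximum while the failure still occurs, that would point toward direct memory rather than heap.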

14 GB of heap and the default direct memory (8 GB) should be plenty for a query 
that produces a 48 MB Parquet file, assuming the input is of a similar order: 
roughly 200 MB of uncompressed JSON.

You mention that you run the JSON-to-Parquet conversion once per hour. Do you 
use this Drill instance for any other tasks? Are there other tasks running at 
the same time? How many nodes of Drill are in use? 

Finally, you mention you use the REST API. Perhaps something odd is happening 
there. A stack trace of your error would help. The stack trace may be in the 
error message, or in the Drill log file.

Thanks,
- Paul

[1] https://issues.apache.org/jira

On Friday, June 7, 2019, 2:36:36 AM PDT, Dweep Sharma <[email protected]> wrote:

Hi Divya,

The size is 48 MB (after converting to Parquet)

On Fri, Jun 7, 2019 at 1:45 PM Divya Gehlot <[email protected]> wrote:

Can you share more details?
The query profile and other aspects, such as the data size, would give a
better view of what's happening.

Thanks,
Divya

On Fri, 7 Jun 2019 at 4:13 PM, Dweep Sharma <[email protected]> wrote:

> Data is in JSON format.
>
> On Fri, Jun 7, 2019 at 1:39 PM Dweep Sharma <[email protected]>
> wrote:
>
> > Hi,
> >
> > I have a memory leak issue. 14 GB of memory is assigned to the heap, but
> > it fills up within a day with just one cron job running.
> >
> > The task is a CTAS query from Kafka to S3 once every hour. The CTAS is
> > issued via the Drill REST API.
> >
> > Please assist on a resolution.
> >
> > -Dweep
> >
>
>
