Hi Dweep,

This mailing list does not support attachments. Consider filing a JIRA ticket and attaching your images there: [1]

You mention you've assigned Drill 14 GB of heap, and that your task ran out of heap. Drill also uses direct memory to store intermediate data, so I wonder if the error condition is actually about direct memory. How much direct memory have you given to Drill? 14 GB of heap and the default direct memory (8 GB) should be plenty for a query that produces a 48 MB Parquet file, assuming the input is of similar scale: ~200 MB (uncompressed JSON).

You mention that you run the JSON-to-Parquet conversion once per hour. Do you use this Drill instance for any other tasks? Are there other tasks running at the same time? How many Drill nodes are in use?

Finally, you mention you use the REST API. Perhaps something odd is happening there. A stack trace of your error would help. The stack trace may be in the error message, or in the Drill log file.

Thanks,
- Paul

[1] https://issues.apache.org/jira

On Friday, June 7, 2019, 2:36:36 AM PDT, Dweep Sharma <[email protected]> wrote:

Hi Divya,

The size is 48 MB (after converting to Parquet)

On Fri, Jun 7, 2019 at 1:45 PM Divya Gehlot <[email protected]> wrote:

Can you share more details? The query profile and other aspects, such as data size, would give a better view of what’s happening.

Thanks,
Divya

On Fri, 7 Jun 2019 at 4:13 PM, Dweep Sharma <[email protected]> wrote:

> Data is in JSON format.
>
> On Fri, Jun 7, 2019 at 1:39 PM Dweep Sharma <[email protected]>
> wrote:
>
> > Hi,
> >
> > I have a memory leak issue. 14GB memory is assigned to heap but it gets
> > full within a day with just one cron running.
> >
> > Task is a CTAS query from Kafka to S3 once every hour. CTAS is issued via
> > the Drill Rest API.
> >
> > Please assist on a resolution.
> >
> > -Dweep
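
One way to approach the heap-versus-direct-memory question raised in the reply above is to ask Drill itself. The following is a minimal sketch, not a definitive diagnostic: it posts a query against the `sys.memory` system table to Drill's REST endpoint (`/query.json`) and reports heap and direct memory usage per Drillbit. The host and port (`localhost:8047`), the helper names, and the exact column names (`heap_current`, `heap_max`, `direct_current`, `direct_max`) are assumptions that may differ by Drill version and deployment, so check them against your own `sys.memory` output first.

```python
import json
import urllib.request

# Assumption: Drill's web/REST endpoint is reachable here (8047 is the usual HTTP port).
DRILL_URL = "http://localhost:8047"


def build_memory_query():
    """Build the JSON payload for Drill's REST query endpoint."""
    return {"queryType": "SQL", "query": "SELECT * FROM sys.memory"}


def fetch_memory_rows(base_url=DRILL_URL):
    """POST the query to /query.json and return the list of result rows."""
    request = urllib.request.Request(
        base_url + "/query.json",
        data=json.dumps(build_memory_query()).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response).get("rows", [])


def percent(current, maximum):
    """Return current/maximum as a percentage, or None if maximum is zero."""
    return round(100.0 * current / maximum, 1) if maximum else None


def summarize(rows):
    """Reduce each sys.memory row to heap and direct usage percentages.

    Values are converted with int() because the REST API may return them
    as strings. Column names are assumed, not guaranteed.
    """
    summary = []
    for row in rows:
        summary.append({
            "hostname": row.get("hostname"),
            "heap_pct": percent(int(row["heap_current"]), int(row["heap_max"])),
            "direct_pct": percent(int(row["direct_current"]), int(row["direct_max"])),
        })
    return summary


if __name__ == "__main__":
    for entry in summarize(fetch_memory_rows()):
        print(entry)
```

Running this from the same cron schedule as the hourly CTAS, and keeping the output, would show whether heap or direct memory is the one that grows over the day.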
