[ 
https://issues.apache.org/jira/browse/CARBONDATA-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-2143.
-----------------------------------------
       Resolution: Fixed
    Fix Version/s: 1.3.1

> Fixed query memory leak issue for task failure during initialization of 
> record reader
> -------------------------------------------------------------------------------------
>
>                 Key: CARBONDATA-2143
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-2143
>             Project: CarbonData
>          Issue Type: Bug
>            Reporter: Manish Gupta
>            Assignee: Manish Gupta
>            Priority: Major
>             Fix For: 1.3.1
>
>          Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> *Problem:*
>  Whenever a query is executed, the internalCompute method of the CarbonScanRdd 
> class initializes a record reader. A task completion listener is attached to 
> each task only after the record reader has been initialized.
>  During record reader initialization, the queryResultIterator is initialized and 
> one blocklet is processed. The processed blocklet consumes available unsafe 
> memory.
>  Let's say there are 100 columns and 80 columns get space, but there is no 
> space left in unsafe memory to store the remaining columns. This results in a 
> memory exception, record reader initialization fails, and the query fails.
>  In this case, the unsafe memory already allocated for the 80 columns is never 
> freed and remains occupied for as long as the JVM process persists.
> *Impact*
>  This is a memory leak in the system and can cause queries executed after 
> one query fails for the above reason to fail as well.
> *Exception Trace*
> java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: org.apache.carbondata.core.memory.MemoryException: Not enough memory
>         at org.apache.carbondata.core.scan.processor.AbstractDataBlockIterator.updateScanner(AbstractDataBlockIterator.java:136)
>         at org.apache.carbondata.core.scan.processor.impl.DataBlockIteratorImpl.next(DataBlockIteratorImpl.java:50)
>         at org.apache.carbondata.core.scan.processor.impl.DataBlockIteratorImpl.next(DataBlockIteratorImpl.java:32)
>         at org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.getBatchResult(DetailQueryResultIterator.java:49)
>         at org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.next(DetailQueryResultIterator.java:41)
>         at org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.next(DetailQueryResultIterator.java:31)
>         at org.apache.carbondata.core.scan.result.iterator.ChunkRowIterator.<init>(ChunkRowIterator.java:41)
>         at org.apache.carbondata.hadoop.CarbonRecordReader.initialize(CarbonRecordReader.java:84)
>         at org.apache.carbondata.spark.rdd.CarbonScanRDD.internalCompute(CarbonScanRDD.scala:378)
>         at org.apache.carbondata.spark.rdd.CarbonRDD.compute(CarbonRDD.scala:60)
>         at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>         at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>         at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>         at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>         at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>         at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>         at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>         at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>         at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>         at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>         at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>         at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>         at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>         at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>         at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
>         at org.apache.spark.scheduler.Task.run(Task.scala:99)
>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:322)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
