[jira] [Commented] (CARBONDATA-241) OOM error during query execution in long run
[ https://issues.apache.org/jira/browse/CARBONDATA-241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15499676#comment-15499676 ]

ASF GitHub Bot commented on CARBONDATA-241:
-------------------------------------------

Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-carbondata/pull/158

> OOM error during query execution in long run
>
>                 Key: CARBONDATA-241
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-241
>             Project: CarbonData
>          Issue Type: Bug
>            Reporter: kumar vishal
>            Assignee: kumar vishal
>
> **Problem:** During a long run, query execution takes progressively more time and eventually throws an out-of-memory error.
> **Reason:** During compaction, segments are merged, and each segment's metadata is loaded in memory. After compaction the compacted segments become invalid, but their metadata is not removed from memory. This duplicate metadata piles up, consuming more and more memory, and after a few days query execution throws OOM.
> **Solution:** Remove invalid blocks from memory.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
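The fix described above amounts to evicting the metadata of invalidated segments from the in-memory cache once compaction marks them invalid. A minimal sketch of that idea follows; the class and method names here are illustrative assumptions, not CarbonData's real API:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Minimal sketch of the fix: after compaction, the metadata of now-invalid
 * segments must be evicted from the in-memory cache, otherwise stale entries
 * accumulate across compactions until the JVM runs out of memory.
 */
class SegmentMetadataCache {

  // segmentId -> loaded block metadata for that segment (names are illustrative)
  private final Map<String, List<String>> segmentToBlocks = new HashMap<>();

  /** Register a segment's block metadata, as done when a query loads it. */
  void load(String segmentId, List<String> blockMetadata) {
    segmentToBlocks.put(segmentId, new ArrayList<>(blockMetadata));
  }

  /** Evict every invalid segment's metadata; returns how many were dropped. */
  int evictInvalid(List<String> invalidSegmentIds) {
    int evicted = 0;
    for (String segmentId : invalidSegmentIds) {
      if (segmentToBlocks.remove(segmentId) != null) {
        evicted++;
      }
    }
    return evicted;
  }

  int size() {
    return segmentToBlocks.size();
  }
}
```

Without the eviction step, every compaction cycle leaves another copy of the merged segments' metadata behind, which is exactly the slow leak the issue reports.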
[jira] [Commented] (CARBONDATA-241) OOM error during query execution in long run
[ https://issues.apache.org/jira/browse/CARBONDATA-241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15499570#comment-15499570 ]

ASF GitHub Bot commented on CARBONDATA-241:
-------------------------------------------

Github user gvramana commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/158#discussion_r79290267

    --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/CarbonInputFormat.java ---
    @@ -706,8 +725,9 @@ private String getUpdateExtension() {
       /**
        * @return updateExtension
        */
    -  private String[] getValidSegments(JobContext job) throws IOException {
    -    String segmentString = job.getConfiguration().get(INPUT_SEGMENT_NUMBERS, "");
    +  private String[] getSegmentsFromConfiguration(JobContext job, String segmentType)
    +      throws IOException {
    +    String segmentString = job.getConfiguration().get(segmentType, "");
    --- End diff --

    change signature to previous one
[jira] [Commented] (CARBONDATA-241) OOM error during query execution in long run
[ https://issues.apache.org/jira/browse/CARBONDATA-241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15499568#comment-15499568 ]

ASF GitHub Bot commented on CARBONDATA-241:
-------------------------------------------

Github user gvramana commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/158#discussion_r79290374

    --- Diff: integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonScanRDD.scala ---
    @@ -102,7 +115,7 @@ class CarbonScanRDD[V: ClassTag](
         val splits = carbonInputFormat.getSplits(job)
         if (!splits.isEmpty) {
           val carbonInputSplits = splits.asScala.map(_.asInstanceOf[CarbonInputSplit])
    -
    +      queryModel.setInvalidSegmentIds(validAndInvalidSegments.getInvalidSegments)
    --- End diff --

    move this to the common getSplits, otherwise validAndInvalidSegments can be null if parallel deletion happens.
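The reviewer's concern is a race: if the valid/invalid segment info is computed in a separate step from where it is consumed, a parallel segment deletion can leave the reference null when it is read. One defensive pattern (illustrative only; these are not CarbonData's real names) is to default to an empty list at the point of use:

```java
import java.util.Collections;
import java.util.List;

/**
 * Sketch of the null-safety point raised in the review: when invalid-segment
 * info can be produced concurrently with segment deletion, consumers should
 * either compute it inside the common getSplits path or guard against null.
 * All names here are hypothetical, not CarbonData's actual API.
 */
class InvalidSegmentsGuard {

  /** Return the invalid-segment list, or an empty list if it was never set. */
  static List<String> invalidOrEmpty(List<String> invalidSegments) {
    return invalidSegments == null ? Collections.emptyList() : invalidSegments;
  }
}
```

Moving the computation into the single common code path, as the reviewer suggests, is the stronger fix: it removes the window in which the value can be observed unset, rather than just tolerating it.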
[jira] [Commented] (CARBONDATA-241) OOM error during query execution in long run
[ https://issues.apache.org/jira/browse/CARBONDATA-241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15499422#comment-15499422 ]

ASF GitHub Bot commented on CARBONDATA-241:
-------------------------------------------

Github user gvramana commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/158#discussion_r79288466

    --- Diff: core/src/main/java/org/apache/carbondata/core/carbon/datastore/BlockIndexStore.java ---
    @@ -260,11 +295,29 @@ public void removeTableBlocks(List removeTableBlocksInfos,
         }
         Map map = tableBlocksMap.get(absoluteTableIdentifier);
         // if there is no loaded blocks then return
    -    if (null == map) {
    +    if (null == map || map.isEmpty()) {
    +      return;
    +    }
    +    Map segmentIdToBlockInfoMap =
    +        segmentIdToBlockListMap.get(absoluteTableIdentifier);
    +    if (null == segmentIdToBlockInfoMap || segmentIdToBlockInfoMap.isEmpty()) {
           return;
         }
    -    for (TableBlockInfo blockInfos : removeTableBlocksInfos) {
    -      map.remove(blockInfos);
    +    synchronized (lockObject) {
    +      for (String segmentId : segmentsToBeRemoved) {
    +        List tableBlockInfoList = segmentIdToBlockInfoMap.get(segmentId);
    +        if (null == tableBlockInfoList) {
    +          continue;
    +        }
    +        Iterator tableBlockInfoIterator = tableBlockInfoList.iterator();
    +        while (tableBlockInfoIterator.hasNext()) {
    +          TableBlockInfo info = tableBlockInfoIterator.next();
    +          AbstractIndex remove = map.remove(info);
    +          if (null != remove) {
    --- End diff --

    tableBlockInfoIterator.remove needs to be called irrespective of null != remove
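The review point is that the iterator's `remove()` must run for every entry in the segment's block list, not only for entries still present in the index map; otherwise entries whose index was already evicted elsewhere would linger in the list forever. A minimal sketch of that pattern, with illustrative names rather than CarbonData's real classes:

```java
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/**
 * Sketch of the reviewer's point: when purging a segment's block list,
 * remove each entry from the list unconditionally, even when the matching
 * index-map entry was already gone (map.remove returned null). Otherwise
 * stale entries stay in the list and are never cleaned up.
 */
class BlockListPurge {

  /** Purge blockList, evicting matching index entries; returns how many
   *  entries were actually removed from the index map. */
  static int purge(Map<String, Object> indexMap, List<String> blockList) {
    int removedFromMap = 0;
    Iterator<String> it = blockList.iterator();
    while (it.hasNext()) {
      String blockInfo = it.next();
      if (indexMap.remove(blockInfo) != null) {
        removedFromMap++;
      }
      // unconditional removal, irrespective of whether the map held the entry
      it.remove();
    }
    return removedFromMap;
  }
}
```

Using the iterator's own `remove()` (rather than `List.remove` inside the loop) also avoids a `ConcurrentModificationException` while the list is being traversed.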
[jira] [Commented] (CARBONDATA-241) OOM error during query execution in long run
[ https://issues.apache.org/jira/browse/CARBONDATA-241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15496781#comment-15496781 ]

ASF GitHub Bot commented on CARBONDATA-241:
-------------------------------------------

Github user kumarvishal09 commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/158#discussion_r79207294

    --- Diff: integration/spark/src/main/scala/org/apache/spark/sql/CarbonDatasourceHadoopRelation.scala ---
    @@ -20,20 +20,13 @@ package org.apache.spark.sql
     import java.text.SimpleDateFormat
     import java.util.Date
    -import org.apache.carbondata.core.carbon.AbsoluteTableIdentifier
    --- End diff --

    done
[jira] [Commented] (CARBONDATA-241) OOM error during query execution in long run
[ https://issues.apache.org/jira/browse/CARBONDATA-241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15495467#comment-15495467 ]

ASF GitHub Bot commented on CARBONDATA-241:
-------------------------------------------

Github user kumarvishal09 commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/158#discussion_r79109942

    --- Diff: processing/src/main/java/org/apache/carbondata/lcm/status/SegmentStatusManager.java ---
    @@ -102,6 +91,60 @@ public long getTableStatusLastModifiedTime() throws IOException {
       /**
        * get valid segment for given table
    +   *
    +   * @return
    +   * @throws IOException
    +   */
    +  public InvalidSegmentsInfo getInvalidSegments() throws IOException {
    --- End diff --

    ok, I will handle it
[jira] [Commented] (CARBONDATA-241) OOM error during query execution in long run
[ https://issues.apache.org/jira/browse/CARBONDATA-241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15493871#comment-15493871 ]

ASF GitHub Bot commented on CARBONDATA-241:
-------------------------------------------

Github user gvramana commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/158#discussion_r79008577

    --- Diff: integration/spark/src/main/scala/org/apache/spark/sql/CarbonDatasourceHadoopRelation.scala ---
    @@ -20,20 +20,13 @@ package org.apache.spark.sql
     import java.text.SimpleDateFormat
     import java.util.Date
    -import org.apache.carbondata.core.carbon.AbsoluteTableIdentifier
    --- End diff --

    Merged this commit's changes (a compilation-issue fix) separately, so those changes can be taken out of this PR.