[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3771: [WIP] pushdown array_contains filter to carbon

2020-06-04 Thread GitBox


ajantha-bhat commented on a change in pull request #3771:
URL: https://github.com/apache/carbondata/pull/3771#discussion_r435038194



##
File path: core/src/main/java/org/apache/carbondata/core/scan/filter/executer/RowLevelFilterExecuterImpl.java
##
@@ -222,49 +224,90 @@ public BitSetGroup applyFilter(RawBlockletColumnChunks rawBlockletColumnChunks,
       }
     }
     BitSetGroup bitSetGroup = new BitSetGroup(pageNumbers);
-    for (int i = 0; i < pageNumbers; i++) {
-      BitSet set = new BitSet(numberOfRows[i]);
-      RowIntf row = new RowImpl();
-      BitSet prvBitset = null;
-      // if the bitset pipeline is enabled then use row ids from the previous bitset,
-      // otherwise use the older flow
-      if (!useBitsetPipeLine ||
-          null == rawBlockletColumnChunks.getBitSetGroup() ||
-          null == bitSetGroup.getBitSet(i) ||
-          rawBlockletColumnChunks.getBitSetGroup().getBitSet(i).isEmpty()) {
-        for (int index = 0; index < numberOfRows[i]; index++) {
-          createRow(rawBlockletColumnChunks, row, i, index);
-          Boolean rslt = false;
-          try {
-            rslt = exp.evaluate(row).getBoolean();
-          }
-          // Any invalid member encountered during evaluation shall be ignored. Since the
-          // evaluation runs for every row, the error is logged only once to avoid
-          // flooding the log with the same information.
-          catch (FilterIllegalMemberException e) {
-            FilterUtil.logError(e, false);
-          }
-          if (null != rslt && rslt) {
-            set.set(index);
+
+    if (isDimensionPresentInCurrentBlock.length == 1 && isDimensionPresentInCurrentBlock[0]) {
+      // fill default value here
+      DimColumnResolvedFilterInfo dimColumnEvaluatorInfo = dimColEvaluatorInfoList.get(0);
+      // if the filter dimension is not present in the current block, add its default value
+      if (dimColumnEvaluatorInfo.getDimension().getDataType().isComplexType()) {
+        for (int i = 0; i < pageNumbers; i++) {
+          BitSet set = new BitSet(numberOfRows[i]);
+          RowIntf row = new RowImpl();
+          for (int index = 0; index < numberOfRows[i]; index++) {
+            ArrayQueryType complexType =

Review comment:
   done
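The removed lines in the hunk above implement the "bitset pipeline" optimization: when an upstream filter has already produced a `BitSet` for a page, only the surviving row ids need to be re-evaluated rather than every row. A minimal, self-contained sketch of that idea (all names here are illustrative, not CarbonData APIs):

```java
import java.util.BitSet;
import java.util.function.IntPredicate;

// Sketch of the "bitset pipeline" idea from the removed code above: when a
// previous filter already produced a BitSet for this page, evaluate the
// predicate only on the row ids set in it, instead of on every row.
public class BitsetPipelineSketch {

  static BitSet filterPage(int rowCount, BitSet previous, IntPredicate rowMatches) {
    BitSet result = new BitSet(rowCount);
    if (previous == null || previous.isEmpty()) {
      // older flow: evaluate the predicate on every row of the page
      for (int row = 0; row < rowCount; row++) {
        if (rowMatches.test(row)) {
          result.set(row);
        }
      }
    } else {
      // pipelined flow: evaluate only rows that survived the previous filter
      for (int row = previous.nextSetBit(0); row >= 0; row = previous.nextSetBit(row + 1)) {
        if (rowMatches.test(row)) {
          result.set(row);
        }
      }
    }
    return result;
  }

  public static void main(String[] args) {
    BitSet previous = new BitSet(5);
    previous.set(1);
    previous.set(4);
    // predicate keeps even row ids; pipelined with {1, 4}, only row 4 survives
    System.out.println(filterPage(5, previous, row -> row % 2 == 0));
  }
}
```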

##
File path: core/src/main/java/org/apache/carbondata/core/scan/filter/executer/RowLevelFilterExecuterImpl.java
##
@@ -222,49 +224,90 @@ public BitSetGroup applyFilter(RawBlockletColumnChunks rawBlockletColumnChunks,
       }
     }
     BitSetGroup bitSetGroup = new BitSetGroup(pageNumbers);
-    for (int i = 0; i < pageNumbers; i++) {
-      BitSet set = new BitSet(numberOfRows[i]);
-      RowIntf row = new RowImpl();
-      BitSet prvBitset = null;
-      // if the bitset pipeline is enabled then use row ids from the previous bitset,
-      // otherwise use the older flow
-      if (!useBitsetPipeLine ||
-          null == rawBlockletColumnChunks.getBitSetGroup() ||
-          null == bitSetGroup.getBitSet(i) ||
-          rawBlockletColumnChunks.getBitSetGroup().getBitSet(i).isEmpty()) {
-        for (int index = 0; index < numberOfRows[i]; index++) {
-          createRow(rawBlockletColumnChunks, row, i, index);
-          Boolean rslt = false;
-          try {
-            rslt = exp.evaluate(row).getBoolean();
-          }
-          // Any invalid member encountered during evaluation shall be ignored. Since the
-          // evaluation runs for every row, the error is logged only once to avoid
-          // flooding the log with the same information.
-          catch (FilterIllegalMemberException e) {
-            FilterUtil.logError(e, false);
-          }
-          if (null != rslt && rslt) {
-            set.set(index);
+
+    if (isDimensionPresentInCurrentBlock.length == 1 && isDimensionPresentInCurrentBlock[0]) {
+      // fill default value here
+      DimColumnResolvedFilterInfo dimColumnEvaluatorInfo = dimColEvaluatorInfoList.get(0);
+      // if the filter dimension is not present in the current block, add its default value
+      if (dimColumnEvaluatorInfo.getDimension().getDataType().isComplexType()) {
+        for (int i = 0; i < pageNumbers; i++) {
+          BitSet set = new BitSet(numberOfRows[i]);
+          RowIntf row = new RowImpl();
+          for (int index = 0; index < numberOfRows[i]; index++) {
+            ArrayQueryType complexType =
+                (ArrayQueryType) complexDimensionInfoMap.get(dimensionChunkIndex[i]);
+            int[] numberOfChild = complexType
+                .getNumberOfChild(rawBlockletColumnChunks.getDimensionRawColumnChunks(), null,

Review comment:
   done
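The new branch in the hunk above replaces the generic `exp.evaluate(row)` path with a direct walk over the array column's children. A rough, self-contained sketch of the per-page semantics, with plain Java collections standing in for CarbonData's decoded pages (nothing here is the actual API):

```java
import java.util.BitSet;
import java.util.List;

// Rough sketch of what the pushed-down array_contains filter computes per
// page: a row is selected when any element of its array equals the filter
// literal. Plain Java lists stand in for CarbonData's decoded column pages.
public class ArrayContainsPageFilter {

  static BitSet applyArrayContains(List<String[]> pageRows, String literal) {
    BitSet set = new BitSet(pageRows.size());
    for (int index = 0; index < pageRows.size(); index++) {
      // pageRows.get(index).length plays the role of numberOfChild for the row
      for (String element : pageRows.get(index)) {
        if (element.equals(literal)) {
          set.set(index);
          break; // one matching element is enough for this row
        }
      }
    }
    return set;
  }

  public static void main(String[] args) {
    List<String[]> page = List.of(
        new String[] {"as"},
        new String[] {"sd", "df", "gh"},
        new String[] {"hjsd", "fggb", "nhj", "sd", "asd"});
    System.out.println(applyArrayContains(page, "sd")); // rows 1 and 2 contain "sd"
  }
}
```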





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3771: [WIP] pushdown array_contains filter to carbon

2020-06-04 Thread GitBox


ajantha-bhat commented on a change in pull request #3771:
URL: https://github.com/apache/carbondata/pull/3771#discussion_r435036918



##
File path: core/src/main/java/org/apache/carbondata/core/scan/complextypes/ComplexQueryType.java
##
@@ -67,4 +67,18 @@ private DimensionColumnPage getDecodedDimensionPage(DimensionColumnPage[][] dime
     }
     return dimensionColumnPages[columnIndex][pageNumber];
   }
+
+  /**
+   * Method will copy the block chunk holder data and return the cloned value.
+   * This method is also used by child.
+   */
+  protected byte[] copyBlockDataChunkWithoutClone(DimensionRawColumnChunk[] rawColumnChunks,
+      DimensionColumnPage[][] dimensionColumnPages, int rowNumber, int pageNumber) {
+    byte[] data =
+        getDecodedDimensionPage(dimensionColumnPages, rawColumnChunks[columnIndex], pageNumber)

Review comment:
   In BlockletScannedResult, dimensionColumnPages[][] 









[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3771: [WIP] pushdown array_contains filter to carbon

2020-06-04 Thread GitBox


ajantha-bhat commented on a change in pull request #3771:
URL: https://github.com/apache/carbondata/pull/3771#discussion_r435034670



##
File path: core/src/main/java/org/apache/carbondata/core/scan/complextypes/ComplexQueryType.java
##
@@ -67,4 +67,18 @@ private DimensionColumnPage getDecodedDimensionPage(DimensionColumnPage[][] dime
     }
     return dimensionColumnPages[columnIndex][pageNumber];
   }
+
+  /**
+   * Method will copy the block chunk holder data and return the cloned value.
+   * This method is also used by child.
+   */
+  protected byte[] copyBlockDataChunkWithoutClone(DimensionRawColumnChunk[] rawColumnChunks,
+      DimensionColumnPage[][] dimensionColumnPages, int rowNumber, int pageNumber) {
+    byte[] data =
+        getDecodedDimensionPage(dimensionColumnPages, rawColumnChunks[columnIndex], pageNumber)

Review comment:
   I have debugged this; a cache is already present. The argument of this method, `DimensionColumnPage[][] dimensionColumnPages`, is itself a cache keyed by column index.
   
   See `ComplexQueryType#getDecodedDimensionPage` for the details.
   
   I also observed that `decodeColumnPage` is called only once for a given page; after that, the decoded page is served from the cache.
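The decode-once behaviour described above can be pictured with a small sketch. All names here are illustrative only; the real cache lives in the `DimensionColumnPage[][]` arrays managed by `BlockletScannedResult` and read through `ComplexQueryType#getDecodedDimensionPage`, not in a class like this:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Minimal sketch of the caching pattern described above: decoded pages live
// in a table indexed by [columnIndex][pageNumber], decoding happens lazily on
// first access, and every later lookup of the same page hits the cache.
// Class and member names are illustrative, not CarbonData's.
public class PageCacheSketch {

  private final String[][] decodedPages; // the [column][page] cache
  final AtomicInteger decodeCalls = new AtomicInteger();

  PageCacheSketch(int columns, int pages) {
    decodedPages = new String[columns][pages];
  }

  String getDecodedPage(int columnIndex, int pageNumber) {
    if (decodedPages[columnIndex][pageNumber] == null) {
      decodeCalls.incrementAndGet(); // the expensive decode runs at most once per page
      decodedPages[columnIndex][pageNumber] = "decoded-" + columnIndex + "-" + pageNumber;
    }
    return decodedPages[columnIndex][pageNumber];
  }

  public static void main(String[] args) {
    PageCacheSketch cache = new PageCacheSketch(2, 3);
    cache.getDecodedPage(0, 1);
    cache.getDecodedPage(0, 1); // second lookup: served from the cache
    System.out.println(cache.decodeCalls.get()); // decoded only once
  }
}
```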









[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3771: [WIP] pushdown array_contains filter to carbon

2020-05-23 Thread GitBox


ajantha-bhat commented on a change in pull request #3771:
URL: https://github.com/apache/carbondata/pull/3771#discussion_r429519687



##
File path: integration/spark/src/test/scala/org/apache/carbondata/integration/spark/testsuite/complexType/TestCompactionComplexType.scala
##
@@ -47,6 +47,33 @@ class TestCompactionComplexType extends QueryTest with BeforeAndAfterAll {
     sql("DROP TABLE IF EXISTS compactComplex")
   }
 
+  test("complex issue") {
+    sql("drop table if exists complex1")
+    sql("create table complex1 (arr array<string>) stored as carbondata")
+    sql("insert into complex1 select array('as') union all " +
+      "select array('sd','df','gh') union all " +
+      "select array('rt','ew','rtyu','jk','sder') union all " +
+      "select array('ghsf','dbv','fg','ty') union all " +
+      "select array('hjsd','fggb','nhj','sd','asd')")
+

Review comment:
   This is temporary WIP / proof-of-concept code and cannot be merged as-is; why review it now?








