Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2366#discussion_r196310131
--- Diff: integration/spark-common/src/main/java/org/apache/carbondata/streaming/CarbonStreamRecordReader.java ---
@@ -418,36 +412,47 @@ private boolean isScanRequired(BlockletHeader header) {
   }

   private boolean scanBlockletAndFillVector(BlockletHeader header) throws IOException {
+    Constructor cons = null;
     // if filter is null and output projection is empty, use the row number of blocklet header
-    if (skipScanData) {
-      int rowNums = header.getBlocklet_info().getNum_rows();
-      columnarBatch = ColumnarBatch.allocate(outputSchema, MemoryMode.OFF_HEAP, rowNums);
-      columnarBatch.setNumRows(rowNums);
-      input.skipBlockletData(true);
-      return rowNums > 0;
-    }
-
-    input.readBlockletData(header);
-    columnarBatch = ColumnarBatch.allocate(outputSchema, MemoryMode.OFF_HEAP, input.getRowNums());
     int rowNum = 0;
-    if (null == filter) {
-      while (input.hasNext()) {
-        readRowFromStream();
-        putRowToColumnBatch(rowNum++);
+    try {
+      String vectorReaderClassName = "org.apache.spark.sql.CarbonVectorProxy";
--- End diff ---
Since you are using `CarbonVectorProxy`, can you remove the spark dependency in this stream module?
---
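The diff above looks up `CarbonVectorProxy` by its class name and keeps a `Constructor` reference, i.e. it instantiates the Spark-side class through reflection instead of referencing it at compile time. A minimal sketch of that pattern, with `java.util.ArrayList` standing in for `org.apache.spark.sql.CarbonVectorProxy` (which would not be on this classpath) and the `instantiate` helper and `ReflectiveLoad` class being hypothetical names, not part of CarbonData:

```java
import java.lang.reflect.Constructor;

public class ReflectiveLoad {

    // Hypothetical helper: create an instance of a class named at runtime,
    // so the calling module needs no compile-time dependency on it.
    static Object instantiate(String className) throws Exception {
        Class<?> clazz = Class.forName(className);      // resolve the class at runtime
        Constructor<?> cons = clazz.getConstructor();   // no-arg constructor
        return cons.newInstance();
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for "org.apache.spark.sql.CarbonVectorProxy"
        Object obj = instantiate("java.util.ArrayList");
        System.out.println(obj.getClass().getName()); // prints java.util.ArrayList
    }
}
```

Note that reflection only removes the compile-time reference; the target class must still be on the classpath at runtime, which is why the reviewer asks whether the module's declared Spark dependency can now be dropped.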