[
https://issues.apache.org/jira/browse/CARBONDATA-308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15627358#comment-15627358
]
ASF GitHub Bot commented on CARBONDATA-308:
-------------------------------------------
Github user QiangCai commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/262#discussion_r86058188
--- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/CarbonInputSplit.java ---
@@ -22,28 +22,44 @@
import java.io.DataOutput;
import java.io.IOException;
import java.io.Serializable;
+import java.util.ArrayList;
+import java.util.List;
+
+import org.apache.carbondata.core.carbon.datastore.block.BlockletInfos;
+import org.apache.carbondata.core.carbon.datastore.block.Distributable;
+import org.apache.carbondata.core.carbon.datastore.block.TableBlockInfo;
+import org.apache.carbondata.core.carbon.path.CarbonTablePath;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;
+
/**
* Carbon input split to allow distributed read of CarbonInputFormat.
*/
-public class CarbonInputSplit extends FileSplit implements Serializable, Writable {
+public class CarbonInputSplit extends FileSplit implements Distributable, Serializable, Writable {
private static final long serialVersionUID = 3520344046772190207L;
private String segmentId;
- /**
+ public String taskId = "0";
+
+ /*
* Number of BlockLets in a block
*/
private int numberOfBlocklets = 0;
- public CarbonInputSplit() {
- super(null, 0, 0, new String[0]);
+ public CarbonInputSplit() {
}
- public CarbonInputSplit(String segmentId, Path path, long start, long length,
+ private void parserPath(Path path) {
+ String[] nameParts = path.getName().split("-");
+ if (nameParts != null && nameParts.length >= 3) {
+ this.taskId = nameParts[2];
+ }
+ }
+
+ private CarbonInputSplit(String segmentId, Path path, long start, long length,
--- End diff --
Please initialize taskId.
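One way to read this request is sketched below. It is a minimal, self-contained illustration rather than the project's code: the class name SplitSketch, the static helper parseTaskId, and the sample file names are all hypothetical. The sketch assumes, as the diff's parserPath does, that the task id is the third "-"-separated part of the file name, and it sets the field in the constructor so that every instance starts with a defined value.

```java
import java.util.Objects;

/** Illustrative stand-in for the split class in the diff; all names are hypothetical. */
class SplitSketch {
    /** Fallback used when the file name does not carry a task id. */
    static final String DEFAULT_TASK_ID = "0";

    private final String taskId;

    SplitSketch(String fileName) {
        Objects.requireNonNull(fileName, "fileName");
        // Initialize taskId in the constructor instead of relying on a later call.
        this.taskId = parseTaskId(fileName);
    }

    /**
     * Returns the third "-"-separated part of the file name, or the default
     * when the name has fewer than three parts. String.split never returns
     * null, so only the length needs checking.
     */
    static String parseTaskId(String fileName) {
        String[] nameParts = fileName.split("-");
        return nameParts.length >= 3 ? nameParts[2] : DEFAULT_TASK_ID;
    }

    String getTaskId() {
        return taskId;
    }
}
```

With the hypothetical name "part-0-7-1477000000000.carbondata" the parsed task id is "7"; a name without enough "-" separators falls back to "0".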
> Use CarbonInputFormat in CarbonScanRDD compute
> ----------------------------------------------
>
> Key: CARBONDATA-308
> URL: https://issues.apache.org/jira/browse/CARBONDATA-308
> Project: CarbonData
> Issue Type: Sub-task
> Components: spark-integration
> Reporter: Jacky Li
> Fix For: 0.2.0-incubating
>
>
> Take CarbonScanRDD as the target RDD and modify it as follows:
> 1. On the driver side, only getSplit is required, and it needs only the
> filter condition, not a full QueryModel object, so creation of the
> QueryModel can move from the driver side to the executor side.
> 2. Use CarbonInputFormat.createRecordReader in CarbonScanRDD.compute instead
> of using QueryExecutor directly.
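The division of work described in the issue can be illustrated with a toy model. The sketch below does not use any CarbonData or Hadoop API: ScanSketch, Split, QueryModel, getSplits, and compute are hypothetical stand-ins, and rows are plain integers. It only shows the shape of the change: the driver prunes splits using the filter alone, and the heavier query model is built per split on the executor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

/** Toy model of the driver/executor split; every type here is hypothetical. */
class ScanSketch {
    /** A contiguous range of row values [start, end). */
    static final class Split {
        final int start;
        final int end;

        Split(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Stand-in for a full query model; created only on the executor side. */
    static final class QueryModel {
        final Split split;
        final IntPredicate filter;

        QueryModel(Split split, IntPredicate filter) {
            this.split = split;
            this.filter = filter;
        }
    }

    /** Driver side: only the filter is needed to decide which splits to keep. */
    static List<Split> getSplits(int totalRows, int splitSize, IntPredicate filter) {
        List<Split> splits = new ArrayList<>();
        for (int start = 0; start < totalRows; start += splitSize) {
            int end = Math.min(start + splitSize, totalRows);
            boolean anyMatch = false;
            for (int row = start; row < end && !anyMatch; row++) {
                anyMatch = filter.test(row);
            }
            if (anyMatch) {
                splits.add(new Split(start, end));
            }
        }
        return splits;
    }

    /** Executor side: build the query model for one split, then read its rows. */
    static List<Integer> compute(Split split, IntPredicate filter) {
        QueryModel model = new QueryModel(split, filter);
        List<Integer> rows = new ArrayList<>();
        for (int row = model.split.start; row < model.split.end; row++) {
            if (model.filter.test(row)) {
                rows.add(row);
            }
        }
        return rows;
    }
}
```

In this toy, the driver never constructs a QueryModel; it ships only the surviving splits and the filter, which mirrors point 1 of the description.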
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)