GitHub user QiangCai commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/104#discussion_r76957634
--- Diff:
processing/src/main/java/org/apache/carbondata/processing/csvreaderstep/UnivocityCsvParser.java
---
@@ -112,25 +116,28 @@ private void initializeReader() throws IOException {
// if already one input stream is open first we need to close and then
// open new stream
close();
- // get the block offset
- long startOffset = this.csvParserVo.getBlockDetailsList().get(blockCounter).getBlockOffset();
- FileType fileType = FileFactory
- .getFileType(this.csvParserVo.getBlockDetailsList().get(blockCounter).getFilePath());
- // calculate the end offset the block
- long endOffset =
- this.csvParserVo.getBlockDetailsList().get(blockCounter).getBlockLength() + startOffset;
-
- // create a input stream for the block
- DataInputStream dataInputStream = FileFactory
- .getDataInputStream(this.csvParserVo.getBlockDetailsList().get(blockCounter).getFilePath(),
- fileType, bufferSize, startOffset);
- // if start offset is not 0 then reading then reading and ignoring the extra line
- if (startOffset != 0) {
- LineReader lineReader = new LineReader(dataInputStream, 1);
- startOffset += lineReader.readLine(new Text(), 0);
+
+ String path = this.csvParserVo.getBlockDetailsList().get(blockCounter).getFilePath();
+ FileType fileType = FileFactory.getFileType(path);
+
+ DataInputStream dataInputStream =
+ FileFactory.getDataInputStream(path, fileType, bufferSize);
--- End diff --
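For a CSV file, the DataInputStream still needs startOffset. The new call, FileFactory.getDataInputStream(path, fileType, bufferSize), no longer passes it.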
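
To illustrate why the offset matters, here is a minimal sketch of offset-based block reading. It is not the CarbonData implementation: `BlockLineReader` and `readBlock` are hypothetical names, `RandomAccessFile` stands in for `FileFactory.getDataInputStream`, and the boundary rule (skip the first line when the offset is non-zero, read one line past the end offset) is assumed to follow the usual Hadoop-style split convention.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only; not part of CarbonData.
class BlockLineReader {

  // Returns the lines owned by the block [startOffset, startOffset + length].
  // Assumes ASCII content, since RandomAccessFile.readLine maps bytes to chars directly.
  static List<String> readBlock(Path file, long startOffset, long length) throws IOException {
    List<String> lines = new ArrayList<>();
    long endOffset = startOffset + length;
    try (RandomAccessFile in = new RandomAccessFile(file.toFile(), "r")) {
      // Position the stream at the start of this block.
      in.seek(startOffset);
      if (startOffset != 0) {
        // The first line is owned by the previous block (it reads one line
        // past its end offset), so skip it here.
        in.readLine();
      }
      // Read every line that starts at or before the end offset.
      while (in.getFilePointer() <= endOffset) {
        String line = in.readLine();
        if (line == null) {
          break; // end of file
        }
        lines.add(line);
      }
    }
    return lines;
  }

  public static void main(String[] args) throws IOException {
    Path csv = Files.createTempFile("blocks", ".csv");
    Files.write(csv, "a,1\nb,2\nc,3\n".getBytes(StandardCharsets.US_ASCII));
    System.out.println(readBlock(csv, 0, 6)); // first block
    System.out.println(readBlock(csv, 6, 6)); // second block, starts mid-line
    Files.delete(csv);
  }
}
```

Under these assumptions each row is read by exactly one block. If the stream for the second block were opened without the offset, it would start from byte 0 and re-read rows that belong to the first block.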