Github user NamanRastogi commented on the issue:
https://github.com/apache/carbondata/pull/2850
Please check the `split` method: it splits the list of `CarbonRecordReader`s
into multiple `CarbonReader`s. It does not jumble the order of the
`CarbonRecordReader`s; it keeps them sequential.
Suppose there are 10 *carbondata* files, and thus 10 `CarbonRecordReader`s in
the `CarbonReader.readers` object, and the user wants to get 3 splits. The user
will get a list like this:
```java
CarbonReader reader = CarbonReader.builder(dataDir).build();
List<CarbonReader> multipleReaders = reader.split(3);
```
The indices of the `CarbonRecordReader`s in `multipleReaders` will then be:
- `multipleReaders.get(0).readers` points to indices {0,1,2,3} of the *carbondata* files
- `multipleReaders.get(1).readers` points to indices {4,5,6} of the *carbondata* files
- `multipleReaders.get(2).readers` points to indices {7,8,9} of the *carbondata* files

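
The partitioning described above can be sketched on plain integers. This is only an illustration of a contiguous, order-preserving split, not the actual `CarbonReader.split` implementation; the class name `ContiguousSplitSketch` and the rule that the leading splits absorb the remainder are assumptions chosen to reproduce the {0,1,2,3}, {4,5,6}, {7,8,9} layout.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: mimics a contiguous split on plain list elements.
// It is NOT the CarbonData implementation, just a model of its described behavior.
class ContiguousSplitSketch {

    // Splits items into at most maxSplits contiguous sublists, preserving order.
    // Assumption: the first (size % parts) sublists receive one extra element.
    static <T> List<List<T>> split(List<T> items, int maxSplits) {
        if (maxSplits < 1) {
            throw new IllegalArgumentException("maxSplits must be >= 1");
        }
        List<List<T>> result = new ArrayList<>();
        int parts = Math.min(maxSplits, items.size());
        if (parts == 0) {
            return result; // nothing to split
        }
        int base = items.size() / parts;
        int extra = items.size() % parts;
        int start = 0;
        for (int i = 0; i < parts; i++) {
            int end = start + base + (i < extra ? 1 : 0);
            result.add(new ArrayList<>(items.subList(start, end)));
            start = end;
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> fileIndices = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            fileIndices.add(i);
        }
        // prints [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]
        System.out.println(split(fileIndices, 3));
    }
}
```

Because every sublist is a contiguous range and the sublists are returned in order, concatenating them gives back the original sequence.
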
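Now, if you exhaust the readers one after another, as in the following code,
the rows will still be in order.
```java
for (CarbonReader reader_i : multipleReaders) {
  // Drain this split completely before moving on to the next one.
  while (reader_i.hasNext()) {
    reader_i.readNextRow();
  }
}
```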
Earlier, you got data from the 5th `CarbonRecordReader` only after you had
exhausted the 4th. Now you can get it earlier, maybe even before the 0th. So if
order is important, the user has to make sure to consume it only after the 4th
file has been used up; if order is not important, the user can consume it
earlier. For example, counting the total number of rows does not need the
original order.
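
As a rough illustration of the order-independent case, the sketch below counts rows across several splits at once. Plain `Iterator<String>` objects stand in for the split readers, and the class name `ParallelRowCountSketch` is hypothetical. With real `CarbonReader` splits the loop body would presumably use `hasNext()` and `readNextRow()` instead.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in: each Iterator plays the role of one split reader.
class ParallelRowCountSketch {

    // Counts rows over all splits. Each split is drained by a single thread,
    // but the splits themselves may be processed in any order.
    static long countRows(List<Iterator<String>> splits) {
        return splits.parallelStream()
                .mapToLong(split -> {
                    long n = 0;
                    while (split.hasNext()) {
                        split.next();
                        n++;
                    }
                    return n;
                })
                .sum();
    }

    public static void main(String[] args) {
        List<Iterator<String>> splits = Arrays.asList(
                Arrays.asList("r0", "r1", "r2", "r3").iterator(),
                Arrays.asList("r4", "r5", "r6").iterator(),
                Arrays.asList("r7", "r8", "r9").iterator());
        // prints 10
        System.out.println(countRows(splits));
    }
}
```

The total is the same whichever split finishes first, which is why a count does not need the original order.
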
---