[GitHub] carbondata pull request #2850: [CARBONDATA-3056] Added concurrent reading th...
Github user asfgit closed the pull request at:

    https://github.com/apache/carbondata/pull/2850

---
Github user kunal642 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2850#discussion_r230315785

    --- Diff: store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonReader.java ---
    @@ -114,6 +115,57 @@ public static CarbonReaderBuilder builder(String tablePath) {
         return builder(tablePath, tableName);
       }

    +  /**
    +   * Breaks the list of CarbonRecordReader in CarbonReader into multiple
    +   * CarbonReader objects, each iterating through some 'carbondata' files
    +   * and return that list of CarbonReader objects
    +   *
    +   * If the no. of files is greater than maxSplits, then break the
    +   * CarbonReader into maxSplits splits, with each split iterating
    +   * through >= 1 file.
    +   *
    +   * If the no. of files is less than maxSplits, then return list of
    +   * CarbonReader with size as the no. of files, with each CarbonReader
    +   * iterating through exactly one file
    +   *
    +   * @param maxSplits: Int
    +   * @return list of {@link CarbonReader} objects
    +   */
    +  public List<CarbonReader> split(int maxSplits) throws IOException {
    +    validateReader();
    +    if (maxSplits < 1) {
    +      throw new RuntimeException(
    +          this.getClass().getSimpleName() + ".split: maxSplits must be positive");
    +    }
    +
    +    List<CarbonReader> carbonReaders = new ArrayList<>();
    +
    +    if (maxSplits < this.readers.size()) {
    --- End diff --

    @ravipesala Let us add the test cases in a separate PR. Would that be okay?

---
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2850#discussion_r230297864

    --- Diff: store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonReader.java ---
    @@ -114,6 +115,57 @@ public static CarbonReaderBuilder builder(String tablePath) {
         return builder(tablePath, tableName);
       }

    +  /**
    +   * Breaks the list of CarbonRecordReader in CarbonReader into multiple
    +   * CarbonReader objects, each iterating through some 'carbondata' files
    +   * and return that list of CarbonReader objects
    +   *
    +   * If the no. of files is greater than maxSplits, then break the
    +   * CarbonReader into maxSplits splits, with each split iterating
    +   * through >= 1 file.
    +   *
    +   * If the no. of files is less than maxSplits, then return list of
    +   * CarbonReader with size as the no. of files, with each CarbonReader
    +   * iterating through exactly one file
    +   *
    +   * @param maxSplits: Int
    +   * @return list of {@link CarbonReader} objects
    +   */
    +  public List<CarbonReader> split(int maxSplits) throws IOException {
    --- End diff --

    OK

---
Github user ajantha-bhat commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2850#discussion_r230272267

    --- Diff: store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonReader.java ---
    @@ -114,6 +115,57 @@ public static CarbonReaderBuilder builder(String tablePath) {
         return builder(tablePath, tableName);
       }

    +  /**
    +   * Breaks the list of CarbonRecordReader in CarbonReader into multiple
    +   * CarbonReader objects, each iterating through some 'carbondata' files
    +   * and return that list of CarbonReader objects
    +   *
    +   * If the no. of files is greater than maxSplits, then break the
    +   * CarbonReader into maxSplits splits, with each split iterating
    +   * through >= 1 file.
    +   *
    +   * If the no. of files is less than maxSplits, then return list of
    +   * CarbonReader with size as the no. of files, with each CarbonReader
    +   * iterating through exactly one file
    +   *
    +   * @param maxSplits: Int
    +   * @return list of {@link CarbonReader} objects
    +   */
    +  public List<CarbonReader> split(int maxSplits) throws IOException {
    --- End diff --

    @ravipesala: Adding this to the builder would break the builder pattern. We recently removed the arguments from build() and made them separate APIs for the SDK writer; the reader follows the same approach.

---
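For readers following the API discussion, here is a minimal sketch of how a caller might consume a list of splits concurrently once it has one. It is an illustration under stated assumptions, not code from the PR: `ConcurrentReadSketch` and `countRows` are hypothetical names, and a plain `Iterator<String>` stands in for each split reader because the reader's row-access methods are not shown in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical stand-in: each Iterator<String> plays the role of one split reader. */
final class ConcurrentReadSketch {

  /** Drains every split on its own worker thread and returns the total row count. */
  static long countRows(List<Iterator<String>> splits) {
    if (splits.isEmpty()) {
      return 0L;
    }
    // One thread per split; each split is touched by exactly one thread.
    ExecutorService pool = Executors.newFixedThreadPool(splits.size());
    try {
      List<Future<Long>> pending = new ArrayList<>();
      for (Iterator<String> split : splits) {
        Callable<Long> task = () -> {
          long rows = 0;
          while (split.hasNext()) {
            split.next();
            rows++;
          }
          return rows;
        };
        pending.add(pool.submit(task));
      }
      long total = 0;
      for (Future<Long> result : pending) {
        total += result.get(); // blocks until that split is fully read
      }
      return total;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while reading splits", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("a split failed while being read", e.getCause());
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    List<Iterator<String>> splits = Arrays.asList(
        Arrays.asList("r1", "r2", "r3").iterator(),
        Arrays.asList("r4", "r5").iterator());
    System.out.println(countRows(splits)); // prints 5
  }
}
```

Because every split is owned by a single task, the sketch needs no locking around the iterators; only the per-split totals are combined on the calling thread.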
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2850#discussion_r230246531

    --- Diff: store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonReader.java ---
    @@ -114,6 +115,57 @@ public static CarbonReaderBuilder builder(String tablePath) {
         return builder(tablePath, tableName);
       }

    +  /**
    +   * Breaks the list of CarbonRecordReader in CarbonReader into multiple
    +   * CarbonReader objects, each iterating through some 'carbondata' files
    +   * and return that list of CarbonReader objects
    +   *
    +   * If the no. of files is greater than maxSplits, then break the
    +   * CarbonReader into maxSplits splits, with each split iterating
    +   * through >= 1 file.
    +   *
    +   * If the no. of files is less than maxSplits, then return list of
    +   * CarbonReader with size as the no. of files, with each CarbonReader
    +   * iterating through exactly one file
    +   *
    +   * @param maxSplits: Int
    +   * @return list of {@link CarbonReader} objects
    +   */
    +  public List<CarbonReader> split(int maxSplits) throws IOException {
    +    validateReader();
    +    if (maxSplits < 1) {
    +      throw new RuntimeException(
    +          this.getClass().getSimpleName() + ".split: maxSplits must be positive");
    +    }
    +
    +    List<CarbonReader> carbonReaders = new ArrayList<>();
    +
    +    if (maxSplits < this.readers.size()) {
    --- End diff --

    Add a UT for this method alone, to make sure the splits happen correctly across multiple combinations of split count and reader count.

---
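The distribution rule described in the Javadoc above can be sketched in isolation, which also shows the kind of combinations a unit test could cover. This is an assumption-laden illustration rather than the PR's implementation (the quoted diff stops before the branch body): `SplitSketch` and `partition` are hypothetical names, file names stand in for record readers, and the even, contiguous distribution is one possible policy.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: not the CarbonReader code from the PR. */
final class SplitSketch {

  /**
   * Distributes items into min(maxSplits, items.size()) contiguous, non-empty
   * groups whose sizes differ by at most one.
   */
  static <T> List<List<T>> partition(List<T> items, int maxSplits) {
    if (maxSplits < 1) {
      throw new IllegalArgumentException("maxSplits must be positive");
    }
    // Never create more groups than there are items, so no group is empty.
    int groups = Math.min(maxSplits, items.size());
    List<List<T>> result = new ArrayList<>(groups);
    if (groups == 0) {
      return result;
    }
    int base = items.size() / groups;   // minimum size of every group
    int extra = items.size() % groups;  // the first 'extra' groups get one more
    int start = 0;
    for (int i = 0; i < groups; i++) {
      int end = start + base + (i < extra ? 1 : 0);
      result.add(new ArrayList<>(items.subList(start, end)));
      start = end;
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> files = Arrays.asList("f0", "f1", "f2", "f3", "f4");
    System.out.println(partition(files, 2)); // [[f0, f1, f2], [f3, f4]]
    System.out.println(partition(files, 8)); // [[f0], [f1], [f2], [f3], [f4]]
  }
}
```

With five files, `maxSplits = 2` yields two groups of three and two files, while `maxSplits = 8` falls back to one group per file, matching the two cases the Javadoc describes.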
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2850#discussion_r230244674

    --- Diff: store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonReader.java ---
    @@ -114,6 +115,57 @@ public static CarbonReaderBuilder builder(String tablePath) {
         return builder(tablePath, tableName);
       }

    +  /**
    +   * Breaks the list of CarbonRecordReader in CarbonReader into multiple
    +   * CarbonReader objects, each iterating through some 'carbondata' files
    +   * and return that list of CarbonReader objects
    +   *
    +   * If the no. of files is greater than maxSplits, then break the
    +   * CarbonReader into maxSplits splits, with each split iterating
    +   * through >= 1 file.
    +   *
    +   * If the no. of files is less than maxSplits, then return list of
    +   * CarbonReader with size as the no. of files, with each CarbonReader
    +   * iterating through exactly one file
    +   *
    +   * @param maxSplits: Int
    +   * @return list of {@link CarbonReader} objects
    +   */
    +  public List<CarbonReader> split(int maxSplits) throws IOException {
    --- End diff --

    I feel this method should be moved to the builder. Add another method to the builder, `build(int splits)`, that returns a list of readers.

---
Github user NamanRastogi commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2850#discussion_r229299196

    --- Diff: store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonReader.java ---
    @@ -114,6 +117,43 @@ public static CarbonReaderBuilder builder(String tablePath) {
         return builder(tablePath, tableName);
       }

    +  /**
    +   * Return a new list of {@link CarbonReader} objects
    +   *
    --- End diff --

    Done!

---