[GitHub] carbondata pull request #2850: [CARBONDATA-3056] Added concurrent reading th...

2018-11-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/carbondata/pull/2850


---


[GitHub] carbondata pull request #2850: [CARBONDATA-3056] Added concurrent reading th...

2018-11-02 Thread kunal642
Github user kunal642 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2850#discussion_r230315785
  
--- Diff: 
store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonReader.java ---
@@ -114,6 +115,57 @@ public static CarbonReaderBuilder builder(String 
tablePath) {
 return builder(tablePath, tableName);
   }
 
+  /**
+   * Breaks the list of CarbonRecordReader in CarbonReader into multiple
+   * CarbonReader objects, each iterating through some 'carbondata' files
+   * and return that list of CarbonReader objects
+   *
+   * If the no. of files is greater than maxSplits, then break the
+   * CarbonReader into maxSplits splits, with each split iterating
+   * through >= 1 file.
+   *
+   * If the no. of files is less than maxSplits, then return list of
+   * CarbonReader with size as the no. of files, with each CarbonReader
+   * iterating through exactly one file
+   *
+   * @param maxSplits: Int
+   * @return list of {@link CarbonReader} objects
+   */
+  public List split(int maxSplits) throws IOException {
+validateReader();
+if (maxSplits < 1) {
+  throw new RuntimeException(
+  this.getClass().getSimpleName() + ".split: maxSplits must be 
positive");
+}
+
+List carbonReaders = new ArrayList<>();
+
+if (maxSplits < this.readers.size()) {
--- End diff --

@ravipesala Let us add test cases in a separate PR. would it be okay?


---


[GitHub] carbondata pull request #2850: [CARBONDATA-3056] Added concurrent reading th...

2018-11-02 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2850#discussion_r230297864
  
--- Diff: 
store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonReader.java ---
@@ -114,6 +115,57 @@ public static CarbonReaderBuilder builder(String 
tablePath) {
 return builder(tablePath, tableName);
   }
 
+  /**
+   * Breaks the list of CarbonRecordReader in CarbonReader into multiple
+   * CarbonReader objects, each iterating through some 'carbondata' files
+   * and return that list of CarbonReader objects
+   *
+   * If the no. of files is greater than maxSplits, then break the
+   * CarbonReader into maxSplits splits, with each split iterating
+   * through >= 1 file.
+   *
+   * If the no. of files is less than maxSplits, then return list of
+   * CarbonReader with size as the no. of files, with each CarbonReader
+   * iterating through exactly one file
+   *
+   * @param maxSplits: Int
+   * @return list of {@link CarbonReader} objects
+   */
+  public List split(int maxSplits) throws IOException {
--- End diff --

ok


---


[GitHub] carbondata pull request #2850: [CARBONDATA-3056] Added concurrent reading th...

2018-11-01 Thread ajantha-bhat
Github user ajantha-bhat commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2850#discussion_r230272267
  
--- Diff: 
store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonReader.java ---
@@ -114,6 +115,57 @@ public static CarbonReaderBuilder builder(String 
tablePath) {
 return builder(tablePath, tableName);
   }
 
+  /**
+   * Breaks the list of CarbonRecordReader in CarbonReader into multiple
+   * CarbonReader objects, each iterating through some 'carbondata' files
+   * and return that list of CarbonReader objects
+   *
+   * If the no. of files is greater than maxSplits, then break the
+   * CarbonReader into maxSplits splits, with each split iterating
+   * through >= 1 file.
+   *
+   * If the no. of files is less than maxSplits, then return list of
+   * CarbonReader with size as the no. of files, with each CarbonReader
+   * iterating through exactly one file
+   *
+   * @param maxSplits: Int
+   * @return list of {@link CarbonReader} objects
+   */
+  public List split(int maxSplits) throws IOException {
--- End diff --

@ravipesala : Adding to builder will break builder pattern, recently we 
removed arguments from build() and make it as separate API for SDK writer. 
Reader also followed same.


---


[GitHub] carbondata pull request #2850: [CARBONDATA-3056] Added concurrent reading th...

2018-11-01 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2850#discussion_r230246531
  
--- Diff: 
store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonReader.java ---
@@ -114,6 +115,57 @@ public static CarbonReaderBuilder builder(String 
tablePath) {
 return builder(tablePath, tableName);
   }
 
+  /**
+   * Breaks the list of CarbonRecordReader in CarbonReader into multiple
+   * CarbonReader objects, each iterating through some 'carbondata' files
+   * and return that list of CarbonReader objects
+   *
+   * If the no. of files is greater than maxSplits, then break the
+   * CarbonReader into maxSplits splits, with each split iterating
+   * through >= 1 file.
+   *
+   * If the no. of files is less than maxSplits, then return list of
+   * CarbonReader with size as the no. of files, with each CarbonReader
+   * iterating through exactly one file
+   *
+   * @param maxSplits: Int
+   * @return list of {@link CarbonReader} objects
+   */
+  public List split(int maxSplits) throws IOException {
+validateReader();
+if (maxSplits < 1) {
+  throw new RuntimeException(
+  this.getClass().getSimpleName() + ".split: maxSplits must be 
positive");
+}
+
+List carbonReaders = new ArrayList<>();
+
+if (maxSplits < this.readers.size()) {
--- End diff --

Add UT only to this method to make sure splits happen correctly with 
multiple splits combinations and readers size,


---


[GitHub] carbondata pull request #2850: [CARBONDATA-3056] Added concurrent reading th...

2018-11-01 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2850#discussion_r230244674
  
--- Diff: 
store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonReader.java ---
@@ -114,6 +115,57 @@ public static CarbonReaderBuilder builder(String 
tablePath) {
 return builder(tablePath, tableName);
   }
 
+  /**
+   * Breaks the list of CarbonRecordReader in CarbonReader into multiple
+   * CarbonReader objects, each iterating through some 'carbondata' files
+   * and return that list of CarbonReader objects
+   *
+   * If the no. of files is greater than maxSplits, then break the
+   * CarbonReader into maxSplits splits, with each split iterating
+   * through >= 1 file.
+   *
+   * If the no. of files is less than maxSplits, then return list of
+   * CarbonReader with size as the no. of files, with each CarbonReader
+   * iterating through exactly one file
+   *
+   * @param maxSplits: Int
+   * @return list of {@link CarbonReader} objects
+   */
+  public List split(int maxSplits) throws IOException {
--- End diff --

I feel this method should be moved to builder. Add another method in 
builder `build(int splits)` and return List of readers.


---


[GitHub] carbondata pull request #2850: [CARBONDATA-3056] Added concurrent reading th...

2018-10-30 Thread NamanRastogi
Github user NamanRastogi commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2850#discussion_r229299196
  
--- Diff: 
store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonReader.java ---
@@ -114,6 +117,43 @@ public static CarbonReaderBuilder builder(String 
tablePath) {
 return builder(tablePath, tableName);
   }
 
+  /**
+   * Return a new list of {@link CarbonReader} objects
+   *
--- End diff --

Done!


---