[GitHub] carbondata pull request #2804: [CARBONDATA-2996] CarbonSchemaReader support ...

2018-11-06 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/carbondata/pull/2804


---


[GitHub] carbondata pull request #2804: [CARBONDATA-2996] CarbonSchemaReader support ...

2018-11-06 Thread xubo245
Github user xubo245 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2804#discussion_r231108134
  
--- Diff: 
store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonSchemaReader.java 
---
@@ -61,14 +65,122 @@ public static Schema readSchemaInSchemaFile(String 
schemaFilePath) throws IOExce
 return new Schema(schemaList);
   }
 
+  /**
+   * get carbondata/carbonindex file in path
+   *
+   * @param path carbon file path
+   * @return CarbonFile array
+   */
+  private static CarbonFile[] getCarbonFile(String path, final String 
extension)
+  throws IOException {
+String dataFilePath = path;
+if (!(dataFilePath.contains(extension))) {
+  CarbonFile[] carbonFiles = FileFactory
+  .getCarbonFile(path)
+  .listFiles(new CarbonFileFilter() {
+@Override
+public boolean accept(CarbonFile file) {
+  if (file == null) {
+return false;
+  }
+  return file.getName().endsWith(extension);
+}
+  });
+  if (carbonFiles == null || carbonFiles.length < 1) {
+throw new IOException("Carbon file not exists.");
+  }
+  return carbonFiles;
+}
+return null;
+  }
+
+  /**
+   * read schema from path,
+   * path can be folder path, carbonindex file path, and carbondata file 
path
+   * and will not check all files schema
+   *
+   * @param path file/folder path
+   * @return schema
+   * @throws IOException
+   */
+  public static Schema readSchema(String path) throws IOException {
+return readSchema(path, false);
+  }
+
+  /**
+   * read schema from path,
+   * path can be folder path, carbonindex file path, and carbondata file 
path
+   * and user can decide whether check all files schema
+   *
+   * @param path file/folder path
+   * @param validateSchema whether check all files schema
+   * @return schema
+   * @throws IOException
+   */
+  public static Schema readSchema(String path, boolean validateSchema) 
throws IOException {
+if (path.endsWith(INDEX_FILE_EXT)) {
+  return readSchemaFromIndexFile(path);
+} else if (path.endsWith(CARBON_DATA_EXT)) {
+  return readSchemaFromDataFile(path);
+} else if (validateSchema) {
+  CarbonFile[] carbonIndexFiles = getCarbonFile(path, INDEX_FILE_EXT);
+  Schema schema;
+  if (carbonIndexFiles != null && carbonIndexFiles.length != 0) {
+schema = 
readSchemaFromIndexFile(carbonIndexFiles[0].getAbsolutePath());
+for (int i = 1; i < carbonIndexFiles.length; i++) {
+  Schema schema2 = 
readSchemaFromIndexFile(carbonIndexFiles[i].getAbsolutePath());
+  if (schema != schema2) {
+throw new CarbonDataLoadingException("Schema is different 
between different files.");
+  }
+}
+CarbonFile[] carbonDataFiles = getCarbonFile(path, 
CARBON_DATA_EXT);
+for (int i = 0; i < carbonDataFiles.length; i++) {
+  Schema schema2 = 
readSchemaFromDataFile(carbonDataFiles[i].getAbsolutePath());
+  if (!schema.equals(schema2)) {
+throw new CarbonDataLoadingException("Schema is different 
between different files.");
+  }
+}
+return schema;
+  } else {
+throw new CarbonDataLoadingException("No carbonindex file in this 
path.");
+  }
+} else {
+  String indexFilePath = getCarbonFile(path, 
INDEX_FILE_EXT)[0].getAbsolutePath();
+  if (indexFilePath != null) {
--- End diff --

removed.


---


[GitHub] carbondata pull request #2804: [CARBONDATA-2996] CarbonSchemaReader support ...

2018-11-06 Thread xubo245
Github user xubo245 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2804#discussion_r231107900
  
--- Diff: 
store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonSchemaReader.java 
---
@@ -144,4 +246,28 @@ public static Schema readSchemaInIndexFile(String 
indexFilePath) throws IOExcept
 }
   }
 
+  /**
+   * This method return the version details in formatted string by reading 
from carbondata file
+   *
+   * @param dataFilePath
+   * @return
+   * @throws IOException
+   */
+  public static String getVersionDetails(String dataFilePath) throws 
IOException {
--- End diff --

No, this can't be avoided.


---


[GitHub] carbondata pull request #2804: [CARBONDATA-2996] CarbonSchemaReader support ...

2018-11-06 Thread xubo245
Github user xubo245 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2804#discussion_r231107288
  
--- Diff: 
store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonSchemaReader.java 
---
@@ -61,14 +65,122 @@ public static Schema readSchemaInSchemaFile(String 
schemaFilePath) throws IOExce
 return new Schema(schemaList);
   }
 
+  /**
+   * get carbondata/carbonindex file in path
+   *
+   * @param path carbon file path
+   * @return CarbonFile array
+   */
+  private static CarbonFile[] getCarbonFile(String path, final String 
extension)
+  throws IOException {
+String dataFilePath = path;
+if (!(dataFilePath.contains(extension))) {
+  CarbonFile[] carbonFiles = FileFactory
+  .getCarbonFile(path)
+  .listFiles(new CarbonFileFilter() {
+@Override
+public boolean accept(CarbonFile file) {
+  if (file == null) {
+return false;
+  }
+  return file.getName().endsWith(extension);
+}
+  });
+  if (carbonFiles == null || carbonFiles.length < 1) {
+throw new IOException("Carbon file not exists.");
+  }
+  return carbonFiles;
+}
+return null;
--- End diff --

OK, changed it to throw an exception.


---


[GitHub] carbondata pull request #2804: [CARBONDATA-2996] CarbonSchemaReader support ...

2018-11-06 Thread xubo245
Github user xubo245 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2804#discussion_r231105573
  
--- Diff: 
store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonSchemaReader.java 
---
@@ -61,14 +65,122 @@ public static Schema readSchemaInSchemaFile(String 
schemaFilePath) throws IOExce
 return new Schema(schemaList);
   }
 
+  /**
+   * get carbondata/carbonindex file in path
+   *
+   * @param path carbon file path
+   * @return CarbonFile array
+   */
+  private static CarbonFile[] getCarbonFile(String path, final String 
extension)
+  throws IOException {
+String dataFilePath = path;
+if (!(dataFilePath.contains(extension))) {
+  CarbonFile[] carbonFiles = FileFactory
+  .getCarbonFile(path)
+  .listFiles(new CarbonFileFilter() {
+@Override
+public boolean accept(CarbonFile file) {
+  if (file == null) {
+return false;
+  }
+  return file.getName().endsWith(extension);
+}
+  });
+  if (carbonFiles == null || carbonFiles.length < 1) {
+throw new IOException("Carbon file not exists.");
+  }
+  return carbonFiles;
+}
+return null;
+  }
+
+  /**
+   * read schema from path,
+   * path can be folder path, carbonindex file path, and carbondata file 
path
+   * and will not check all files schema
+   *
+   * @param path file/folder path
+   * @return schema
+   * @throws IOException
+   */
+  public static Schema readSchema(String path) throws IOException {
+return readSchema(path, false);
+  }
+
+  /**
+   * read schema from path,
+   * path can be folder path, carbonindex file path, and carbondata file 
path
+   * and user can decide whether check all files schema
+   *
+   * @param path file/folder path
+   * @param validateSchema whether check all files schema
+   * @return schema
+   * @throws IOException
+   */
+  public static Schema readSchema(String path, boolean validateSchema) 
throws IOException {
+if (path.endsWith(INDEX_FILE_EXT)) {
+  return readSchemaFromIndexFile(path);
+} else if (path.endsWith(CARBON_DATA_EXT)) {
+  return readSchemaFromDataFile(path);
+} else if (validateSchema) {
+  CarbonFile[] carbonIndexFiles = getCarbonFile(path, INDEX_FILE_EXT);
+  Schema schema;
+  if (carbonIndexFiles != null && carbonIndexFiles.length != 0) {
+schema = 
readSchemaFromIndexFile(carbonIndexFiles[0].getAbsolutePath());
+for (int i = 1; i < carbonIndexFiles.length; i++) {
+  Schema schema2 = 
readSchemaFromIndexFile(carbonIndexFiles[i].getAbsolutePath());
+  if (schema != schema2) {
+throw new CarbonDataLoadingException("Schema is different 
between different files.");
+  }
+}
+CarbonFile[] carbonDataFiles = getCarbonFile(path, 
CARBON_DATA_EXT);
+for (int i = 0; i < carbonDataFiles.length; i++) {
+  Schema schema2 = 
readSchemaFromDataFile(carbonDataFiles[i].getAbsolutePath());
+  if (!schema.equals(schema2)) {
+throw new CarbonDataLoadingException("Schema is different 
between different files.");
+  }
+}
+return schema;
+  } else {
+throw new CarbonDataLoadingException("No carbonindex file in this 
path.");
+  }
+} else {
+  String indexFilePath = getCarbonFile(path, 
INDEX_FILE_EXT)[0].getAbsolutePath();
+  if (indexFilePath != null) {
+return readSchemaFromIndexFile(indexFilePath);
+  } else {
+String dataFilePath = getCarbonFile(path, 
CARBON_DATA_EXT)[0].getAbsolutePath();
--- End diff --

Yes, removed the else branch.


---


[GitHub] carbondata pull request #2804: [CARBONDATA-2996] CarbonSchemaReader support ...

2018-11-06 Thread xubo245
Github user xubo245 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2804#discussion_r231101081
  
--- Diff: 
store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonSchemaReader.java 
---
@@ -61,14 +65,122 @@ public static Schema readSchemaInSchemaFile(String 
schemaFilePath) throws IOExce
 return new Schema(schemaList);
   }
 
+  /**
+   * get carbondata/carbonindex file in path
+   *
+   * @param path carbon file path
+   * @return CarbonFile array
+   */
+  private static CarbonFile[] getCarbonFile(String path, final String 
extension)
+  throws IOException {
+String dataFilePath = path;
+if (!(dataFilePath.contains(extension))) {
+  CarbonFile[] carbonFiles = FileFactory
+  .getCarbonFile(path)
+  .listFiles(new CarbonFileFilter() {
+@Override
+public boolean accept(CarbonFile file) {
+  if (file == null) {
+return false;
+  }
+  return file.getName().endsWith(extension);
+}
+  });
+  if (carbonFiles == null || carbonFiles.length < 1) {
+throw new IOException("Carbon file not exists.");
+  }
+  return carbonFiles;
+}
+return null;
+  }
+
+  /**
+   * read schema from path,
+   * path can be folder path, carbonindex file path, and carbondata file 
path
+   * and will not check all files schema
+   *
+   * @param path file/folder path
+   * @return schema
+   * @throws IOException
+   */
+  public static Schema readSchema(String path) throws IOException {
+return readSchema(path, false);
+  }
+
+  /**
+   * read schema from path,
+   * path can be folder path, carbonindex file path, and carbondata file 
path
+   * and user can decide whether check all files schema
+   *
+   * @param path file/folder path
+   * @param validateSchema whether check all files schema
+   * @return schema
+   * @throws IOException
+   */
+  public static Schema readSchema(String path, boolean validateSchema) 
throws IOException {
+if (path.endsWith(INDEX_FILE_EXT)) {
+  return readSchemaFromIndexFile(path);
+} else if (path.endsWith(CARBON_DATA_EXT)) {
+  return readSchemaFromDataFile(path);
+} else if (validateSchema) {
+  CarbonFile[] carbonIndexFiles = getCarbonFile(path, INDEX_FILE_EXT);
+  Schema schema;
+  if (carbonIndexFiles != null && carbonIndexFiles.length != 0) {
+schema = 
readSchemaFromIndexFile(carbonIndexFiles[0].getAbsolutePath());
+for (int i = 1; i < carbonIndexFiles.length; i++) {
+  Schema schema2 = 
readSchemaFromIndexFile(carbonIndexFiles[i].getAbsolutePath());
+  if (schema != schema2) {
--- End diff --

ok, done


---


[GitHub] carbondata pull request #2804: [CARBONDATA-2996] CarbonSchemaReader support ...

2018-11-06 Thread KanakaKumar
Github user KanakaKumar commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2804#discussion_r231061128
  
--- Diff: 
store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonSchemaReader.java 
---
@@ -61,14 +65,122 @@ public static Schema readSchemaInSchemaFile(String 
schemaFilePath) throws IOExce
 return new Schema(schemaList);
   }
 
+  /**
+   * get carbondata/carbonindex file in path
+   *
+   * @param path carbon file path
+   * @return CarbonFile array
+   */
+  private static CarbonFile[] getCarbonFile(String path, final String 
extension)
+  throws IOException {
+String dataFilePath = path;
+if (!(dataFilePath.contains(extension))) {
+  CarbonFile[] carbonFiles = FileFactory
+  .getCarbonFile(path)
+  .listFiles(new CarbonFileFilter() {
+@Override
+public boolean accept(CarbonFile file) {
+  if (file == null) {
+return false;
+  }
+  return file.getName().endsWith(extension);
+}
+  });
+  if (carbonFiles == null || carbonFiles.length < 1) {
+throw new IOException("Carbon file not exists.");
+  }
+  return carbonFiles;
+}
+return null;
+  }
+
+  /**
+   * read schema from path,
+   * path can be folder path, carbonindex file path, and carbondata file 
path
+   * and will not check all files schema
+   *
+   * @param path file/folder path
+   * @return schema
+   * @throws IOException
+   */
+  public static Schema readSchema(String path) throws IOException {
+return readSchema(path, false);
+  }
+
+  /**
+   * read schema from path,
+   * path can be folder path, carbonindex file path, and carbondata file 
path
+   * and user can decide whether check all files schema
+   *
+   * @param path file/folder path
+   * @param validateSchema whether check all files schema
+   * @return schema
+   * @throws IOException
+   */
+  public static Schema readSchema(String path, boolean validateSchema) 
throws IOException {
+if (path.endsWith(INDEX_FILE_EXT)) {
+  return readSchemaFromIndexFile(path);
+} else if (path.endsWith(CARBON_DATA_EXT)) {
+  return readSchemaFromDataFile(path);
+} else if (validateSchema) {
+  CarbonFile[] carbonIndexFiles = getCarbonFile(path, INDEX_FILE_EXT);
+  Schema schema;
+  if (carbonIndexFiles != null && carbonIndexFiles.length != 0) {
+schema = 
readSchemaFromIndexFile(carbonIndexFiles[0].getAbsolutePath());
+for (int i = 1; i < carbonIndexFiles.length; i++) {
+  Schema schema2 = 
readSchemaFromIndexFile(carbonIndexFiles[i].getAbsolutePath());
+  if (schema != schema2) {
+throw new CarbonDataLoadingException("Schema is different 
between different files.");
+  }
+}
+CarbonFile[] carbonDataFiles = getCarbonFile(path, 
CARBON_DATA_EXT);
+for (int i = 0; i < carbonDataFiles.length; i++) {
+  Schema schema2 = 
readSchemaFromDataFile(carbonDataFiles[i].getAbsolutePath());
+  if (!schema.equals(schema2)) {
+throw new CarbonDataLoadingException("Schema is different 
between different files.");
+  }
+}
+return schema;
+  } else {
+throw new CarbonDataLoadingException("No carbonindex file in this 
path.");
+  }
+} else {
+  String indexFilePath = getCarbonFile(path, 
INDEX_FILE_EXT)[0].getAbsolutePath();
+  if (indexFilePath != null) {
--- End diff --

I think this null check is not required. Is there any chance the absolute 
path can be null?


---


[GitHub] carbondata pull request #2804: [CARBONDATA-2996] CarbonSchemaReader support ...

2018-11-06 Thread KanakaKumar
Github user KanakaKumar commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2804#discussion_r231060332
  
--- Diff: 
store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonSchemaReader.java 
---
@@ -144,4 +246,28 @@ public static Schema readSchemaInIndexFile(String 
indexFilePath) throws IOExcept
 }
   }
 
+  /**
+   * This method return the version details in formatted string by reading 
from carbondata file
+   *
+   * @param dataFilePath
+   * @return
+   * @throws IOException
+   */
+  public static String getVersionDetails(String dataFilePath) throws 
IOException {
--- End diff --

This complete method is displayed as removed and added again. Is it 
possible to avoid?


---


[GitHub] carbondata pull request #2804: [CARBONDATA-2996] CarbonSchemaReader support ...

2018-11-06 Thread KanakaKumar
Github user KanakaKumar commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2804#discussion_r231059908
  
--- Diff: 
store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonSchemaReader.java 
---
@@ -61,14 +65,122 @@ public static Schema readSchemaInSchemaFile(String 
schemaFilePath) throws IOExce
 return new Schema(schemaList);
   }
 
+  /**
+   * get carbondata/carbonindex file in path
+   *
+   * @param path carbon file path
+   * @return CarbonFile array
+   */
+  private static CarbonFile[] getCarbonFile(String path, final String 
extension)
+  throws IOException {
+String dataFilePath = path;
+if (!(dataFilePath.contains(extension))) {
+  CarbonFile[] carbonFiles = FileFactory
+  .getCarbonFile(path)
+  .listFiles(new CarbonFileFilter() {
+@Override
+public boolean accept(CarbonFile file) {
+  if (file == null) {
+return false;
+  }
+  return file.getName().endsWith(extension);
+}
+  });
+  if (carbonFiles == null || carbonFiles.length < 1) {
+throw new IOException("Carbon file not exists.");
+  }
+  return carbonFiles;
+}
+return null;
--- End diff --

We should stick to one contract for this method: either return the list or 
throw an exception. Generally, listing APIs should not return null; if this 
case is not expected, we can throw an exception.
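
A minimal sketch of the single-contract idea, assuming plain java.io.File instead of CarbonFile so that it stays self-contained. The class and method names (ListingContract, listByExtension) are hypothetical and not part of the pull request. The method either returns a non-empty array or throws IOException, so callers never need a null check.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical helper, not the CarbonData implementation.
class ListingContract {
  /**
   * Lists files in a folder whose names end with the given extension.
   * Contract: returns a non-empty array, or throws IOException. Never null.
   */
  static File[] listByExtension(String folder, final String extension)
      throws IOException {
    // File.listFiles returns null when the path is not a readable directory.
    File[] files = new File(folder).listFiles((dir, name) -> name.endsWith(extension));
    if (files == null || files.length == 0) {
      throw new IOException(
          "No file with extension " + extension + " found in " + folder);
    }
    return files;
  }
}
```

With this shape, a caller can index into the result directly and let the IOException propagate when nothing matches.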


---


[GitHub] carbondata pull request #2804: [CARBONDATA-2996] CarbonSchemaReader support ...

2018-11-06 Thread KanakaKumar
Github user KanakaKumar commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2804#discussion_r231058670
  
--- Diff: 
store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonSchemaReader.java 
---
@@ -61,14 +65,122 @@ public static Schema readSchemaInSchemaFile(String 
schemaFilePath) throws IOExce
 return new Schema(schemaList);
   }
 
+  /**
+   * get carbondata/carbonindex file in path
+   *
+   * @param path carbon file path
+   * @return CarbonFile array
+   */
+  private static CarbonFile[] getCarbonFile(String path, final String 
extension)
+  throws IOException {
+String dataFilePath = path;
+if (!(dataFilePath.contains(extension))) {
+  CarbonFile[] carbonFiles = FileFactory
+  .getCarbonFile(path)
+  .listFiles(new CarbonFileFilter() {
+@Override
+public boolean accept(CarbonFile file) {
+  if (file == null) {
+return false;
+  }
+  return file.getName().endsWith(extension);
+}
+  });
+  if (carbonFiles == null || carbonFiles.length < 1) {
+throw new IOException("Carbon file not exists.");
+  }
+  return carbonFiles;
+}
+return null;
+  }
+
+  /**
+   * read schema from path,
+   * path can be folder path, carbonindex file path, and carbondata file 
path
+   * and will not check all files schema
+   *
+   * @param path file/folder path
+   * @return schema
+   * @throws IOException
+   */
+  public static Schema readSchema(String path) throws IOException {
+return readSchema(path, false);
+  }
+
+  /**
+   * read schema from path,
+   * path can be folder path, carbonindex file path, and carbondata file 
path
+   * and user can decide whether check all files schema
+   *
+   * @param path file/folder path
+   * @param validateSchema whether check all files schema
+   * @return schema
+   * @throws IOException
+   */
+  public static Schema readSchema(String path, boolean validateSchema) 
throws IOException {
+if (path.endsWith(INDEX_FILE_EXT)) {
+  return readSchemaFromIndexFile(path);
+} else if (path.endsWith(CARBON_DATA_EXT)) {
+  return readSchemaFromDataFile(path);
+} else if (validateSchema) {
+  CarbonFile[] carbonIndexFiles = getCarbonFile(path, INDEX_FILE_EXT);
+  Schema schema;
+  if (carbonIndexFiles != null && carbonIndexFiles.length != 0) {
+schema = 
readSchemaFromIndexFile(carbonIndexFiles[0].getAbsolutePath());
+for (int i = 1; i < carbonIndexFiles.length; i++) {
+  Schema schema2 = 
readSchemaFromIndexFile(carbonIndexFiles[i].getAbsolutePath());
+  if (schema != schema2) {
+throw new CarbonDataLoadingException("Schema is different 
between different files.");
+  }
+}
+CarbonFile[] carbonDataFiles = getCarbonFile(path, 
CARBON_DATA_EXT);
+for (int i = 0; i < carbonDataFiles.length; i++) {
+  Schema schema2 = 
readSchemaFromDataFile(carbonDataFiles[i].getAbsolutePath());
+  if (!schema.equals(schema2)) {
+throw new CarbonDataLoadingException("Schema is different 
between different files.");
+  }
+}
+return schema;
+  } else {
+throw new CarbonDataLoadingException("No carbonindex file in this 
path.");
+  }
+} else {
+  String indexFilePath = getCarbonFile(path, 
INDEX_FILE_EXT)[0].getAbsolutePath();
+  if (indexFilePath != null) {
+return readSchemaFromIndexFile(indexFilePath);
+  } else {
+String dataFilePath = getCarbonFile(path, 
CARBON_DATA_EXT)[0].getAbsolutePath();
--- End diff --

As per the getCarbonFile(...) implementation, an exception is thrown if no 
index file is found. So is this else branch needed at all?
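
To illustrate the point, here is a rough sketch with hypothetical names (DeadBranch, firstIndexFile, pickSchemaSource) and an assumed ".carbonindex" extension. When the lookup throws on a miss instead of returning null, a null check and its else branch after the call can never run, so the caller reduces to a single statement.

```java
import java.io.IOException;

// Hypothetical stand-in for the control flow under review.
class DeadBranch {
  // Returns the first index file name, or throws if there is none.
  static String firstIndexFile(String[] names) throws IOException {
    for (String name : names) {
      if (name.endsWith(".carbonindex")) { // assumed extension value
        return name;
      }
    }
    throw new IOException("No carbonindex file found.");
  }

  // No null check and no else branch: a missing index file already
  // surfaces as an IOException from firstIndexFile.
  static String pickSchemaSource(String[] names) throws IOException {
    return firstIndexFile(names);
  }
}
```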


---


[GitHub] carbondata pull request #2804: [CARBONDATA-2996] CarbonSchemaReader support ...

2018-11-06 Thread KanakaKumar
Github user KanakaKumar commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2804#discussion_r231057030
  
--- Diff: 
store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonSchemaReader.java 
---
@@ -61,14 +65,122 @@ public static Schema readSchemaInSchemaFile(String 
schemaFilePath) throws IOExce
 return new Schema(schemaList);
   }
 
+  /**
+   * get carbondata/carbonindex file in path
+   *
+   * @param path carbon file path
+   * @return CarbonFile array
+   */
+  private static CarbonFile[] getCarbonFile(String path, final String 
extension)
+  throws IOException {
+String dataFilePath = path;
+if (!(dataFilePath.contains(extension))) {
+  CarbonFile[] carbonFiles = FileFactory
+  .getCarbonFile(path)
+  .listFiles(new CarbonFileFilter() {
+@Override
+public boolean accept(CarbonFile file) {
+  if (file == null) {
+return false;
+  }
+  return file.getName().endsWith(extension);
+}
+  });
+  if (carbonFiles == null || carbonFiles.length < 1) {
+throw new IOException("Carbon file not exists.");
+  }
+  return carbonFiles;
+}
+return null;
+  }
+
+  /**
+   * read schema from path,
+   * path can be folder path, carbonindex file path, and carbondata file 
path
+   * and will not check all files schema
+   *
+   * @param path file/folder path
+   * @return schema
+   * @throws IOException
+   */
+  public static Schema readSchema(String path) throws IOException {
+return readSchema(path, false);
+  }
+
+  /**
+   * read schema from path,
+   * path can be folder path, carbonindex file path, and carbondata file 
path
+   * and user can decide whether check all files schema
+   *
+   * @param path file/folder path
+   * @param validateSchema whether check all files schema
+   * @return schema
+   * @throws IOException
+   */
+  public static Schema readSchema(String path, boolean validateSchema) 
throws IOException {
+if (path.endsWith(INDEX_FILE_EXT)) {
+  return readSchemaFromIndexFile(path);
+} else if (path.endsWith(CARBON_DATA_EXT)) {
+  return readSchemaFromDataFile(path);
+} else if (validateSchema) {
+  CarbonFile[] carbonIndexFiles = getCarbonFile(path, INDEX_FILE_EXT);
+  Schema schema;
+  if (carbonIndexFiles != null && carbonIndexFiles.length != 0) {
+schema = 
readSchemaFromIndexFile(carbonIndexFiles[0].getAbsolutePath());
+for (int i = 1; i < carbonIndexFiles.length; i++) {
+  Schema schema2 = 
readSchemaFromIndexFile(carbonIndexFiles[i].getAbsolutePath());
+  if (schema != schema2) {
--- End diff --

Use equals here: schema.equals(schema2)
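
A small sketch of why the reference comparison is a problem. FieldList is a hypothetical stand-in for Schema, and the sketch assumes the real class overrides equals, which this thread does not confirm. Two objects read from two different files are always distinct references, so `schema != schema2` is true even when the contents match.

```java
import java.util.Arrays;

// Hypothetical stand-in for a schema type with value-based equality.
final class FieldList {
  private final String[] names;

  FieldList(String... names) {
    this.names = names.clone();
  }

  @Override
  public boolean equals(Object o) {
    return o instanceof FieldList && Arrays.equals(names, ((FieldList) o).names);
  }

  @Override
  public int hashCode() {
    return Arrays.hashCode(names);
  }
}

class EqualsVsIdentity {
  public static void main(String[] args) {
    FieldList a = new FieldList("id", "name");
    FieldList b = new FieldList("id", "name");
    System.out.println(a != b);      // true: two distinct objects
    System.out.println(a.equals(b)); // true: same field names
  }
}
```

If the class does not override equals, the call falls back to Object.equals, which compares references, so the override is part of the fix.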


---


[GitHub] carbondata pull request #2804: [CARBONDATA-2996] CarbonSchemaReader support ...

2018-11-05 Thread xubo245
Github user xubo245 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2804#discussion_r230983007
  
--- Diff: 
store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonSchemaReader.java 
---
@@ -61,14 +65,121 @@ public static Schema readSchemaInSchemaFile(String 
schemaFilePath) throws IOExce
 return new Schema(schemaList);
   }
 
+  /**
+   * get carbondata/carbonindex file in path
+   *
+   * @param path carbon file path
+   * @return CarbonFile array
+   */
+  private static CarbonFile[] getCarbonFile(String path, final String 
extension) {
+String dataFilePath = path;
+if (!(dataFilePath.contains(extension))) {
+  CarbonFile[] carbonFiles = FileFactory
+  .getCarbonFile(path)
+  .listFiles(new CarbonFileFilter() {
+@Override
+public boolean accept(CarbonFile file) {
+  if (file == null) {
+return false;
+  }
+  return file.getName().endsWith(extension);
+}
+  });
+  if (carbonFiles == null || carbonFiles.length < 1) {
+throw new RuntimeException("Carbon file not exists.");
--- End diff --

ok, done


---


[GitHub] carbondata pull request #2804: [CARBONDATA-2996] CarbonSchemaReader support ...

2018-11-05 Thread xubo245
Github user xubo245 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2804#discussion_r230982799
  
--- Diff: 
store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonSchemaReader.java 
---
@@ -61,14 +65,121 @@ public static Schema readSchemaInSchemaFile(String 
schemaFilePath) throws IOExce
 return new Schema(schemaList);
   }
 
+  /**
+   * get carbondata/carbonindex file in path
+   *
+   * @param path carbon file path
+   * @return CarbonFile array
+   */
+  private static CarbonFile[] getCarbonFile(String path, final String 
extension) {
+String dataFilePath = path;
+if (!(dataFilePath.contains(extension))) {
+  CarbonFile[] carbonFiles = FileFactory
+  .getCarbonFile(path)
+  .listFiles(new CarbonFileFilter() {
+@Override
+public boolean accept(CarbonFile file) {
+  if (file == null) {
+return false;
+  }
+  return file.getName().endsWith(extension);
+}
+  });
+  if (carbonFiles == null || carbonFiles.length < 1) {
+throw new RuntimeException("Carbon file not exists.");
+  }
+  return carbonFiles;
+}
+return null;
+  }
+
+  /**
+   * read schema from path,
+   * path can be folder path, carbonindex file path, and carbondata file 
path
+   * and will not check all files schema
+   *
+   * @param path file/folder path
+   * @return schema
+   * @throws IOException
+   */
+  public static Schema readSchema(String path) throws IOException {
+return readSchema(path, false);
+  }
+
+  /**
+   * read schema from path,
+   * path can be folder path, carbonindex file path, and carbondata file 
path
+   * and user can decide whether check all files schema
+   *
+   * @param path file/folder path
+   * @param checkFilesSchema whether check all files schema
+   * @return schema
+   * @throws IOException
+   */
+  public static Schema readSchema(String path, boolean checkFilesSchema) 
throws IOException {
--- End diff --

When users only want to check the schema and do not need to query data, they 
can use readSchema. readSchema will also be faster.


---


[GitHub] carbondata pull request #2804: [CARBONDATA-2996] CarbonSchemaReader support ...

2018-11-05 Thread xubo245
Github user xubo245 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2804#discussion_r230982638
  
--- Diff: 
store/sdk/src/test/java/org/apache/carbondata/sdk/file/CarbonSchemaReaderTest.java
 ---
@@ -101,18 +104,30 @@ public boolean accept(CarbonFile file) {
   String dataFilePath = carbonFiles[0].getAbsolutePath();
 
   Schema schema = CarbonSchemaReader
-  .readSchemaInDataFile(dataFilePath)
+  .readSchema(dataFilePath)
   .asOriginOrder();
 
   assertEquals(schema.getFieldsLength(), 12);
   checkSchema(schema);
+} catch (Throwable e) {
+  e.printStackTrace();
--- End diff --

OK, done. Added Assert.fail();


---


[GitHub] carbondata pull request #2804: [CARBONDATA-2996] CarbonSchemaReader support ...

2018-11-05 Thread xubo245
Github user xubo245 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2804#discussion_r230982398
  
--- Diff: docs/sdk-guide.md ---
@@ -685,6 +685,31 @@ Find example code at 
[CarbonReaderExample](https://github.com/apache/carbondata/
*/
   public static Schema readSchemaInIndexFile(String indexFilePath);
 ```
+```
+  /**
+   * read schema from path,
+   * path can be folder path,carbonindex file path, and carbondata file 
path
+   * and will not check all files schema
+   *
+   * @param path file/folder path
+   * @return schema
+   * @throws IOException
+   */
+  public static Schema readSchema(String path);
+```
+```
+  /**
+   * read schema from path,
+   * path can be folder path,carbonindex file path, and carbondata file 
path
+   * and user can decide whether check all files schema
+   *
+   * @param path file/folder path
+   * @param checkFilesSchema whether check all files schema
+   * @return schema
+   * @throws IOException
+   */
+  public static Schema readSchema(String path, boolean checkFilesSchema);
--- End diff --

ok, done


---


[GitHub] carbondata pull request #2804: [CARBONDATA-2996] CarbonSchemaReader support ...

2018-11-05 Thread KanakaKumar
Github user KanakaKumar commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2804#discussion_r230802487
  
--- Diff: 
store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonSchemaReader.java 
---
@@ -61,14 +65,121 @@ public static Schema readSchemaInSchemaFile(String 
schemaFilePath) throws IOExce
 return new Schema(schemaList);
   }
 
+  /**
+   * get carbondata/carbonindex file in path
+   *
+   * @param path carbon file path
+   * @return CarbonFile array
+   */
+  private static CarbonFile[] getCarbonFile(String path, final String 
extension) {
+String dataFilePath = path;
+if (!(dataFilePath.contains(extension))) {
+  CarbonFile[] carbonFiles = FileFactory
+  .getCarbonFile(path)
+  .listFiles(new CarbonFileFilter() {
+@Override
+public boolean accept(CarbonFile file) {
+  if (file == null) {
+return false;
+  }
+  return file.getName().endsWith(extension);
+}
+  });
+  if (carbonFiles == null || carbonFiles.length < 1) {
+throw new RuntimeException("Carbon file not exists.");
--- End diff --

Why RuntimeException? IO-related failures should throw IOException.


---


[GitHub] carbondata pull request #2804: [CARBONDATA-2996] CarbonSchemaReader support ...

2018-11-05 Thread KanakaKumar
Github user KanakaKumar commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2804#discussion_r230801853
  
--- Diff: 
store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonSchemaReader.java 
---
@@ -61,14 +65,121 @@ public static Schema readSchemaInSchemaFile(String 
schemaFilePath) throws IOExce
 return new Schema(schemaList);
   }
 
+  /**
+   * get carbondata/carbonindex file in path
+   *
+   * @param path carbon file path
+   * @return CarbonFile array
+   */
+  private static CarbonFile[] getCarbonFile(String path, final String 
extension) {
+String dataFilePath = path;
+if (!(dataFilePath.contains(extension))) {
+  CarbonFile[] carbonFiles = FileFactory
+  .getCarbonFile(path)
+  .listFiles(new CarbonFileFilter() {
+@Override
+public boolean accept(CarbonFile file) {
+  if (file == null) {
+return false;
+  }
+  return file.getName().endsWith(extension);
+}
+  });
+  if (carbonFiles == null || carbonFiles.length < 1) {
+throw new RuntimeException("Carbon file not exists.");
+  }
+  return carbonFiles;
+}
+return null;
+  }
+
+  /**
+   * read schema from path,
+   * path can be folder path, carbonindex file path, and carbondata file 
path
+   * and will not check all files schema
+   *
+   * @param path file/folder path
+   * @return schema
+   * @throws IOException
+   */
+  public static Schema readSchema(String path) throws IOException {
+return readSchema(path, false);
+  }
+
+  /**
+   * read schema from path,
+   * path can be folder path, carbonindex file path, and carbondata file 
path
+   * and user can decide whether check all files schema
+   *
+   * @param path file/folder path
+   * @param checkFilesSchema whether check all files schema
+   * @return schema
+   * @throws IOException
+   */
+  public static Schema readSchema(String path, boolean checkFilesSchema) 
throws IOException {
--- End diff --

readSchema(String path, boolean checkFilesSchema) 
-- Is this schema validation method required? If there is no use case, we can 
skip it. The schema is validated during query execution anyway.


---


[GitHub] carbondata pull request #2804: [CARBONDATA-2996] CarbonSchemaReader support ...

2018-11-05 Thread KanakaKumar
Github user KanakaKumar commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2804#discussion_r230799912
  
--- Diff: 
store/sdk/src/test/java/org/apache/carbondata/sdk/file/CarbonSchemaReaderTest.java
 ---
@@ -101,18 +104,30 @@ public boolean accept(CarbonFile file) {
   String dataFilePath = carbonFiles[0].getAbsolutePath();
 
   Schema schema = CarbonSchemaReader
-  .readSchemaInDataFile(dataFilePath)
+  .readSchema(dataFilePath)
   .asOriginOrder();
 
   assertEquals(schema.getFieldsLength(), 12);
   checkSchema(schema);
+} catch (Throwable e) {
+  e.printStackTrace();
--- End diff --

The test should fail here instead of only printing the stack trace.
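
A small sketch of why the catch block hides failures, written in plain Java so it runs without JUnit. All names here (SwallowedFailureSketch, readSchemaStandIn and the two test methods) are hypothetical stand-ins, not code from the PR.

```java
import java.io.IOException;

final class SwallowedFailureSketch {
  // Stand-in for the call under test; it always fails.
  static void readSchemaStandIn() throws IOException {
    throw new IOException("Carbon file not exists.");
  }

  // Pattern from the diff: the exception is printed and the method
  // returns normally, so a test runner would report success.
  static void swallowingTest() {
    try {
      readSchemaStandIn();
    } catch (Throwable e) {
      e.printStackTrace();
    }
  }

  // Alternative: declare the exception and let it propagate,
  // so the test runner sees the failure.
  static void propagatingTest() throws IOException {
    readSchemaStandIn();
  }
}
```

In a JUnit test, the same effect can be had by declaring the exception on the test method, or by calling fail() inside the catch block.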


---


[GitHub] carbondata pull request #2804: [CARBONDATA-2996] CarbonSchemaReader support ...

2018-11-05 Thread KanakaKumar
Github user KanakaKumar commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2804#discussion_r230788337
  
--- Diff: docs/sdk-guide.md ---
@@ -685,6 +685,31 @@ Find example code at 
[CarbonReaderExample](https://github.com/apache/carbondata/
*/
   public static Schema readSchemaInIndexFile(String indexFilePath);
 ```
+```
+  /**
+   * read schema from path,
+   * path can be folder path,carbonindex file path, and carbondata file 
path
+   * and will not check all files schema
+   *
+   * @param path file/folder path
+   * @return schema
+   * @throws IOException
+   */
+  public static Schema readSchema(String path);
+```
+```
+  /**
+   * read schema from path,
+   * path can be folder path,carbonindex file path, and carbondata file 
path
+   * and user can decide whether check all files schema
+   *
+   * @param path file/folder path
+   * @param checkFilesSchema whether check all files schema
+   * @return schema
+   * @throws IOException
+   */
+  public static Schema readSchema(String path, boolean checkFilesSchema);
--- End diff --

checkFilesSchema should be renamed to validateSchema.


---


[GitHub] carbondata pull request #2804: [CARBONDATA-2996] CarbonSchemaReader support ...

2018-10-31 Thread xubo245
Github user xubo245 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2804#discussion_r229608660
  
--- Diff: 
store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonSchemaReader.java 
---
@@ -64,11 +66,70 @@ public static Schema readSchemaInSchemaFile(String 
schemaFilePath) throws IOExce
   /**
* Read carbondata file and return the schema
*
-   * @param dataFilePath complete path including carbondata file name
+   * @param path carbondata store path
* @return Schema object
* @throws IOException
*/
-  public static Schema readSchemaInDataFile(String dataFilePath) throws 
IOException {
+  public static Schema readSchemaFromFirstDataFile(String path) throws 
IOException {
+String dataFilePath = getFirstCarbonDataFile(path);
+return readSchemaInDataFile(dataFilePath);
+  }
+
+  /**
+   * get first carbondata file in path and don't check all files schema
+   *
+   * @param path carbondata file path
+   * @return first carbondata file name
+   */
+  public static String getFirstCarbonDataFile(String path) {
--- End diff --

OK, I misunderstood, sorry.
Updated.


---


[GitHub] carbondata pull request #2804: [CARBONDATA-2996] CarbonSchemaReader support ...

2018-10-31 Thread ajantha-bhat
Github user ajantha-bhat commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2804#discussion_r229583983
  
--- Diff: 
store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonSchemaReader.java 
---
@@ -64,11 +66,70 @@ public static Schema readSchemaInSchemaFile(String 
schemaFilePath) throws IOExce
   /**
* Read carbondata file and return the schema
*
-   * @param dataFilePath complete path including carbondata file name
+   * @param path carbondata store path
* @return Schema object
* @throws IOException
*/
-  public static Schema readSchemaInDataFile(String dataFilePath) throws 
IOException {
+  public static Schema readSchemaFromFirstDataFile(String path) throws 
IOException {
+String dataFilePath = getFirstCarbonDataFile(path);
+return readSchemaInDataFile(dataFilePath);
+  }
+
+  /**
+   * get first carbondata file in path and don't check all files schema
+   *
+   * @param path carbondata file path
+   * @return first carbondata file name
+   */
+  public static String getFirstCarbonDataFile(String path) {
--- End diff --

I have already suggested keeping getFirstCarbonFile(path, extension). 
It alone will return the data or index file based on the extension. 

There is no need for duplicate code for the index and data files.


---


[GitHub] carbondata pull request #2804: [CARBONDATA-2996] CarbonSchemaReader support ...

2018-10-30 Thread xubo245
Github user xubo245 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2804#discussion_r229198062
  
--- Diff: 
store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonSchemaReader.java 
---
@@ -59,11 +60,30 @@ public static Schema readSchemaInSchemaFile(String 
schemaFilePath) throws IOExce
   /**
* Read carbondata file and return the schema
*
-   * @param dataFilePath complete path including carbondata file name
+   * @param path complete path including carbondata file name
* @return Schema object
* @throws IOException
*/
-  public static Schema readSchemaInDataFile(String dataFilePath) throws 
IOException {
+  public static Schema readSchemaInDataFile(String path) throws 
IOException {
+String dataFilePath = path;
+if (!(dataFilePath.contains(".carbondata"))) {
+  CarbonFile[] carbonFiles = FileFactory
+  .getCarbonFile(path)
+  .listFiles(new CarbonFileFilter() {
+@Override
+public boolean accept(CarbonFile file) {
+  if (file == null) {
+return false;
+  }
+  return file.getName().endsWith(".carbondata");
+}
+  });
+  if (carbonFiles == null || carbonFiles.length < 1) {
+throw new RuntimeException("Carbon data file not exists.");
+  }
+  dataFilePath = carbonFiles[0].getAbsolutePath();
--- End diff --

OK, I added ReadSchemaFromFirstDataFile and ReadSchemaFromFirstIndexFile.


---


[GitHub] carbondata pull request #2804: [CARBONDATA-2996] CarbonSchemaReader support ...

2018-10-30 Thread xubo245
Github user xubo245 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2804#discussion_r229194361
  
--- Diff: 
store/sdk/src/test/java/org/apache/carbondata/sdk/file/CarbonSchemaReaderTest.java
 ---
@@ -0,0 +1,170 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.sdk.file;
+
+import java.io.File;
+import java.util.HashMap;
+import java.util.Map;
+
+import junit.framework.TestCase;
+import org.apache.carbondata.core.metadata.datatype.DataTypes;
+import org.apache.commons.io.FileUtils;
+import org.junit.Test;
+
+public class CarbonSchemaReaderTest extends TestCase {
+
+  @Test
+  public void testReadSchemaFromDataFile() {
+String path = "./testWriteFiles";
+try {
+  FileUtils.deleteDirectory(new File(path));
+
+  Field[] fields = new Field[11];
+  fields[0] = new Field("stringField", DataTypes.STRING);
+  fields[1] = new Field("shortField", DataTypes.SHORT);
+  fields[2] = new Field("intField", DataTypes.INT);
+  fields[3] = new Field("longField", DataTypes.LONG);
+  fields[4] = new Field("doubleField", DataTypes.DOUBLE);
+  fields[5] = new Field("boolField", DataTypes.BOOLEAN);
+  fields[6] = new Field("dateField", DataTypes.DATE);
+  fields[7] = new Field("timeField", DataTypes.TIMESTAMP);
+  fields[8] = new Field("decimalField", DataTypes.createDecimalType(8, 
2));
+  fields[9] = new Field("varcharField", DataTypes.VARCHAR);
+  fields[10] = new Field("arrayField", 
DataTypes.createArrayType(DataTypes.STRING));
+  Map map = new HashMap<>();
+  map.put("complex_delimiter_level_1", "#");
+  CarbonWriter writer = CarbonWriter.builder()
+  .outputPath(path)
+  .withLoadOptions(map)
+  .withCsvInput(new Schema(fields)).build();
+
+  for (int i = 0; i < 10; i++) {
+String[] row2 = new String[]{
+"robot" + (i % 10),
+String.valueOf(i % 1),
+String.valueOf(i),
+String.valueOf(Long.MAX_VALUE - i),
+String.valueOf((double) i / 2),
+String.valueOf(true),
+"2019-03-02",
+"2019-02-12 03:03:34",
+"12.345",
+"varchar",
+"Hello#World#From#Carbon"
+};
+writer.write(row2);
+  }
+  writer.close();
+
+  Schema schema = CarbonSchemaReader
+  .readSchemaInDataFile(path)
+  .asOriginOrder();
+  // Transform the schema
+  assertEquals(schema.getFields().length, 11);
+  String[] strings = new String[schema.getFields().length];
+  for (int i = 0; i < schema.getFields().length; i++) {
+strings[i] = (schema.getFields())[i].getFieldName();
+  }
+  assert (strings[0].equalsIgnoreCase("stringField"));
+  assert (strings[1].equalsIgnoreCase("shortField"));
+  assert (strings[2].equalsIgnoreCase("intField"));
+  assert (strings[3].equalsIgnoreCase("longField"));
+  assert (strings[4].equalsIgnoreCase("doubleField"));
--- End diff --

ok, done


---


[GitHub] carbondata pull request #2804: [CARBONDATA-2996] CarbonSchemaReader support ...

2018-10-30 Thread xubo245
Github user xubo245 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2804#discussion_r229194343
  
--- Diff: 
store/sdk/src/test/java/org/apache/carbondata/sdk/file/CarbonSchemaReaderTest.java
 ---
@@ -0,0 +1,170 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.sdk.file;
+
+import java.io.File;
+import java.util.HashMap;
+import java.util.Map;
+
+import junit.framework.TestCase;
+import org.apache.carbondata.core.metadata.datatype.DataTypes;
+import org.apache.commons.io.FileUtils;
+import org.junit.Test;
+
+public class CarbonSchemaReaderTest extends TestCase {
+
+  @Test
+  public void testReadSchemaFromDataFile() {
+String path = "./testWriteFiles";
+try {
+  FileUtils.deleteDirectory(new File(path));
+
+  Field[] fields = new Field[11];
+  fields[0] = new Field("stringField", DataTypes.STRING);
--- End diff --

ok, done


---


[GitHub] carbondata pull request #2804: [CARBONDATA-2996] CarbonSchemaReader support ...

2018-10-29 Thread ajantha-bhat
Github user ajantha-bhat commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2804#discussion_r229173861
  
--- Diff: 
store/sdk/src/test/java/org/apache/carbondata/sdk/file/CarbonSchemaReaderTest.java
 ---
@@ -0,0 +1,170 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.sdk.file;
+
+import java.io.File;
+import java.util.HashMap;
+import java.util.Map;
+
+import junit.framework.TestCase;
+import org.apache.carbondata.core.metadata.datatype.DataTypes;
+import org.apache.commons.io.FileUtils;
+import org.junit.Test;
+
+public class CarbonSchemaReaderTest extends TestCase {
+
+  @Test
+  public void testReadSchemaFromDataFile() {
+String path = "./testWriteFiles";
+try {
+  FileUtils.deleteDirectory(new File(path));
+
+  Field[] fields = new Field[11];
+  fields[0] = new Field("stringField", DataTypes.STRING);
+  fields[1] = new Field("shortField", DataTypes.SHORT);
+  fields[2] = new Field("intField", DataTypes.INT);
+  fields[3] = new Field("longField", DataTypes.LONG);
+  fields[4] = new Field("doubleField", DataTypes.DOUBLE);
+  fields[5] = new Field("boolField", DataTypes.BOOLEAN);
+  fields[6] = new Field("dateField", DataTypes.DATE);
+  fields[7] = new Field("timeField", DataTypes.TIMESTAMP);
+  fields[8] = new Field("decimalField", DataTypes.createDecimalType(8, 
2));
+  fields[9] = new Field("varcharField", DataTypes.VARCHAR);
+  fields[10] = new Field("arrayField", 
DataTypes.createArrayType(DataTypes.STRING));
+  Map map = new HashMap<>();
+  map.put("complex_delimiter_level_1", "#");
+  CarbonWriter writer = CarbonWriter.builder()
+  .outputPath(path)
+  .withLoadOptions(map)
+  .withCsvInput(new Schema(fields)).build();
+
+  for (int i = 0; i < 10; i++) {
+String[] row2 = new String[]{
+"robot" + (i % 10),
+String.valueOf(i % 1),
+String.valueOf(i),
+String.valueOf(Long.MAX_VALUE - i),
+String.valueOf((double) i / 2),
+String.valueOf(true),
+"2019-03-02",
+"2019-02-12 03:03:34",
+"12.345",
+"varchar",
+"Hello#World#From#Carbon"
+};
+writer.write(row2);
+  }
+  writer.close();
+
+  Schema schema = CarbonSchemaReader
+  .readSchemaInDataFile(path)
+  .asOriginOrder();
+  // Transform the schema
+  assertEquals(schema.getFields().length, 11);
+  String[] strings = new String[schema.getFields().length];
+  for (int i = 0; i < schema.getFields().length; i++) {
+strings[i] = (schema.getFields())[i].getFieldName();
+  }
+  assert (strings[0].equalsIgnoreCase("stringField"));
+  assert (strings[1].equalsIgnoreCase("shortField"));
+  assert (strings[2].equalsIgnoreCase("intField"));
+  assert (strings[3].equalsIgnoreCase("longField"));
+  assert (strings[4].equalsIgnoreCase("doubleField"));
--- End diff --

These assertions can be moved into a method and used by both test cases.
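
A rough sketch of such a shared check, in plain Java without JUnit so it stands alone. The class name, the EXPECTED array and checkFieldNames are hypothetical; only the five field names come from the quoted test.

```java
final class SharedSchemaCheckSketch {
  // Field names asserted in the quoted test, in the original order.
  static final String[] EXPECTED = {
      "stringField", "shortField", "intField", "longField", "doubleField"
  };

  // Hypothetical shared helper: each test case would call this
  // instead of repeating the per-field assertions.
  static void checkFieldNames(String[] actual) {
    if (actual.length != EXPECTED.length) {
      throw new AssertionError("expected " + EXPECTED.length + " fields, got " + actual.length);
    }
    for (int i = 0; i < EXPECTED.length; i++) {
      if (!EXPECTED[i].equalsIgnoreCase(actual[i])) {
        throw new AssertionError("field " + i + ": expected " + EXPECTED[i] + ", got " + actual[i]);
      }
    }
  }
}
```

Unlike the bare assert statements in the diff, this check does not depend on the JVM running with -ea.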



---


[GitHub] carbondata pull request #2804: [CARBONDATA-2996] CarbonSchemaReader support ...

2018-10-29 Thread ajantha-bhat
Github user ajantha-bhat commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2804#discussion_r229173916
  
--- Diff: 
store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonSchemaReader.java 
---
@@ -59,11 +60,30 @@ public static Schema readSchemaInSchemaFile(String 
schemaFilePath) throws IOExce
   /**
* Read carbondata file and return the schema
*
-   * @param dataFilePath complete path including carbondata file name
+   * @param path complete path including carbondata file name
* @return Schema object
* @throws IOException
*/
-  public static Schema readSchemaInDataFile(String dataFilePath) throws 
IOException {
+  public static Schema readSchemaInDataFile(String path) throws 
IOException {
+String dataFilePath = path;
+if (!(dataFilePath.contains(".carbondata"))) {
+  CarbonFile[] carbonFiles = FileFactory
+  .getCarbonFile(path)
+  .listFiles(new CarbonFileFilter() {
+@Override
+public boolean accept(CarbonFile file) {
+  if (file == null) {
+return false;
+  }
+  return file.getName().endsWith(".carbondata");
+}
+  });
+  if (carbonFiles == null || carbonFiles.length < 1) {
+throw new RuntimeException("Carbon data file not exists.");
+  }
+  dataFilePath = carbonFiles[0].getAbsolutePath();
--- End diff --

In that case you can implement

String getFirstCarbonFile(path, ExtensionType)

and pass its result to the existing method. ReadSchemaFromFile() must only read the file. It 
should not do any extra work.
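
Roughly, the helper could look like the sketch below. It is an assumption-laden illustration: it uses java.io.File rather than CarbonFile/FileFactory, takes the extension as a plain String, sorts the listing so that "first" is deterministic, and throws IOException; none of that is confirmed by the PR.

```java
import java.io.File;
import java.io.IOException;
import java.util.Arrays;

final class FirstCarbonFileSketch {
  // Hypothetical helper: resolves a folder to the first file with the
  // given extension. Reading the schema stays in the existing method.
  static String getFirstCarbonFile(String path, final String extension) throws IOException {
    // listFiles returns null if path is not a directory or cannot be read.
    File[] files = new File(path).listFiles((dir, name) -> name.endsWith(extension));
    if (files == null || files.length < 1) {
      throw new IOException("No " + extension + " file found in " + path);
    }
    // listFiles makes no ordering guarantee, so sort by path name first.
    Arrays.sort(files);
    return files[0].getAbsolutePath();
  }
}
```

The caller would then pass the result on, for example readSchemaInDataFile(getFirstCarbonFile(folder, ".carbondata")), and the same helper would serve index files with a different extension.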


---


[GitHub] carbondata pull request #2804: [CARBONDATA-2996] CarbonSchemaReader support ...

2018-10-29 Thread ajantha-bhat
Github user ajantha-bhat commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2804#discussion_r229173821
  
--- Diff: 
store/sdk/src/test/java/org/apache/carbondata/sdk/file/CarbonSchemaReaderTest.java
 ---
@@ -0,0 +1,170 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.sdk.file;
+
+import java.io.File;
+import java.util.HashMap;
+import java.util.Map;
+
+import junit.framework.TestCase;
+import org.apache.carbondata.core.metadata.datatype.DataTypes;
+import org.apache.commons.io.FileUtils;
+import org.junit.Test;
+
+public class CarbonSchemaReaderTest extends TestCase {
+
+  @Test
+  public void testReadSchemaFromDataFile() {
+String path = "./testWriteFiles";
+try {
+  FileUtils.deleteDirectory(new File(path));
+
+  Field[] fields = new Field[11];
+  fields[0] = new Field("stringField", DataTypes.STRING);
--- End diff --

The write step can be moved into the setup() step.


---


[GitHub] carbondata pull request #2804: [CARBONDATA-2996] CarbonSchemaReader support ...

2018-10-26 Thread xubo245
Github user xubo245 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2804#discussion_r228506535
  
--- Diff: 
store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonSchemaReader.java 
---
@@ -59,11 +60,30 @@ public static Schema readSchemaInSchemaFile(String 
schemaFilePath) throws IOExce
   /**
* Read carbondata file and return the schema
*
-   * @param dataFilePath complete path including carbondata file name
+   * @param path complete path including carbondata file name
* @return Schema object
* @throws IOException
*/
-  public static Schema readSchemaInDataFile(String dataFilePath) throws 
IOException {
+  public static Schema readSchemaInDataFile(String path) throws 
IOException {
+String dataFilePath = path;
+if (!(dataFilePath.contains(".carbondata"))) {
+  CarbonFile[] carbonFiles = FileFactory
+  .getCarbonFile(path)
+  .listFiles(new CarbonFileFilter() {
+@Override
+public boolean accept(CarbonFile file) {
+  if (file == null) {
+return false;
+  }
+  return file.getName().endsWith(".carbondata");
+}
+  });
+  if (carbonFiles == null || carbonFiles.length < 1) {
+throw new RuntimeException("Carbon data file not exists.");
+  }
+  dataFilePath = carbonFiles[0].getAbsolutePath();
--- End diff --

Yes, it takes only the first data file. 
It is more convenient for the user to give a path to read the schema. The 
folder may also have sub-folders, so the user would need to list files recursively. Some customers have 
this problem.
We can compare the schemas of the different files if necessary. The SDK can throw an 
exception if multiple files have different schemas.



---


[GitHub] carbondata pull request #2804: [CARBONDATA-2996] CarbonSchemaReader support ...

2018-10-26 Thread ajantha-bhat
Github user ajantha-bhat commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2804#discussion_r228413841
  
--- Diff: 
store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonSchemaReader.java 
---
@@ -59,11 +60,30 @@ public static Schema readSchemaInSchemaFile(String 
schemaFilePath) throws IOExce
   /**
* Read carbondata file and return the schema
*
-   * @param dataFilePath complete path including carbondata file name
+   * @param path complete path including carbondata file name
* @return Schema object
* @throws IOException
*/
-  public static Schema readSchemaInDataFile(String dataFilePath) throws 
IOException {
+  public static Schema readSchemaInDataFile(String path) throws 
IOException {
+String dataFilePath = path;
+if (!(dataFilePath.contains(".carbondata"))) {
+  CarbonFile[] carbonFiles = FileFactory
+  .getCarbonFile(path)
+  .listFiles(new CarbonFileFilter() {
+@Override
+public boolean accept(CarbonFile file) {
+  if (file == null) {
+return false;
+  }
+  return file.getName().endsWith(".carbondata");
+}
+  });
+  if (carbonFiles == null || carbonFiles.length < 1) {
+throw new RuntimeException("Carbon data file not exists.");
+  }
+  dataFilePath = carbonFiles[0].getAbsolutePath();
--- End diff --

Taking only one data file (the first file)?

What if this folder has multiple files with different schemas? What if the user 
also wants schema info from a file?

Supporting schema read from a folder is not required, as this API is exposed to 
the user, who already has the list of files. 
a) To read one file, the user passes a single file to this API. -- already 
supported
b) To read multiple files, the user can list the files, pick the ones whose 
schema is needed, and call our API for each -- already supported.

Just reading the first file from a folder doesn't make sense. This PR is not 
required, as the existing API already supports all user scenarios.


---


[GitHub] carbondata pull request #2804: [CARBONDATA-2996] CarbonSchemaReader support ...

2018-10-09 Thread xubo245
GitHub user xubo245 opened a pull request:

https://github.com/apache/carbondata/pull/2804

[CARBONDATA-2996] CarbonSchemaReader support read schema from folder path

[CARBONDATA-2996] CarbonSchemaReader support read schema from folder path
1. readSchemaInDataFile supports reading schema from folder path
2. readSchemaInIndexFile supports reading schema from folder path

Be sure to do all of the following checklist to help us incorporate 
your contribution quickly and easily:

 - [ ] Any interfaces changed?
 No
 - [ ] Any backward compatibility impacted?
 No
 - [ ] Document update required?
No
 - [ ] Testing done
add test case
 - [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 
https://issues.apache.org/jira/browse/CARBONDATA-2951


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/xubo245/carbondata 
CARBONDATA-2996_SchemaSupportFolder

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/2804.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2804


commit 4246f5c720e77e31b898119d1499e412af06d810
Author: xubo245 
Date:   2018-10-09T07:16:09Z

[CARBONDATA-2996] CarbonSchemaReader support read schema from folder path
1. readSchemaInDataFile supports reading schema from folder path
2. readSchemaInIndexFile supports reading schema from folder path

commit b486fec8eaea1954c2a35590e5738af873ab4eaa
Author: xubo245 
Date:   2018-10-09T07:24:53Z

support S3




---