Github user xubo245 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2804#discussion_r228506535
--- Diff:
store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonSchemaReader.java
---
@@ -59,11 +60,30 @@ public static Schema readSchemaInSchemaFile(String
schemaFilePath) throws IOExce
/**
* Read carbondata file and return the schema
*
- * @param dataFilePath complete path including carbondata file name
+ * @param path complete path including carbondata file name
* @return Schema object
* @throws IOException
*/
- public static Schema readSchemaInDataFile(String dataFilePath) throws
IOException {
+ public static Schema readSchemaInDataFile(String path) throws
IOException {
+ String dataFilePath = path;
+ if (!(dataFilePath.contains(".carbondata"))) {
+ CarbonFile[] carbonFiles = FileFactory
+ .getCarbonFile(path)
+ .listFiles(new CarbonFileFilter() {
+ @Override
+ public boolean accept(CarbonFile file) {
+ if (file == null) {
+ return false;
+ }
+ return file.getName().endsWith(".carbondata");
+ }
+ });
+ if (carbonFiles == null || carbonFiles.length < 1) {
+ throw new RuntimeException("Carbon data file not exists.");
+ }
+ dataFilePath = carbonFiles[0].getAbsolutePath();
--- End diff --
Yes, it takes only one data file.
It is more convenient for the user to pass a folder path to read the schema. The
folder may also contain sub-folders, in which case the user would otherwise need
to list the files recursively. Some customers have run into this problem.
If necessary, we can compare the schemas of the different files, and the SDK can
throw an exception if multiple files have different schemas.
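
To make the idea concrete, here is a rough sketch of the recursive listing and the schema consistency check, written against plain java.nio.file rather than the CarbonFile API. The class name CarbonDataFileFinder, its method names, and the readSchema callback are hypothetical and not part of the SDK. This is only an illustration under those assumptions, not the change proposed in this PR.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of the CarbonData SDK.
public class CarbonDataFileFinder {

  /** Recursively collects every *.carbondata file under root, sorted by path. */
  public static List<Path> findDataFiles(Path root) throws IOException {
    try (Stream<Path> paths = Files.walk(root)) {
      return paths
          .filter(Files::isRegularFile)
          .filter(p -> p.getFileName().toString().endsWith(".carbondata"))
          .sorted()
          .collect(Collectors.toList());
    }
  }

  /** Returns the first data file found, or fails if there is none. */
  public static Path pickDataFile(Path root) throws IOException {
    List<Path> files = findDataFiles(root);
    if (files.isEmpty()) {
      throw new IOException("No .carbondata file found under " + root);
    }
    return files.get(0);
  }

  /**
   * Reads the schema of every file through the supplied callback and fails
   * if any two differ. Returns the common schema, or null for an empty list.
   */
  public static <S> S requireSameSchema(List<Path> files, Function<Path, S> readSchema) {
    S first = null;
    for (Path file : files) {
      S schema = readSchema.apply(file);
      if (first == null) {
        first = schema;
      } else if (!first.equals(schema)) {
        throw new IllegalStateException("Schema of " + file + " differs from the first file");
      }
    }
    return first;
  }
}
```

In the SDK itself, the callback would presumably wrap the existing schema-reading logic, and the exception type would follow whatever the reader already throws.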
---