Github user ajantha-bhat commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2804#discussion_r228413841

    --- Diff: store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonSchemaReader.java ---
    @@ -59,11 +60,30 @@ public static Schema readSchemaInSchemaFile(String schemaFilePath) throws IOExce
       /**
        * Read carbondata file and return the schema
        *
    -   * @param dataFilePath complete path including carbondata file name
    +   * @param path complete path including carbondata file name
        * @return Schema object
        * @throws IOException
        */
    -  public static Schema readSchemaInDataFile(String dataFilePath) throws IOException {
    +  public static Schema readSchemaInDataFile(String path) throws IOException {
    +    String dataFilePath = path;
    +    if (!(dataFilePath.contains(".carbondata"))) {
    +      CarbonFile[] carbonFiles = FileFactory
    +          .getCarbonFile(path)
    +          .listFiles(new CarbonFileFilter() {
    +            @Override
    +            public boolean accept(CarbonFile file) {
    +              if (file == null) {
    +                return false;
    +              }
    +              return file.getName().endsWith(".carbondata");
    +            }
    +          });
    +      if (carbonFiles == null || carbonFiles.length < 1) {
    +        throw new RuntimeException("Carbon data file not exists.");
    +      }
    +      dataFilePath = carbonFiles[0].getAbsolutePath();
    --- End diff --

    This takes only one data file (the first one)? What if the folder has multiple files with different schemas? What if the user wants the schema info from a file as well?

    Supporting schema read from a folder is not required, as this API is exposed to the user, who already has the list of files.

    a) To read one file, the user passes a single file to this API. -- already supported
    b) To read multiple files, the user can list the files, select the ones whose schema is needed, and call our API for each file in the list. -- already supported

    Just reading the first file from the folder doesn't make sense. This PR is not required, as the existing API already supports all user scenarios.
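
Option (b) could look roughly like the sketch below: the caller lists the `.carbondata` files in a folder and invokes a single-file schema reader once per file, so no file is silently skipped. This is only an illustration under stated assumptions. The class `SchemaPerFile`, the `SchemaReader` interface, and both helper methods are hypothetical names. It also uses plain `java.nio` on the local filesystem, not CarbonData's `FileFactory`.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of the CarbonData SDK.
class SchemaPerFile {

  // Hypothetical stand-in for a single-file schema reader. Its shape mirrors
  // the readSchemaInDataFile(String) signature shown in the diff.
  @FunctionalInterface
  interface SchemaReader<S> {
    S read(String dataFilePath) throws IOException;
  }

  // List every regular file in the folder whose name ends with ".carbondata",
  // sorted by path so the result order is deterministic.
  static List<Path> listCarbonDataFiles(Path folder) throws IOException {
    try (Stream<Path> entries = Files.list(folder)) {
      return entries
          .filter(Files::isRegularFile)
          .filter(p -> p.getFileName().toString().endsWith(".carbondata"))
          .sorted()
          .collect(Collectors.toList());
    }
  }

  // Call the single-file reader once per file and keep one schema per path,
  // instead of picking only the first file in the folder.
  static <S> Map<String, S> readSchemas(List<Path> files, SchemaReader<S> reader)
      throws IOException {
    Map<String, S> schemas = new LinkedHashMap<>();
    for (Path file : files) {
      String path = file.toAbsolutePath().toString();
      schemas.put(path, reader.read(path));
    }
    return schemas;
  }
}
```

With the SDK on the classpath, a call such as `SchemaPerFile.readSchemas(files, CarbonSchemaReader::readSchemaInDataFile)` should fit, given the signature in the diff above. The caller can then compare the returned schemas and decide what to do when files in the same folder disagree.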
---