[GitHub] carbondata pull request #2804: [CARBONDATA-2996] CarbonSchemaReader support ...

ajantha-bhat Thu, 25 Oct 2018 23:06:01 -0700

Github user ajantha-bhat commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2804#discussion_r228413841
  
    --- Diff: store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonSchemaReader.java ---
    @@ -59,11 +60,30 @@ public static Schema readSchemaInSchemaFile(String schemaFilePath) throws IOExce
       /**
        * Read carbondata file and return the schema
        *
    -   * @param dataFilePath complete path including carbondata file name
    +   * @param path complete path including carbondata file name
        * @return Schema object
        * @throws IOException
        */
    -  public static Schema readSchemaInDataFile(String dataFilePath) throws IOException {
    +  public static Schema readSchemaInDataFile(String path) throws IOException {
    +    String dataFilePath = path;
    +    if (!(dataFilePath.contains(".carbondata"))) {
    +      CarbonFile[] carbonFiles = FileFactory
    +          .getCarbonFile(path)
    +          .listFiles(new CarbonFileFilter() {
    +            @Override
    +            public boolean accept(CarbonFile file) {
    +              if (file == null) {
    +                return false;
    +              }
    +              return file.getName().endsWith(".carbondata");
    +            }
    +          });
    +      if (carbonFiles == null || carbonFiles.length < 1) {
    +        throw new RuntimeException("Carbon data file not exists.");
    +      }
    +      dataFilePath = carbonFiles[0].getAbsolutePath();
    --- End diff --
    
    Is this taking only one data file (the first file)?
    
    What if this folder has multiple files with different schemas? What if the user also wanted the schema info from a file?
    
    Supporting schema reads from a folder is not required, as this API is exposed to the user, who already has the list of files.
    a) To read one file, the user passes a single file to this API -- already supported.
    b) To read multiple files, the user can list the files, pick the ones whose schema is needed, and call our API for each file in that list -- already supported.
    
    Just reading the first file from the folder doesn't make sense. This PR is not required, as the existing API already supports all user scenarios.
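
Option (b) above could look roughly like the following on the caller's side. This is only a sketch under stated assumptions: `PerFileSchemaSketch`, `SchemaReader`, `readSchemas` and `allSame` are hypothetical names that are not part of CarbonData, the listing uses `java.nio` and therefore covers local folders only, and `allSame` is meaningful only if the schema type implements `equals`.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashSet;
import java.util.Map;
import java.util.TreeMap;

final class PerFileSchemaSketch {

  /**
   * Hypothetical stand-in for a per-file schema reader, for example a method
   * reference to CarbonSchemaReader.readSchemaInDataFile(String).
   */
  interface SchemaReader<S> {
    S read(String dataFilePath) throws IOException;
  }

  /**
   * Lists every *.carbondata file directly under the folder and reads each
   * schema with one reader call per file, so no file is silently skipped.
   * The result maps absolute file path to schema, sorted by path.
   */
  static <S> Map<String, S> readSchemas(Path folder, SchemaReader<S> reader)
      throws IOException {
    Map<String, S> schemas = new TreeMap<>();
    try (DirectoryStream<Path> files =
        Files.newDirectoryStream(folder, "*.carbondata")) {
      for (Path file : files) {
        String path = file.toAbsolutePath().toString();
        schemas.put(path, reader.read(path));
      }
    }
    return schemas;
  }

  /**
   * Returns true when all schemas compare equal (or the map is empty).
   * Assumption: the schema type S has a meaningful equals/hashCode.
   */
  static <S> boolean allSame(Map<String, S> schemas) {
    return new HashSet<>(schemas.values()).size() <= 1;
  }
}
```

With the SDK on the classpath, a call might be `PerFileSchemaSketch.readSchemas(Paths.get(folder), CarbonSchemaReader::readSchemaInDataFile)`, where `folder` is the caller's own path. The caller then sees every file's schema and decides what to do when they differ, instead of getting only the first file's schema.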

---
