[GitHub] [carbondata] Indhumathi27 commented on pull request #3809: [CARBONDATA-3881] Fix concurrent main table compaction and SI load issue
Indhumathi27 commented on pull request #3809: URL: https://github.com/apache/carbondata/pull/3809#issuecomment-652824236 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3817: [CARBONDATA-3845] Bucket table creation fails with exception for empt…
Indhumathi27 commented on a change in pull request #3817: URL: https://github.com/apache/carbondata/pull/3817#discussion_r448797371 ## File path: integration/spark/src/main/scala/org/apache/carbondata/spark/CarbonOption.scala ## @@ -17,6 +17,8 @@ package org.apache.carbondata.spark +import scala.util.Try + Review comment: Please remove the extra line
[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3817: [CARBONDATA-3845] Bucket table creation fails with exception for empt…
Indhumathi27 commented on a change in pull request #3817: URL: https://github.com/apache/carbondata/pull/3817#discussion_r448797700 ## File path: integration/spark/src/main/scala/org/apache/spark/sql/parser/CarbonSpark2SqlParser.scala ## @@ -766,13 +766,13 @@ class CarbonSpark2SqlParser extends CarbonDDLSqlParser { throw new MalformedCarbonCommandException("Invalid table properties") } if (options.isBucketingEnabled) { - if (options.bucketNumber.toString.contains("-") || - options.bucketNumber.toString.contains("+") || options.bucketNumber == 0) { + if (options.bucketNumber == None || options.bucketNumber.get.toString.contains("-") || Review comment: ```suggestion if (options.bucketNumber.isEmpty || options.bucketNumber.get.toString.contains("-") || ```
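The check under review boils down to: a bucket number must be present and parse as a strictly positive integer, with explicit "+"/"-" signs rejected. A stand-alone sketch of that validation rule (in Java, with hypothetical names; the real check lives in the Scala DDL parser above):

```java
// Validates a bucket-number string the way the parser check above intends:
// reject absent/empty values, explicit "+"/"-" signs, zero, and non-numbers.
public class BucketNumberCheck {
  public static boolean isValidBucketNumber(String value) {
    if (value == null || value.isEmpty()) {
      return false;                        // absent option: bucketing is misconfigured
    }
    if (value.contains("+") || value.contains("-")) {
      return false;                        // explicit sign is rejected by the DDL check
    }
    try {
      return Integer.parseInt(value) > 0;  // zero or a non-number is invalid
    } catch (NumberFormatException e) {
      return false;
    }
  }
}
```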
[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3819: [CARBONDATA-3855]support carbon SDK to load data from different files
Indhumathi27 commented on a change in pull request #3819: URL: https://github.com/apache/carbondata/pull/3819#discussion_r448811521 ## File path: sdk/sdk/src/main/java/org/apache/carbondata/sdk/file/ParquetCarbonWriter.java ## @@ -0,0 +1,116 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.carbondata.sdk.file; + +import java.io.File; +import java.io.IOException; +import java.util.Arrays; +import java.util.List; + +import org.apache.avro.generic.GenericRecord; +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.fs.Path; +import org.apache.parquet.avro.AvroReadSupport; +import org.apache.parquet.hadoop.ParquetReader; + +/** + * Implementation to write parquet rows in avro format to carbondata file. 
+ */ +public class ParquetCarbonWriter extends AvroCarbonWriter { + private AvroCarbonWriter avroCarbonWriter = null; + private String filePath = ""; + private boolean isDirectory = false; + private List fileList; + + ParquetCarbonWriter(AvroCarbonWriter avroCarbonWriter) { +this.avroCarbonWriter = avroCarbonWriter; + } + + @Override + public void setFilePath(String filePath) { +this.filePath = filePath; + } + + @Override + public void setIsDirectory(boolean isDirectory) { +this.isDirectory = isDirectory; + } + + @Override + public void setFileList(List fileList) { +this.fileList = fileList; + } + + /** + * Load data of all parquet files at given location iteratively. + * + * @throws IOException + */ + @Override + public void write() throws IOException { +if (this.filePath.length() == 0) { + throw new RuntimeException("'withParquetPath()' " + + "must be called to support load parquet files"); +} +if (this.avroCarbonWriter == null) { + throw new RuntimeException("avro carbon writer can not be null"); +} +if (this.isDirectory) { + if (this.fileList == null || this.fileList.size() == 0) { +File[] dataFiles = new File(this.filePath).listFiles(); +if (dataFiles == null || dataFiles.length == 0) { + throw new RuntimeException("No Parquet file found at given location. 
Please provide " + + "the correct folder location."); +} +Arrays.sort(dataFiles); +for (File dataFile : dataFiles) { + this.loadSingleFile(dataFile); +} + } else { +for (String file : this.fileList) { + this.loadSingleFile(new File(this.filePath + "/" + file)); +} + } +} else { + this.loadSingleFile(new File(this.filePath)); +} + } + + private void loadSingleFile(File file) throws IOException { +AvroReadSupport avroReadSupport = new AvroReadSupport<>(); +ParquetReader parquetReader = ParquetReader.builder(avroReadSupport, +new Path(String.valueOf(file))).withConf(new Configuration()).build(); +GenericRecord genericRecord = null; +while ((genericRecord = parquetReader.read()) != null) { + System.out.println(genericRecord); Review comment: remove this line ## File path: sdk/sdk/src/main/java/org/apache/carbondata/sdk/file/ORCCarbonWriter.java ## @@ -0,0 +1,196 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.carbondata.sdk.file; + +import java.io.File; +import java.io.IOException; +import java.util.*; + +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.fs.Path; +import org.a
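The `write()` method quoted above lists a directory, fails fast when it is empty, sorts the files, and loads them one by one. A minimal stdlib-only sketch of that directory-iteration pattern (class and method names here are illustrative, not part of the SDK API):

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch of the directory-iteration pattern in write() above: list the
// files in a folder, fail fast when it is empty, sort for a deterministic
// load order, then hand each file to a per-file loader.
public class LoadOrder {
  public static List<String> loadOrder(File dir) throws IOException {
    File[] dataFiles = dir.listFiles();
    if (dataFiles == null || dataFiles.length == 0) {
      throw new IOException("No data file found at given location: " + dir);
    }
    Arrays.sort(dataFiles);               // deterministic order across platforms
    List<String> names = new ArrayList<>();
    for (File f : dataFiles) {
      names.add(f.getName());             // the real writer calls loadSingleFile(f) here
    }
    return names;
  }
}
```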
[GitHub] [carbondata] Indhumathi27 commented on pull request #3819: [CARBONDATA-3855]support carbon SDK to load data from different files
Indhumathi27 commented on pull request #3819: URL: https://github.com/apache/carbondata/pull/3819#issuecomment-652849540 @nihal0107 Please remove unused binary files from this PR
[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3819: [CARBONDATA-3855]support carbon SDK to load data from different files
Indhumathi27 commented on a change in pull request #3819: URL: https://github.com/apache/carbondata/pull/3819#discussion_r448835565 ## File path: sdk/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonWriterBuilder.java ## @@ -594,6 +613,332 @@ public CarbonWriterBuilder withJsonInput(Schema carbonSchema) { return this; } + /** + * to build a {@link CarbonWriter}, which accepts loading CSV files. + * + * @param filePath absolute path under which files should be loaded. + * @return CarbonWriterBuilder + */ + public CarbonWriterBuilder withCsvPath(String filePath) { +if (filePath.length() == 0) { + throw new IllegalArgumentException("filePath can not be empty"); +} +this.filePath = filePath; +this.isDirectory = new File(filePath).isDirectory(); +this.withCsvInput(); +return this; + } + + /** + * to build a {@link CarbonWriter}, which accepts CSV files directory and + * list of file which has to be loaded. + * + * @param filePath directory where the CSV file exists. + * @param fileList list of files which has to be loaded. + * @return CarbonWriterBuilder + */ + public CarbonWriterBuilder withCsvPath(String filePath, List fileList) { +this.fileList = fileList; +this.withCsvPath(filePath); +return this; + } + + /** + * to build a {@link CarbonWriter}, which accepts loading Parquet files. + * + * @param filePath absolute path under which files should be loaded. + * @return CarbonWriterBuilder + */ + public CarbonWriterBuilder withParquetPath(String filePath) throws IOException { +if (filePath.length() == 0) { + throw new IllegalArgumentException("filePath can not be empty"); +} +this.filePath = filePath; +this.isDirectory = new File(filePath).isDirectory(); +this.writerType = WRITER_TYPE.PARQUET; +this.buildParquetReader(); +return this; + } + + /** + * to build a {@link CarbonWriter}, which accepts parquet files directory and + * list of file which has to be loaded. + * + * @param filePath directory where the parquet file exists. 
+ * @param fileList list of files which has to be loaded. + * @return CarbonWriterBuilder + * @throws IOException + */ + public CarbonWriterBuilder withParquetPath(String filePath, List fileList) + throws IOException { +this.fileList = fileList; +this.withParquetPath(filePath); +return this; + } + + private void buildParquetReader() throws IOException { +AvroReadSupport avroReadSupport = new AvroReadSupport<>(); +ParquetReader parquetReader; +if (this.isDirectory) { + if (this.fileList == null || this.fileList.size() == 0) { +File[] dataFiles = new File(this.filePath).listFiles(); +if (dataFiles == null || dataFiles.length == 0) { + throw new RuntimeException("No Parquet file found at given location. Please provide" + + "the correct folder location."); +} +parquetReader = ParquetReader.builder(avroReadSupport, Review comment: Please check the points below: 1. If filePath contains files of other formats such as ORC, CSV, or Avro, building a Parquet reader with those files throws an error. Better to check the FileFormat as well. 2. When writer.write is called, you do listFiles and directly try to create the respective readers. This may fail if the user adds a non-Parquet file to filePath after building the CarbonWriterBuilder.
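The "check FileFormat also" suggestion above can be sketched with a plain extension check before a format-specific reader is built. This is an illustrative stand-in, not the actual CarbonWriterBuilder API:

```java
import java.io.File;
import java.util.Locale;

// Sketch of a file-format guard: before handing a file to a
// format-specific reader, verify that its extension matches the writer
// type chosen at build time. Method names are hypothetical.
public class FormatCheck {
  public static boolean matchesFormat(File file, String expectedExtension) {
    String name = file.getName().toLowerCase(Locale.ROOT);
    return name.endsWith("." + expectedExtension.toLowerCase(Locale.ROOT));
  }

  public static void validate(File file, String expectedExtension) {
    if (!matchesFormat(file, expectedExtension)) {
      throw new IllegalArgumentException(
          "File " + file.getName() + " is not a ." + expectedExtension + " file");
    }
  }
}
```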
[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3819: [CARBONDATA-3855]support carbon SDK to load data from different files
Indhumathi27 commented on a change in pull request #3819: URL: https://github.com/apache/carbondata/pull/3819#discussion_r448837383 ## File path: sdk/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonWriterBuilder.java ## @@ -594,6 +613,332 @@ public CarbonWriterBuilder withJsonInput(Schema carbonSchema) { return this; } + /** + * to build a {@link CarbonWriter}, which accepts loading CSV files. + * + * @param filePath absolute path under which files should be loaded. + * @return CarbonWriterBuilder + */ + public CarbonWriterBuilder withCsvPath(String filePath) { +if (filePath.length() == 0) { + throw new IllegalArgumentException("filePath can not be empty"); +} +this.filePath = filePath; +this.isDirectory = new File(filePath).isDirectory(); +this.withCsvInput(); +return this; + } + + /** + * to build a {@link CarbonWriter}, which accepts CSV files directory and + * list of file which has to be loaded. + * + * @param filePath directory where the CSV file exists. + * @param fileList list of files which has to be loaded. + * @return CarbonWriterBuilder + */ + public CarbonWriterBuilder withCsvPath(String filePath, List fileList) { +this.fileList = fileList; +this.withCsvPath(filePath); +return this; + } + + /** + * to build a {@link CarbonWriter}, which accepts loading Parquet files. + * + * @param filePath absolute path under which files should be loaded. + * @return CarbonWriterBuilder + */ + public CarbonWriterBuilder withParquetPath(String filePath) throws IOException { +if (filePath.length() == 0) { + throw new IllegalArgumentException("filePath can not be empty"); +} +this.filePath = filePath; +this.isDirectory = new File(filePath).isDirectory(); +this.writerType = WRITER_TYPE.PARQUET; +this.buildParquetReader(); +return this; + } + + /** + * to build a {@link CarbonWriter}, which accepts parquet files directory and + * list of file which has to be loaded. + * + * @param filePath directory where the parquet file exists. 
+ * @param fileList list of files which has to be loaded. + * @return CarbonWriterBuilder + * @throws IOException + */ + public CarbonWriterBuilder withParquetPath(String filePath, List fileList) + throws IOException { +this.fileList = fileList; +this.withParquetPath(filePath); +return this; + } + + private void buildParquetReader() throws IOException { +AvroReadSupport avroReadSupport = new AvroReadSupport<>(); +ParquetReader parquetReader; +if (this.isDirectory) { + if (this.fileList == null || this.fileList.size() == 0) { +File[] dataFiles = new File(this.filePath).listFiles(); +if (dataFiles == null || dataFiles.length == 0) { + throw new RuntimeException("No Parquet file found at given location. Please provide" + + "the correct folder location."); +} +parquetReader = ParquetReader.builder(avroReadSupport, +new Path(String.valueOf(dataFiles[0]))).build(); + } else { +parquetReader = ParquetReader.builder(avroReadSupport, Review comment: What if the files in a directory have different schemas? How is that handled?
[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3819: [CARBONDATA-3855]support carbon SDK to load data from different files
Indhumathi27 commented on a change in pull request #3819: URL: https://github.com/apache/carbondata/pull/3819#discussion_r448840587 ## File path: sdk/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonWriterBuilder.java ## @@ -594,6 +613,332 @@ public CarbonWriterBuilder withJsonInput(Schema carbonSchema) { return this; } + /** + * to build a {@link CarbonWriter}, which accepts loading CSV files. + * + * @param filePath absolute path under which files should be loaded. + * @return CarbonWriterBuilder + */ + public CarbonWriterBuilder withCsvPath(String filePath) { +if (filePath.length() == 0) { + throw new IllegalArgumentException("filePath can not be empty"); +} +this.filePath = filePath; +this.isDirectory = new File(filePath).isDirectory(); +this.withCsvInput(); +return this; + } + + /** + * to build a {@link CarbonWriter}, which accepts CSV files directory and + * list of file which has to be loaded. + * + * @param filePath directory where the CSV file exists. + * @param fileList list of files which has to be loaded. + * @return CarbonWriterBuilder + */ + public CarbonWriterBuilder withCsvPath(String filePath, List fileList) { +this.fileList = fileList; +this.withCsvPath(filePath); +return this; + } + + /** + * to build a {@link CarbonWriter}, which accepts loading Parquet files. + * + * @param filePath absolute path under which files should be loaded. + * @return CarbonWriterBuilder + */ + public CarbonWriterBuilder withParquetPath(String filePath) throws IOException { +if (filePath.length() == 0) { + throw new IllegalArgumentException("filePath can not be empty"); +} +this.filePath = filePath; +this.isDirectory = new File(filePath).isDirectory(); +this.writerType = WRITER_TYPE.PARQUET; +this.buildParquetReader(); +return this; + } + + /** + * to build a {@link CarbonWriter}, which accepts parquet files directory and + * list of file which has to be loaded. + * + * @param filePath directory where the parquet file exists. 
+ * @param fileList list of files which has to be loaded. + * @return CarbonWriterBuilder + * @throws IOException + */ + public CarbonWriterBuilder withParquetPath(String filePath, List fileList) + throws IOException { +this.fileList = fileList; +this.withParquetPath(filePath); +return this; + } + + private void buildParquetReader() throws IOException { +AvroReadSupport avroReadSupport = new AvroReadSupport<>(); +ParquetReader parquetReader; +if (this.isDirectory) { + if (this.fileList == null || this.fileList.size() == 0) { +File[] dataFiles = new File(this.filePath).listFiles(); +if (dataFiles == null || dataFiles.length == 0) { + throw new RuntimeException("No Parquet file found at given location. Please provide" + + "the correct folder location."); +} +parquetReader = ParquetReader.builder(avroReadSupport, +new Path(String.valueOf(dataFiles[0]))).build(); + } else { +parquetReader = ParquetReader.builder(avroReadSupport, +new Path(this.filePath + "/" + this.fileList.get(0))).build(); Review comment: What if a file in fileList does not exist under filePath? Better to catch this and throw a file-does-not-exist exception.
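The suggestion above — fail with a clear file-not-found error instead of a reader-level failure — can be sketched as an up-front existence check over the user-supplied file list. Names are illustrative, not the SDK's actual API:

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.List;

// Sketch of an up-front existence check: when the user passes an explicit
// fileList, verify each entry exists under filePath before any reader is
// built, so a missing file surfaces as a FileNotFoundException.
public class FileListCheck {
  public static void checkFilesExist(String filePath, List<String> fileList)
      throws FileNotFoundException {
    for (String name : fileList) {
      File f = new File(filePath, name);
      if (!f.exists()) {
        throw new FileNotFoundException("File does not exist: " + f.getAbsolutePath());
      }
    }
  }
}
```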
[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3819: [CARBONDATA-3855]support carbon SDK to load data from different files
Indhumathi27 commented on a change in pull request #3819: URL: https://github.com/apache/carbondata/pull/3819#discussion_r448842092 ## File path: sdk/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonWriterBuilder.java ## @@ -594,6 +613,332 @@ public CarbonWriterBuilder withJsonInput(Schema carbonSchema) { return this; } + /** + * to build a {@link CarbonWriter}, which accepts loading CSV files. + * + * @param filePath absolute path under which files should be loaded. + * @return CarbonWriterBuilder + */ + public CarbonWriterBuilder withCsvPath(String filePath) { +if (filePath.length() == 0) { + throw new IllegalArgumentException("filePath can not be empty"); +} +this.filePath = filePath; +this.isDirectory = new File(filePath).isDirectory(); +this.withCsvInput(); +return this; + } + + /** + * to build a {@link CarbonWriter}, which accepts CSV files directory and + * list of file which has to be loaded. + * + * @param filePath directory where the CSV file exists. + * @param fileList list of files which has to be loaded. + * @return CarbonWriterBuilder + */ + public CarbonWriterBuilder withCsvPath(String filePath, List fileList) { +this.fileList = fileList; +this.withCsvPath(filePath); +return this; + } + + /** + * to build a {@link CarbonWriter}, which accepts loading Parquet files. + * + * @param filePath absolute path under which files should be loaded. + * @return CarbonWriterBuilder + */ + public CarbonWriterBuilder withParquetPath(String filePath) throws IOException { +if (filePath.length() == 0) { + throw new IllegalArgumentException("filePath can not be empty"); +} +this.filePath = filePath; +this.isDirectory = new File(filePath).isDirectory(); +this.writerType = WRITER_TYPE.PARQUET; +this.buildParquetReader(); +return this; + } + + /** + * to build a {@link CarbonWriter}, which accepts parquet files directory and + * list of file which has to be loaded. + * + * @param filePath directory where the parquet file exists. 
+ * @param fileList list of files which has to be loaded. + * @return CarbonWriterBuilder + * @throws IOException + */ + public CarbonWriterBuilder withParquetPath(String filePath, List fileList) + throws IOException { +this.fileList = fileList; +this.withParquetPath(filePath); +return this; + } + + private void buildParquetReader() throws IOException { +AvroReadSupport avroReadSupport = new AvroReadSupport<>(); +ParquetReader parquetReader; +if (this.isDirectory) { + if (this.fileList == null || this.fileList.size() == 0) { +File[] dataFiles = new File(this.filePath).listFiles(); +if (dataFiles == null || dataFiles.length == 0) { + throw new RuntimeException("No Parquet file found at given location. Please provide" + + "the correct folder location."); +} +parquetReader = ParquetReader.builder(avroReadSupport, +new Path(String.valueOf(dataFiles[0]))).build(); + } else { +parquetReader = ParquetReader.builder(avroReadSupport, +new Path(this.filePath + "/" + this.fileList.get(0))).build(); + } +} else { + parquetReader = ParquetReader.builder(avroReadSupport, + new Path(this.filePath)).build(); Review comment: Same as the previous comment: handle the case where filePath does not exist.
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3809: [CARBONDATA-3881] Fix concurrent main table compaction and SI load issue
CarbonDataQA1 commented on pull request #3809: URL: https://github.com/apache/carbondata/pull/3809#issuecomment-652896458 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3290/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3809: [CARBONDATA-3881] Fix concurrent main table compaction and SI load issue
CarbonDataQA1 commented on pull request #3809: URL: https://github.com/apache/carbondata/pull/3809#issuecomment-652896859 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1553/
[GitHub] [carbondata] Indhumathi27 commented on pull request #3809: [CARBONDATA-3881] Fix concurrent main table compaction and SI load issue
Indhumathi27 commented on pull request #3809: URL: https://github.com/apache/carbondata/pull/3809#issuecomment-652998410 LGTM
[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3817: [CARBONDATA-3845] Bucket table creation fails with exception for empt…
Indhumathi27 commented on a change in pull request #3817: URL: https://github.com/apache/carbondata/pull/3817#discussion_r448997627 ## File path: integration/spark/src/main/scala/org/apache/spark/sql/parser/CarbonSpark2SqlParser.scala ## @@ -766,13 +766,13 @@ class CarbonSpark2SqlParser extends CarbonDDLSqlParser { throw new MalformedCarbonCommandException("Invalid table properties") } if (options.isBucketingEnabled) { - if (options.bucketNumber.toString.contains("-") || - options.bucketNumber.toString.contains("+") || options.bucketNumber == 0) { + if (options.bucketNumber.isEmpty || options.bucketNumber.get.toString.contains("-") || Review comment: Since you have wrapped the bucket options with Try, I guess that when the bucket number contains "+" or "-", it will be empty. Please check and remove those redundant checks.
[GitHub] [carbondata] QiangCai opened a new pull request #3820: [WIP] Improve filter performance on decimal column
QiangCai opened a new pull request #3820: URL: https://github.com/apache/carbondata/pull/3820 ### Why is this PR needed? ### What changes were proposed in this PR? ### Does this PR introduce any user interface change? - No - Yes. (please explain the change and update document) ### Is any new testcase added? - No - Yes
[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3817: [CARBONDATA-3845] Bucket table creation fails with exception for empt…
ShreelekhyaG commented on a change in pull request #3817: URL: https://github.com/apache/carbondata/pull/3817#discussion_r449060268 ## File path: integration/spark/src/main/scala/org/apache/spark/sql/parser/CarbonSpark2SqlParser.scala ## @@ -766,13 +766,13 @@ throw new MalformedCarbonCommandException("Invalid table properties") } if (options.isBucketingEnabled) { - if (options.bucketNumber.toString.contains("-") || - options.bucketNumber.toString.contains("+") || options.bucketNumber == 0) { + if (options.bucketNumber == None || options.bucketNumber.get.toString.contains("-") || + options.bucketNumber.get.toString.contains("+") || options.bucketNumber.get == 0) { Review comment: Done
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3817: [CARBONDATA-3845] Bucket table creation fails with exception for empt…
CarbonDataQA1 commented on pull request #3817: URL: https://github.com/apache/carbondata/pull/3817#issuecomment-653058106 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3291/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3817: [CARBONDATA-3845] Bucket table creation fails with exception for empt…
CarbonDataQA1 commented on pull request #3817: URL: https://github.com/apache/carbondata/pull/3817#issuecomment-653059304 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1554/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3773: [CARBONDATA-3830]Presto complex columns read support
CarbonDataQA1 commented on pull request #3773: URL: https://github.com/apache/carbondata/pull/3773#issuecomment-653065406 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1555/
[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3817: [CARBONDATA-3845] Bucket table creation fails with exception for empt…
ShreelekhyaG commented on a change in pull request #3817: URL: https://github.com/apache/carbondata/pull/3817#discussion_r449076519 ## File path: integration/spark/src/main/scala/org/apache/spark/sql/parser/CarbonSpark2SqlParser.scala ## @@ -766,13 +766,13 @@ throw new MalformedCarbonCommandException("Invalid table properties") } if (options.isBucketingEnabled) { - if (options.bucketNumber.toString.contains("-") || - options.bucketNumber.toString.contains("+") || options.bucketNumber == 0) { + if (options.bucketNumber.isEmpty || options.bucketNumber.get.toString.contains("-") || Review comment: The check for "-" is needed to reject negative values as input. The check for "+" is not required, so it has been removed.
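The validation being debated above can be sketched in isolation (the class and method names below are illustrative, not CarbonData's actual API): parse the bucket count defensively, map a failed parse to an empty Optional, then reject empty, zero, or negative values. Note that a plain integer parse succeeds for "-3" (a valid negative integer), which is why an explicit negative-value check is still needed even after wrapping the parse, while "+5" parses to a positive value and needs no separate check.

```java
import java.util.Optional;

public class BucketOptionCheck {
    // Illustrative stand-in for the Try-wrapped bucketNumber option:
    // a parse failure (empty string, non-numeric input) becomes an empty Optional.
    static Optional<Integer> parseBucketNumber(String raw) {
        try {
            return Optional.of(Integer.parseInt(raw));
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }

    // Mirrors the intent of the check in the diff:
    // empty, zero, or negative bucket counts are all invalid.
    static boolean isValidBucketNumber(Optional<Integer> bucketNumber) {
        return bucketNumber.isPresent() && bucketNumber.get() > 0;
    }

    public static void main(String[] args) {
        System.out.println(isValidBucketNumber(parseBucketNumber("4")));   // true
        System.out.println(isValidBucketNumber(parseBucketNumber("")));    // false: parse fails
        System.out.println(isValidBucketNumber(parseBucketNumber("-3")));  // false: negative
        System.out.println(isValidBucketNumber(parseBucketNumber("0")));   // false: zero
        System.out.println(isValidBucketNumber(parseBucketNumber("+5")));  // true: '+' parses fine
    }
}
```

The "+5" case shows why the "+" check could be dropped: the parse already normalizes it to a positive number.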
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3820: [WIP] Improve filter performance on decimal column
CarbonDataQA1 commented on pull request #3820: URL: https://github.com/apache/carbondata/pull/3820#issuecomment-653084576 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3293/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3773: [CARBONDATA-3830]Presto complex columns read support
CarbonDataQA1 commented on pull request #3773: URL: https://github.com/apache/carbondata/pull/3773#issuecomment-653085499 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3292/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3820: [WIP] Improve filter performance on decimal column
CarbonDataQA1 commented on pull request #3820: URL: https://github.com/apache/carbondata/pull/3820#issuecomment-653088148 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1556/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3817: [CARBONDATA-3845] Bucket table creation fails with exception for empt…
CarbonDataQA1 commented on pull request #3817: URL: https://github.com/apache/carbondata/pull/3817#issuecomment-653131655 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3294/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3817: [CARBONDATA-3845] Bucket table creation fails with exception for empt…
CarbonDataQA1 commented on pull request #3817: URL: https://github.com/apache/carbondata/pull/3817#issuecomment-653131882 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1557/
[GitHub] [carbondata] akkio-97 commented on pull request #3773: [CARBONDATA-3830]Presto complex columns read support
akkio-97 commented on pull request #3773: URL: https://github.com/apache/carbondata/pull/3773#issuecomment-653212349 retest this please
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3773: [CARBONDATA-3830]Presto complex columns read support
CarbonDataQA1 commented on pull request #3773: URL: https://github.com/apache/carbondata/pull/3773#issuecomment-653254442 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3295/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3773: [CARBONDATA-3830]Presto complex columns read support
CarbonDataQA1 commented on pull request #3773: URL: https://github.com/apache/carbondata/pull/3773#issuecomment-653254762 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1558/
[GitHub] [carbondata] ajantha-bhat opened a new pull request #3821: [WIP] Use qualified table name for global sort compaction
ajantha-bhat opened a new pull request #3821: URL: https://github.com/apache/carbondata/pull/3821 ### Why is this PR needed? ### What changes were proposed in this PR? ### Does this PR introduce any user interface change? - No - Yes. (please explain the change and update document) ### Is any new testcase added? - No - Yes
[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3817: [CARBONDATA-3845] Bucket table creation fails with exception for empt…
Indhumathi27 commented on a change in pull request #3817: URL: https://github.com/apache/carbondata/pull/3817#discussion_r449371217 ## File path: integration/spark/src/main/scala/org/apache/spark/sql/parser/CarbonSpark2SqlParser.scala ## @@ -766,13 +766,13 @@ throw new MalformedCarbonCommandException("Invalid table properties") } if (options.isBucketingEnabled) { - if (options.bucketNumber.toString.contains("-") || - options.bucketNumber.toString.contains("+") || options.bucketNumber == 0) { + if (options.bucketNumber.isEmpty || options.bucketNumber.get.toString.contains("-") +|| options.bucketNumber.get == 0) { Review comment: Please format these two lines
[jira] [Created] (CARBONDATA-3886) Global sort compaction not using qualified name
Ajantha Bhat created CARBONDATA-3886: Summary: Global sort compaction not using qualified name Key: CARBONDATA-3886 URL: https://issues.apache.org/jira/browse/CARBONDATA-3886 Project: CarbonData Issue Type: Bug Reporter: Ajantha Bhat Assignee: Ajantha Bhat Problem: Global sort compaction is not using the database name while creating the dataframe. Sometimes it uses the default database, because Spark cannot determine which database the table belongs to. Solution: Use the qualified table name (dbname + table name) while creating the dataframe. -- This message was sent by Atlassian Jira (v8.3.4#803005)
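The proposed fix boils down to always handing Spark a database-qualified name. A minimal sketch of the idea (the helper name below is hypothetical; the actual PR inlines this concatenation rather than adding a utility):

```java
public class QualifiedName {
    // Hypothetical helper: prefix the database name so Spark's catalog lookup
    // never falls back to the "default" database for an unqualified table name.
    static String qualify(String databaseName, String tableName) {
        return databaseName + "." + tableName;
    }

    public static void main(String[] args) {
        // sparkSession.sqlContext.table(...) would receive "db1.sales", not bare "sales".
        System.out.println(qualify("db1", "sales")); // prints db1.sales
    }
}
```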
[GitHub] [carbondata] kumarvishal09 commented on a change in pull request #3776: [CARBONDATA-3834]Segment directory and the segment file in metadata are not created for partitioned table when 'carbon.
kumarvishal09 commented on a change in pull request #3776: URL: https://github.com/apache/carbondata/pull/3776#discussion_r449384678 ## File path: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonOutputCommitter.java ## @@ -302,6 +318,61 @@ private void commitJobForPartition(JobContext context, boolean overwriteSet, commitJobFinal(context, loadModel, operationContext, carbonTable, uniqueId); } + /** + * Method to create and write the segment file, removes the temporary directories from all the + * respective partition directories. This method is invoked only when {@link + * CarbonCommonConstants#CARBON_MERGE_INDEX_IN_SEGMENT} is disabled. + * @param context Job context + * @param loadModel Load model + * @param segmentFileName Segment file name to write + * @param partitionPath Serialized list of partition location + * @throws IOException + */ + @SuppressWarnings("unchecked") + private void writeSegmentWithoutMergeIndex(JobContext context, CarbonLoadModel loadModel, + String segmentFileName, String partitionPath) throws IOException { +Map indexFileNameMap = (Map) ObjectSerializationUtil + .convertStringToObject(context.getConfiguration().get("carbon.index.files.name")); +List partitionList = +(List) ObjectSerializationUtil.convertStringToObject(partitionPath); +SegmentFileStore.SegmentFile finalSegmentFile = null; +boolean isRelativePath; +String partitionLoc; +for (String partition : partitionList) { + isRelativePath = false; + partitionLoc = partition; + if (partitionLoc.startsWith(loadModel.getTablePath())) { +partitionLoc = partitionLoc.substring(loadModel.getTablePath().length()); +isRelativePath = true; + } + SegmentFileStore.SegmentFile segmentFile = new SegmentFileStore.SegmentFile(); + SegmentFileStore.FolderDetails folderDetails = new SegmentFileStore.FolderDetails(); + folderDetails.setFiles(Collections.singleton(indexFileNameMap.get(partition))); + folderDetails.setPartitions( + 
Collections.singletonList(partitionLoc.substring(partitionLoc.indexOf("/") + 1))); + folderDetails.setRelative(isRelativePath); + folderDetails.setStatus(SegmentStatus.SUCCESS.getMessage()); + segmentFile.getLocationMap().put(partitionLoc, folderDetails); + if (finalSegmentFile != null) { +finalSegmentFile = finalSegmentFile.merge(segmentFile); + } else { +finalSegmentFile = segmentFile; + } +} +Objects.requireNonNull(finalSegmentFile); +String segmentFilesLocation = Review comment: It's better to move this code inside SegmentFileStore itself: pass the table path and segment file name, and it will handle the folder creation internally. Please check, it may already be present. String segmentFilesLocation = CarbonTablePath.getSegmentFilesLocation(loadModel.getTablePath()); CarbonFile locationFile = FileFactory.getCarbonFile(segmentFilesLocation); if (!locationFile.exists()) { locationFile.mkdirs(); } SegmentFileStore.writeSegmentFile(finalSegmentFile, segmentFilesLocation + "/" + segmentFileName + CarbonTablePath.SEGMENT_EXT);
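The "create the directory if missing, then write the segment file into it" pattern from the snippet above can be sketched against the local filesystem. The directory layout and the ".segment" extension here are assumptions made for illustration; CarbonData itself goes through FileFactory/CarbonFile so the equivalent logic also runs on HDFS or S3.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class SegmentFileWriteSketch {
    // Local-filesystem analogue of the reviewed code: ensure the segment-files
    // directory exists under the table path, then write the segment file there.
    static Path writeSegmentFile(Path tablePath, String segmentFileName, byte[] content)
            throws IOException {
        // Assumed layout for the sketch: <tablePath>/Metadata/segments/<name>.segment
        Path segmentFilesLocation = tablePath.resolve("Metadata").resolve("segments");
        Files.createDirectories(segmentFilesLocation); // no-op if it already exists
        Path segmentFile = segmentFilesLocation.resolve(segmentFileName + ".segment");
        return Files.write(segmentFile, content);
    }
}
```

Pushing this into a single store-side method, as the reviewer suggests, keeps the exists/mkdirs/write sequence in one place instead of duplicating it at each call site.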
[GitHub] [carbondata] kumarvishal09 commented on a change in pull request #3776: [CARBONDATA-3834]Segment directory and the segment file in metadata are not created for partitioned table when 'carbon.
kumarvishal09 commented on a change in pull request #3776: URL: https://github.com/apache/carbondata/pull/3776#discussion_r449386951 ## File path: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonOutputCommitter.java ## @@ -302,6 +318,61 @@ private void commitJobForPartition(JobContext context, boolean overwriteSet, commitJobFinal(context, loadModel, operationContext, carbonTable, uniqueId); } + /** + * Method to create and write the segment file, removes the temporary directories from all the + * respective partition directories. This method is invoked only when {@link + * CarbonCommonConstants#CARBON_MERGE_INDEX_IN_SEGMENT} is disabled. + * @param context Job context + * @param loadModel Load model + * @param segmentFileName Segment file name to write + * @param partitionPath Serialized list of partition location + * @throws IOException + */ + @SuppressWarnings("unchecked") + private void writeSegmentWithoutMergeIndex(JobContext context, CarbonLoadModel loadModel, + String segmentFileName, String partitionPath) throws IOException { +Map indexFileNameMap = (Map) ObjectSerializationUtil + .convertStringToObject(context.getConfiguration().get("carbon.index.files.name")); +List partitionList = +(List) ObjectSerializationUtil.convertStringToObject(partitionPath); +SegmentFileStore.SegmentFile finalSegmentFile = null; +boolean isRelativePath; +String partitionLoc; +for (String partition : partitionList) { + isRelativePath = false; + partitionLoc = partition; + if (partitionLoc.startsWith(loadModel.getTablePath())) { +partitionLoc = partitionLoc.substring(loadModel.getTablePath().length()); +isRelativePath = true; + } + SegmentFileStore.SegmentFile segmentFile = new SegmentFileStore.SegmentFile(); + SegmentFileStore.FolderDetails folderDetails = new SegmentFileStore.FolderDetails(); + folderDetails.setFiles(Collections.singleton(indexFileNameMap.get(partition))); + folderDetails.setPartitions( + 
Collections.singletonList(partitionLoc.substring(partitionLoc.indexOf("/") + 1))); + folderDetails.setRelative(isRelativePath); + folderDetails.setStatus(SegmentStatus.SUCCESS.getMessage()); + segmentFile.getLocationMap().put(partitionLoc, folderDetails); + if (finalSegmentFile != null) { Review comment: @ajantha-bhat code looks fine, it's in a loop
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3776: [CARBONDATA-3834]Segment directory and the segment file in metadata are not created for partitioned table when 'carbon.m
ajantha-bhat commented on a change in pull request #3776: URL: https://github.com/apache/carbondata/pull/3776#discussion_r449387619 ## File path: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonOutputCommitter.java ## @@ -302,6 +318,61 @@ private void commitJobForPartition(JobContext context, boolean overwriteSet, commitJobFinal(context, loadModel, operationContext, carbonTable, uniqueId); } + /** + * Method to create and write the segment file, removes the temporary directories from all the + * respective partition directories. This method is invoked only when {@link + * CarbonCommonConstants#CARBON_MERGE_INDEX_IN_SEGMENT} is disabled. + * @param context Job context + * @param loadModel Load model + * @param segmentFileName Segment file name to write + * @param partitionPath Serialized list of partition location + * @throws IOException + */ + @SuppressWarnings("unchecked") + private void writeSegmentWithoutMergeIndex(JobContext context, CarbonLoadModel loadModel, + String segmentFileName, String partitionPath) throws IOException { +Map indexFileNameMap = (Map) ObjectSerializationUtil + .convertStringToObject(context.getConfiguration().get("carbon.index.files.name")); +List partitionList = +(List) ObjectSerializationUtil.convertStringToObject(partitionPath); +SegmentFileStore.SegmentFile finalSegmentFile = null; +boolean isRelativePath; +String partitionLoc; +for (String partition : partitionList) { + isRelativePath = false; + partitionLoc = partition; + if (partitionLoc.startsWith(loadModel.getTablePath())) { +partitionLoc = partitionLoc.substring(loadModel.getTablePath().length()); +isRelativePath = true; + } + SegmentFileStore.SegmentFile segmentFile = new SegmentFileStore.SegmentFile(); + SegmentFileStore.FolderDetails folderDetails = new SegmentFileStore.FolderDetails(); + folderDetails.setFiles(Collections.singleton(indexFileNameMap.get(partition))); + folderDetails.setPartitions( + 
Collections.singletonList(partitionLoc.substring(partitionLoc.indexOf("/") + 1))); + folderDetails.setRelative(isRelativePath); + folderDetails.setStatus(SegmentStatus.SUCCESS.getMessage()); + segmentFile.getLocationMap().put(partitionLoc, folderDetails); + if (finalSegmentFile != null) { Review comment: Yes, it is in a loop; ignore this comment.
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3821: [CARBONDATA-3886] Use qualified table name for global sort compaction
CarbonDataQA1 commented on pull request #3821: URL: https://github.com/apache/carbondata/pull/3821#issuecomment-653383759 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3296/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3821: [CARBONDATA-3886] Use qualified table name for global sort compaction
CarbonDataQA1 commented on pull request #3821: URL: https://github.com/apache/carbondata/pull/3821#issuecomment-653384017 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1559/
[GitHub] [carbondata] ajantha-bhat commented on pull request #3821: [CARBONDATA-3886] Use qualified table name for global sort compaction
ajantha-bhat commented on pull request #3821: URL: https://github.com/apache/carbondata/pull/3821#issuecomment-653384296 @jackylk : please check and merge
[GitHub] [carbondata] jackylk commented on a change in pull request #3821: [CARBONDATA-3886] Use qualified table name for global sort compaction
jackylk commented on a change in pull request #3821: URL: https://github.com/apache/carbondata/pull/3821#discussion_r449408180 ## File path: integration/spark/src/main/scala/org/apache/spark/sql/util/SparkSQLUtil.scala ## @@ -165,9 +165,9 @@ object SparkSQLUtil { * datatype of column data and corresponding datatype in schema provided to create dataframe. * Since carbonScanRDD gives Long data for timestamp column and corresponding column datatype in * schema is Timestamp, this validation fails if we use createDataFrame API which takes rdd as - * input. Hence, using below API which creates dataframe from tablename. + * input. Hence, using below API which creates dataframe from qualified tablename. */ -sparkSession.sqlContext.table(carbonTable.getTableName) +sparkSession.sqlContext.table(carbonTable.getDatabaseName + "." + carbonTable.getTableName) Review comment: Is there a utility for this? I guess there is no need to construct it ourselves.
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3821: [CARBONDATA-3886] Use qualified table name for global sort compaction
ajantha-bhat commented on a change in pull request #3821: URL: https://github.com/apache/carbondata/pull/3821#discussion_r449409262 ## File path: integration/spark/src/main/scala/org/apache/spark/sql/util/SparkSQLUtil.scala ## @@ -165,9 +165,9 @@ object SparkSQLUtil { * datatype of column data and corresponding datatype in schema provided to create dataframe. * Since carbonScanRDD gives Long data for timestamp column and corresponding column datatype in * schema is Timestamp, this validation fails if we use createDataFrame API which takes rdd as - * input. Hence, using below API which creates dataframe from tablename. + * input. Hence, using below API which creates dataframe from qualified tablename. */ -sparkSession.sqlContext.table(carbonTable.getTableName) +sparkSession.sqlContext.table(carbonTable.getDatabaseName + "." + carbonTable.getTableName) Review comment: The qualified name is a Spark concept, so Carbon doesn't have a utility class for it. And Carbon's uniqueTableName is dbname_tableName, so we cannot use that either.