date:20201012

[jira] [Resolved] (CARBONDATA-4010) "Alter table set tblproperties should support long string columns" and bad record handling of long string data for string columns need to be updated in https://gith

2020-10-12 Thread Indhumathi Muthu Murugesh (Jira)



 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Indhumathi Muthu Murugesh resolved CARBONDATA-4010.
---
Fix Version/s: 2.1.0
   Resolution: Fixed

> "Alter table set tblproperties should support long string columns" and bad 
> record handling of long string data for string columns need to be updated in 
> https://github.com/apache/carbondata/blob/master/docs
> -
>
> Key: CARBONDATA-4010
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4010
> Project: CarbonData
> Issue Type: Bug
> Components: docs
> Affects Versions: 2.1.0
> Environment: https://github.com/apache/carbondata/blob/master/docs
> Reporter: Chetan Bhat
> Priority: Minor
> Fix For: 2.1.0
>
> Time Spent: 6.5h
> Remaining Estimate: 0h
>
> "Alter table set tblproperties should support long string columns" and bad 
> record handling of long string data for string columns need to be updated in 
> https://github.com/apache/carbondata/blob/master/docs



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[GitHub] [carbondata] asfgit closed pull request #3959: [CARBONDATA-4010] Doc changes for long strings.

2020-10-12 Thread GitBox



asfgit closed pull request #3959:
URL: https://github.com/apache/carbondata/pull/3959


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3977: [CARBONDATA-4027] Fix the wrong modifiedtime of loading files in inse…

2020-10-12 Thread GitBox



CarbonDataQA1 commented on pull request #3977:
URL: https://github.com/apache/carbondata/pull/3977#issuecomment-707504154


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2642/
   




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3977: [CARBONDATA-4027] Fix the wrong modifiedtime of loading files in inse…

2020-10-12 Thread GitBox



CarbonDataQA1 commented on pull request #3977:
URL: https://github.com/apache/carbondata/pull/3977#issuecomment-707502953


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4394/
   




[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3977: [CARBONDATA-4027] Fix the wrong modifiedtime of loading files in inse…

2020-10-12 Thread GitBox



Indhumathi27 commented on a change in pull request #3977:
URL: https://github.com/apache/carbondata/pull/3977#discussion_r503670575



##
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonInsertFromStageCommand.scala
##
@@ -533,14 +533,13 @@ case class CarbonInsertFromStageCommand(
 val stageLoadingFile =
   FileFactory.getCarbonFile(stagePath +
 File.separator + files._1.getName + CarbonTablePath.LOADING_FILE_SUFFIX);
-// Try to create loading files
-// make isFailed to be true if createNewFile return false.
-// the reason can be file exists or exceptions.
-var isFailed = !stageLoadingFile.createNewFile()
-// if file exists, modify the lastmodifiedtime of the file.
-if (isFailed) {
-  // make isFailed to be true if setLastModifiedTime return false.
-  isFailed = !stageLoadingFile.setLastModifiedTime(System.currentTimeMillis());
+// Try to recreate loading files if the loading file exists
+// or create loading files directly if the loading file doesn't exist
+// set isFailed to be false when (delete and) createfile success
+var isFailed = if (stageLoadingFile.exists()) {

Review comment:
   isFailed will always be false, so it can be removed. Please update the comment as well.
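
For readers following the diff, the delete-then-create approach described in the added comment lines can be sketched roughly as below. This is only an illustration on the local filesystem using java.nio; it does not use CarbonData's FileFactory or CarbonFile, and the class name, method name, and marker file name are hypothetical. Recreating the file gives it a fresh modification time without calling setLastModifiedTime.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class LoadingMarkerSketch {
  // Hypothetical helper (not CarbonData API): refresh a ".loading" marker
  // by deleting it if present and creating it again.
  // Returns true when the marker could NOT be (re)created.
  static boolean recreateMarker(Path marker) {
    try {
      Files.deleteIfExists(marker); // no-op when the marker is absent
      Files.createFile(marker);     // new file, so a new modification time
      return false;
    } catch (IOException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    // Hypothetical marker location under the system temp directory.
    Path marker = Paths.get(System.getProperty("java.io.tmpdir"),
        "stage-" + System.nanoTime() + ".loading");
    System.out.println("failed on create:   " + recreateMarker(marker));
    System.out.println("failed on recreate: " + recreateMarker(marker));
  }
}
```

In this sketch the failure flag only becomes true when an I/O error is thrown, which is a different shape from the PR code under review.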






[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3977: [CARBONDATA-4027] Fix the wrong modifiedtime of loading files in inse…

2020-10-12 Thread GitBox



Indhumathi27 commented on a change in pull request #3977:
URL: https://github.com/apache/carbondata/pull/3977#discussion_r503670575



##
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonInsertFromStageCommand.scala
##
@@ -533,14 +533,13 @@ case class CarbonInsertFromStageCommand(
 val stageLoadingFile =
   FileFactory.getCarbonFile(stagePath +
 File.separator + files._1.getName + CarbonTablePath.LOADING_FILE_SUFFIX);
-// Try to create loading files
-// make isFailed to be true if createNewFile return false.
-// the reason can be file exists or exceptions.
-var isFailed = !stageLoadingFile.createNewFile()
-// if file exists, modify the lastmodifiedtime of the file.
-if (isFailed) {
-  // make isFailed to be true if setLastModifiedTime return false.
-  isFailed = !stageLoadingFile.setLastModifiedTime(System.currentTimeMillis());
+// Try to recreate loading files if the loading file exists
+// or create loading files directly if the loading file doesn't exist
+// set isFailed to be false when (delete and) createfile success
+var isFailed = if (stageLoadingFile.exists()) {

Review comment:
   isFailed will always be false, so it can be removed.






[GitHub] [carbondata] Indhumathi27 commented on pull request #3977: [CARBONDATA-4027] Fix the wrong modifiedtime of loading files in inse…

2020-10-12 Thread GitBox



Indhumathi27 commented on pull request #3977:
URL: https://github.com/apache/carbondata/pull/3977#issuecomment-707488860


   @marchpure If the setLastModifiedTime operation does not take effect on S3,
we need to check and update the other places where it is used as well.




[GitHub] [carbondata] marchpure commented on pull request #3977: [CARBONDATA-4027] Fix the wrong modifiedtime of loading files in inse…

2020-10-12 Thread GitBox



marchpure commented on pull request #3977:
URL: https://github.com/apache/carbondata/pull/3977#issuecomment-707468141


   retest this please




[GitHub] [carbondata] asfgit closed pull request #3949: [CARBONDATA-4030] Fix issue in concurrent SI global sort

2020-10-12 Thread GitBox



asfgit closed pull request #3949:
URL: https://github.com/apache/carbondata/pull/3949


   




[GitHub] [carbondata] QiangCai commented on pull request #3949: [CARBONDATA-4030] Fix issue in concurrent SI global sort

2020-10-12 Thread GitBox



QiangCai commented on pull request #3949:
URL: https://github.com/apache/carbondata/pull/3949#issuecomment-707463119


   LGTM




[GitHub] [carbondata] QiangCai commented on a change in pull request #3935: [CARBONDATA-3993] Remove auto data deletion in IUD processs

2020-10-12 Thread GitBox



QiangCai commented on a change in pull request #3935:
URL: https://github.com/apache/carbondata/pull/3935#discussion_r503631339



##
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/mutation/CarbonProjectForDeleteCommand.scala
##
@@ -108,9 +108,6 @@ private[sql] case class CarbonProjectForDeleteCommand(
   }
   val executorErrors = ExecutionErrors(FailureCauses.NONE, "")
 
-  // handle the clean up of IUD.
-  CarbonUpdateUtil.cleanUpDeltaFiles(carbonTable, false)

Review comment:
   If we don't clean up stale delta files, they will be used after the next
update.
   Maybe we can't remove this cleanup now.






[GitHub] [carbondata] QiangCai commented on a change in pull request #3935: [CARBONDATA-3993] Remove auto data deletion in IUD processs

2020-10-12 Thread GitBox



QiangCai commented on a change in pull request #3935:
URL: https://github.com/apache/carbondata/pull/3935#discussion_r503629383



##
File path: integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala
##
@@ -267,9 +266,8 @@ object CarbonDataRDDFactory {
 throw new Exception("Exception in compaction " + exception.getMessage)
   }
 } finally {
-  executor.shutdownNow()
   try {
-compactor.deletePartialLoadsInCompaction()

Review comment:
   @akashrn5 Once we avoid using listFiles during loading, a stale segment
(for example 0.1) will not impact data consistency. If the stale segment has
stale index files and data files, we will not add them to the segment file.






[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3695: [WIP] partition optimization

2020-10-12 Thread GitBox



CarbonDataQA1 commented on pull request #3695:
URL: https://github.com/apache/carbondata/pull/3695#issuecomment-707316276


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2641/
   




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3695: [WIP] partition optimization

2020-10-12 Thread GitBox



CarbonDataQA1 commented on pull request #3695:
URL: https://github.com/apache/carbondata/pull/3695#issuecomment-707315699


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4393/
   




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3932: [CARBONDATA-3994] Skip Order by for map task if it is sort column and use limit pushdown for array_contains filter

2020-10-12 Thread GitBox



CarbonDataQA1 commented on pull request #3932:
URL: https://github.com/apache/carbondata/pull/3932#issuecomment-707211367


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2640/
   




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3932: [CARBONDATA-3994] Skip Order by for map task if it is sort column and use limit pushdown for array_contains filter

2020-10-12 Thread GitBox



CarbonDataQA1 commented on pull request #3932:
URL: https://github.com/apache/carbondata/pull/3932#issuecomment-707209130


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4392/
   




[GitHub] [carbondata] vikramahuja1001 commented on a change in pull request #3917: [CARBONDATA-3978] Clean Files Refactor and support for trash folder in carbondata

2020-10-12 Thread GitBox



vikramahuja1001 commented on a change in pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#discussion_r503370601



##
File path: core/src/main/java/org/apache/carbondata/core/metadata/SegmentFileStore.java
##
@@ -1106,23 +1107,55 @@ public static void cleanSegments(CarbonTable table, List partitio
*/
   public static void deleteSegment(String tablePath, Segment segment,
   List partitionSpecs,
-  SegmentUpdateStatusManager updateStatusManager) throws Exception {
+  SegmentUpdateStatusManager updateStatusManager, String tableName, String DatabaseName,
+   SegmentStatus segmentStatus, Boolean isPartitionTable)
+  throws Exception {
 SegmentFileStore fileStore = new SegmentFileStore(tablePath, segment.getSegmentFileName());
-List indexOrMergeFiles = fileStore.readIndexFiles(SegmentStatus.SUCCESS, true,
-FileFactory.getConfiguration());
+List indexOrMergeFiles = fileStore.readIndexFiles(SegmentStatus.SUCCESS,
+true, FileFactory.getConfiguration());
 Map<String, List<String>> indexFilesMap = fileStore.getIndexFilesMap();
 for (Map.Entry<String, List<String>> entry : indexFilesMap.entrySet()) {
+  // If the file to be deleted is a carbondata file, copy that file to the trash folder.
+  if (segmentStatus == SegmentStatus.INSERT_IN_PROGRESS) {
+if (!isPartitionTable) {
+  TrashUtil.copyDataToTrashFolder(tablePath, entry.getKey(), CarbonCommonConstants
+  .LOAD_FOLDER + segment.getSegmentNo());
+} else {
+  TrashUtil.copyDataToTrashFolder(tablePath, entry.getKey(), CarbonCommonConstants
+  .LOAD_FOLDER + segment.getSegmentNo() + CarbonCommonConstants.FILE_SEPARATOR +
+  entry.getKey().substring(tablePath.length() + 1, entry.getKey().length()));
+}
+  }
   FileFactory.deleteFile(entry.getKey());
   for (String file : entry.getValue()) {
 String[] deltaFilePaths =
 updateStatusManager.getDeleteDeltaFilePath(file, segment.getSegmentNo());
 for (String deltaFilePath : deltaFilePaths) {
+  // If the file to be deleted is a carbondata file, copy that file to the trash folder.
+  if (segmentStatus == SegmentStatus
+  .INSERT_IN_PROGRESS) {
+TrashUtil.copyDataToTrashFolder(tablePath, deltaFilePath, deltaFilePath
+.substring(tablePath.length() + 1, deltaFilePath.length()));
+  }
   FileFactory.deleteFile(deltaFilePath);
 }
+// If the file to be deleted is a carbondata file, copy that file to the trash folder.
+if (file.endsWith(CarbonCommonConstants.FACT_FILE_EXT) && segmentStatus ==
+SegmentStatus.INSERT_IN_PROGRESS) {

Review comment:
   The index file map contains both the index files and the .carbondata
files. `file` comes from entry.getValue(), and entry comes from indexFilesMap,
which also holds the carbondata files. So there are cases where this condition
will be true.
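
To make the point about the map shape concrete, here is a small self-contained sketch. It assumes the map goes from an index file path to the list of files associated with it, and that the data-file extension is ".carbondata"; both are assumptions for illustration, as are the class name, method name, and sample paths. It is not the SegmentFileStore code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class IndexFilesMapSketch {
  // Assumed value of the data-file extension constant.
  static final String FACT_FILE_EXT = ".carbondata";

  // Collect the entries of the map's value lists that look like data files,
  // but only while the segment is still being inserted.
  static List<String> dataFilesToTrash(Map<String, List<String>> indexFilesMap,
      boolean insertInProgress) {
    List<String> result = new ArrayList<>();
    if (!insertInProgress) {
      return result;
    }
    for (Map.Entry<String, List<String>> entry : indexFilesMap.entrySet()) {
      for (String file : entry.getValue()) {
        if (file.endsWith(FACT_FILE_EXT)) {
          result.add(file);
        }
      }
    }
    return result;
  }

  public static void main(String[] args) {
    // Hypothetical paths, for illustration only.
    Map<String, List<String>> indexFilesMap = new LinkedHashMap<>();
    indexFilesMap.put("/tbl/Segment_0/0.carbonindex",
        Arrays.asList("/tbl/Segment_0/part-0-0.carbondata"));
    System.out.println(dataFilesToTrash(indexFilesMap, true));
  }
}
```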






[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3949: [CARBONDATA-4030] Fix issue in concurrent SI global sort

2020-10-12 Thread GitBox



CarbonDataQA1 commented on pull request #3949:
URL: https://github.com/apache/carbondata/pull/3949#issuecomment-707152498


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4391/
   




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3949: [CARBONDATA-4030] Fix issue in concurrent SI global sort

2020-10-12 Thread GitBox



CarbonDataQA1 commented on pull request #3949:
URL: https://github.com/apache/carbondata/pull/3949#issuecomment-707147440


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2639/
   




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3917: [CARBONDATA-3978] Clean Files Refactor and support for trash folder in carbondata

2020-10-12 Thread GitBox



CarbonDataQA1 commented on pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#issuecomment-707122422


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4387/
   




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3917: [CARBONDATA-3978] Clean Files Refactor and support for trash folder in carbondata

2020-10-12 Thread GitBox



CarbonDataQA1 commented on pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#issuecomment-707121149


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2635/
   




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3948: [WIP][HOTFIX] Fix random 11 testcase failure in CI

2020-10-12 Thread GitBox



CarbonDataQA1 commented on pull request #3948:
URL: https://github.com/apache/carbondata/pull/3948#issuecomment-707114968


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2634/
   




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3948: [WIP][HOTFIX] Fix random 11 testcase failure in CI

2020-10-12 Thread GitBox



CarbonDataQA1 commented on pull request #3948:
URL: https://github.com/apache/carbondata/pull/3948#issuecomment-707108327


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4386/
   




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3695: [WIP] partition optimization

2020-10-12 Thread GitBox



CarbonDataQA1 commented on pull request #3695:
URL: https://github.com/apache/carbondata/pull/3695#issuecomment-707096882


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2637/
   




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3967: [CARBONDATA-4004] [CARBONDATA-4012] Issue with select after update command

2020-10-12 Thread GitBox



CarbonDataQA1 commented on pull request #3967:
URL: https://github.com/apache/carbondata/pull/3967#issuecomment-707092263


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2633/
   




[GitHub] [carbondata] asfgit closed pull request #3978: [CARBONDATA-4028] Fix failed to unlock during update

2020-10-12 Thread GitBox



asfgit closed pull request #3978:
URL: https://github.com/apache/carbondata/pull/3978


   




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3967: [CARBONDATA-4004] [CARBONDATA-4012] Issue with select after update command

2020-10-12 Thread GitBox



CarbonDataQA1 commented on pull request #3967:
URL: https://github.com/apache/carbondata/pull/3967#issuecomment-707090745


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4385/
   




[GitHub] [carbondata] QiangCai commented on pull request #3978: [CARBONDATA-4028] Fix failed to unlock during update

2020-10-12 Thread GitBox



QiangCai commented on pull request #3978:
URL: https://github.com/apache/carbondata/pull/3978#issuecomment-707089612


   LGTM




[GitHub] [carbondata] QiangCai commented on pull request #3949: [CARBONDATA-4030] Fix issue in concurrent SI global sort

2020-10-12 Thread GitBox



QiangCai commented on pull request #3949:
URL: https://github.com/apache/carbondata/pull/3949#issuecomment-707087877


   retest this please




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3949: [CARBONDATA-4030] Fix issue in concurrent SI global sort

2020-10-12 Thread GitBox



CarbonDataQA1 commented on pull request #3949:
URL: https://github.com/apache/carbondata/pull/3949#issuecomment-707087252


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2638/
   




[GitHub] [carbondata] QiangCai commented on pull request #3949: [CARBONDATA-4030] Fix issue in concurrent SI global sort

2020-10-12 Thread GitBox



QiangCai commented on pull request #3949:
URL: https://github.com/apache/carbondata/pull/3949#issuecomment-707087176


   ok, for testing, it is ok.




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3695: [WIP] partition optimization

2020-10-12 Thread GitBox



CarbonDataQA1 commented on pull request #3695:
URL: https://github.com/apache/carbondata/pull/3695#issuecomment-707087260


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4389/
   




[GitHub] [carbondata] QiangCai commented on pull request #3953: [CARBONDATA-4008]Fixed IN filter on date column is returning 0 results when 'carbon.push.rowfilters.for.vector' is true

2020-10-12 Thread GitBox



QiangCai commented on pull request #3953:
URL: https://github.com/apache/carbondata/pull/3953#issuecomment-707073901


   merge it.
   please raise another PR to improve the filter performance.




[GitHub] [carbondata] marchpure commented on a change in pull request #3978: [CARBONDATA-4028] Fix failed to unlock during update

2020-10-12 Thread GitBox



marchpure commented on a change in pull request #3978:
URL: https://github.com/apache/carbondata/pull/3978#discussion_r503242850



##
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/mutation/CarbonProjectForUpdateCommand.scala
##
@@ -235,14 +237,38 @@ private[sql] case class CarbonProjectForUpdateCommand(
 }
 sys.error("Update operation failed. please check logs.")
 } finally {
-  if (null != dataSet && isPersistEnabled) {
-dataSet.unpersist()
+  if (updateLock.unlock()) {
+LOGGER.info(s"updateLock unlocked successfully after update operation $tableName")
+  } else {
+LOGGER.error(s"Unable to unlock updateLock for table $tableName after table updation");

Review comment:
   I have modified the code according to your suggestion.






[GitHub] [carbondata] ajantha-bhat commented on pull request #3949: [CARBONDATA-4030] Fix issue in concurrent SI global sort

2020-10-12 Thread GitBox



ajantha-bhat commented on pull request #3949:
URL: https://github.com/apache/carbondata/pull/3949#issuecomment-707085109


   @QiangCai: I will check. If I remove it, the getPositionId() UDF may not
work in projection, but I guess it was only for test purposes. Shall I try to
remove it?




[GitHub] [carbondata] akashrn5 commented on pull request #3948: [WIP][HOTFIX] Fix random 11 testcase failure in CI

2020-10-12 Thread GitBox



akashrn5 commented on pull request #3948:
URL: https://github.com/apache/carbondata/pull/3948#issuecomment-707072320


   > @akashrn5 : The logs were removed because one issue was found. There is
   > definitely some other issue as well. I added the logs and pushed again.
   
   yeah




[GitHub] [carbondata] QiangCai commented on pull request #3949: [CARBONDATA-4030] Fix issue in concurrent SI global sort

2020-10-12 Thread GitBox



QiangCai commented on pull request #3949:
URL: https://github.com/apache/carbondata/pull/3949#issuecomment-707084035


   @ajantha-bhat can we remove it?




[GitHub] [carbondata] ajantha-bhat commented on pull request #3932: [CARBONDATA-3994] Skip Order by for map task if it is sort column and use limit pushdown for array_contains filter

2020-10-12 Thread GitBox



ajantha-bhat commented on pull request #3932:
URL: https://github.com/apache/carbondata/pull/3932#issuecomment-707071580


   @kunal642: This optimization applies only when the order by column is the
first sort column.
   In that case the data in each task is completely sorted, so applying the
limit in each task and then ordering once over the limited data gives correct
results.
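
The reasoning above can be illustrated with a minimal sketch, assuming each task's output is already sorted ascending on the order by column. The class and method names are hypothetical and plain Java lists stand in for task outputs; this is not the CarbonData or Spark implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class LimitPushdownSketch {
  // Each inner list stands for one task's rows, already sorted ascending.
  // limit is assumed to be non-negative.
  static List<Integer> topK(List<List<Integer>> sortedTasks, int limit) {
    List<Integer> candidates = new ArrayList<>();
    for (List<Integer> task : sortedTasks) {
      // Limit per task: the first `limit` rows of a sorted task are its smallest,
      // so no row outside this prefix can belong to the global result.
      candidates.addAll(task.subList(0, Math.min(limit, task.size())));
    }
    // Order by once, over the limited data only.
    Collections.sort(candidates);
    return new ArrayList<>(candidates.subList(0, Math.min(limit, candidates.size())));
  }

  public static void main(String[] args) {
    List<List<Integer>> tasks = Arrays.asList(
        Arrays.asList(1, 4, 9),
        Arrays.asList(2, 3, 10),
        Arrays.asList(5, 6, 7));
    System.out.println(topK(tasks, 2)); // prints [1, 2]
  }
}
```

If a task's output were not sorted on the order by column, taking its first rows could drop values that belong in the result, which is why the condition on the first sort column matters.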




[GitHub] [carbondata] QiangCai commented on pull request #3949: [CARBONDATA-4030] Fix issue in concurrent SI global sort

2020-10-12 Thread GitBox



QiangCai commented on pull request #3949:
URL: https://github.com/apache/carbondata/pull/3949#issuecomment-707084330


   retest this please




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3953: [CARBONDATA-4008]Fixed IN filter on date column is returning 0 results when 'carbon.push.rowfilters.for.vector' is true

2020-10-12 Thread GitBox



CarbonDataQA1 commented on pull request #3953:
URL: https://github.com/apache/carbondata/pull/3953#issuecomment-707070681


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2631/
   




[GitHub] [carbondata] ajantha-bhat commented on pull request #3932: [CARBONDATA-3994] Skip Order by for map task if it is sort column and use limit pushdown for array_contains filter

2020-10-12 Thread GitBox



ajantha-bhat commented on pull request #3932:
URL: https://github.com/apache/carbondata/pull/3932#issuecomment-707070709


   @kumarvishal09 : Agree with you. It has to be the first sort_column only.




[GitHub] [carbondata] asfgit closed pull request #3924: [CARBONDATA-3988] Allow SI creation on first dimension column

2020-10-12 Thread GitBox



asfgit closed pull request #3924:
URL: https://github.com/apache/carbondata/pull/3924


   




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3953: [CARBONDATA-4008]Fixed IN filter on date column is returning 0 results when 'carbon.push.rowfilters.for.vector' is true

2020-10-12 Thread GitBox



CarbonDataQA1 commented on pull request #3953:
URL: https://github.com/apache/carbondata/pull/3953#issuecomment-707069675


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4383/
   




[GitHub] [carbondata] QiangCai commented on pull request #3924: [CARBONDATA-3988] Allow SI creation on first dimension column

2020-10-12 Thread GitBox



QiangCai commented on pull request #3924:
URL: https://github.com/apache/carbondata/pull/3924#issuecomment-707080395


   LGTM




[GitHub] [carbondata] asfgit closed pull request #3953: [CARBONDATA-4008]Fixed IN filter on date column is returning 0 results when 'carbon.push.rowfilters.for.vector' is true

2020-10-12 Thread GitBox



asfgit closed pull request #3953:
URL: https://github.com/apache/carbondata/pull/3953


   




[jira] [Updated] (CARBONDATA-4030) Concurrent SI global sort cannot be success

2020-10-12 Thread Ajantha Bhat (Jira)



 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajantha Bhat updated CARBONDATA-4030:
-
Description: When concurrent SI global sort loads are in progress, one load
removes the table property added by the other load. As a result, the global
sort insert for one load fails with an error saying that the position id
cannot be found in the projection.
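
The kind of interference described here can be illustrated with a small, single-threaded Scala sketch. Everything in it is hypothetical: `tableProperties`, the key `positionIdRequested` and the reference-counting alternative are stand-ins chosen for illustration, and the actual fix in the CarbonData code may look quite different.

```scala
import scala.collection.mutable

object SharedPropertySketch {
  // Hypothetical stand-in for table properties shared by all loads on a table.
  val tableProperties = mutable.Map.empty[String, String]
  // Hypothetical property name, not taken from the CarbonData code base.
  val Key = "positionIdRequested"

  // Naive version: every load sets the flag and removes it when it finishes.
  def naiveStart(): Unit = {
    tableProperties.update(Key, "true")
  }
  def naiveFinish(): Unit = {
    tableProperties.remove(Key)
  }

  // One possible alternative: count the loads that still need the flag.
  private var users = 0
  def countedStart(): Unit = synchronized {
    users += 1
    tableProperties.update(Key, "true")
  }
  def countedFinish(): Unit = synchronized {
    users -= 1
    if (users == 0) tableProperties.remove(Key)
  }
}
```

With the naive pair, two overlapping loads followed by one finish leave the second load without the flag. With the counted pair, the flag survives until the last load finishes.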

> Concurrent SI global sort cannot be success
> ---
>
> Key: CARBONDATA-4030
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4030
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ajantha Bhat
>Assignee: Ajantha Bhat
>Priority: Minor
>
> When concurrent SI global sort loads are in progress, one load removes the
> table property added by the other load. As a result, the global sort insert
> for one load fails with an error saying that the position id cannot be found
> in the projection.




[GitHub] [carbondata] asfgit closed pull request #3976: [CARBONDATA-4026] Fix Thread leakage while Loading

2020-10-12 Thread GitBox



asfgit closed pull request #3976:
URL: https://github.com/apache/carbondata/pull/3976


   




[GitHub] [carbondata] QiangCai commented on pull request #3976: [CARBONDATA-4026] Fix Thread leakage while Loading

2020-10-12 Thread GitBox



QiangCai commented on pull request #3976:
URL: https://github.com/apache/carbondata/pull/3976#issuecomment-707075552


   LGTM




[jira] [Created] (CARBONDATA-4030) Concurrent SI global sort cannot be success

2020-10-12 Thread Ajantha Bhat (Jira)

Ajantha Bhat created CARBONDATA-4030:


 Summary: Concurrent SI global sort cannot be success
 Key: CARBONDATA-4030
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4030
 Project: CarbonData
  Issue Type: Bug
Reporter: Ajantha Bhat
Assignee: Ajantha Bhat







[GitHub] [carbondata] ajantha-bhat commented on pull request #3949: [TEMP] [WIP] Fix issue in concurrent SI global sort

2020-10-12 Thread GitBox



ajantha-bhat commented on pull request #3949:
URL: https://github.com/apache/carbondata/pull/3949#issuecomment-707065386


   @QiangCai : I guess it was introduced during secondary index development.
You can look up "isPositionIDRequested" in the code. It is basically used to
decide whether the position reference needs to be kept or removed during plan
optimization.




[GitHub] [carbondata] ajantha-bhat commented on pull request #3948: [WIP][HOTFIX] Fix random 11 testcase failure in CI

2020-10-12 Thread GitBox



ajantha-bhat commented on pull request #3948:
URL: https://github.com/apache/carbondata/pull/3948#issuecomment-707058332


   @akashrn5 : The logs were removed because one issue had been found. There is
definitely some other issue as well, so I added the logs back and pushed again.




[GitHub] [carbondata] akashrn5 commented on pull request #3948: [WIP][HOTFIX] Fix random 11 testcase failure in CI

2020-10-12 Thread GitBox



akashrn5 commented on pull request #3948:
URL: https://github.com/apache/carbondata/pull/3948#issuecomment-707057298


   @ajantha-bhat It failed again, I think. Please check whether it is the same
cause or something else.




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3695: [WIP] partition optimization

2020-10-12 Thread GitBox



CarbonDataQA1 commented on pull request #3695:
URL: https://github.com/apache/carbondata/pull/3695#issuecomment-707057669


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2632/
   




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3948: [HOTFIX] Fix random 11 testcase failure in CI

2020-10-12 Thread GitBox



CarbonDataQA1 commented on pull request #3948:
URL: https://github.com/apache/carbondata/pull/3948#issuecomment-707052086


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4380/
   




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3948: [HOTFIX] Fix random 11 testcase failure in CI

2020-10-12 Thread GitBox



CarbonDataQA1 commented on pull request #3948:
URL: https://github.com/apache/carbondata/pull/3948#issuecomment-707048048


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2629/
   




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3917: [CARBONDATA-3978] Clean Files Refactor and support for trash folder in carbondata

2020-10-12 Thread GitBox



CarbonDataQA1 commented on pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#issuecomment-707041413


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4379/
   




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3917: [CARBONDATA-3978] Clean Files Refactor and support for trash folder in carbondata

2020-10-12 Thread GitBox



CarbonDataQA1 commented on pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#issuecomment-707039304


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2628/
   




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3978: [CARBONDATA-4028] Fix failed to unlock during update

2020-10-12 Thread GitBox



CarbonDataQA1 commented on pull request #3978:
URL: https://github.com/apache/carbondata/pull/3978#issuecomment-707035119


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2627/
   




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3935: [CARBONDATA-3993] Remove auto data deletion in IUD processs

2020-10-12 Thread GitBox



CarbonDataQA1 commented on pull request #3935:
URL: https://github.com/apache/carbondata/pull/3935#issuecomment-707031327


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4376/
   




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3978: [CARBONDATA-4028] Fix failed to unlock during update

2020-10-12 Thread GitBox



CarbonDataQA1 commented on pull request #3978:
URL: https://github.com/apache/carbondata/pull/3978#issuecomment-707031326


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4378/
   




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3950: [CARBONDATA-3889] Enable scalastyle check for all scala test code

2020-10-12 Thread GitBox



CarbonDataQA1 commented on pull request #3950:
URL: https://github.com/apache/carbondata/pull/3950#issuecomment-707017547


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4382/
   




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3950: [CARBONDATA-3889] Enable scalastyle check for all scala test code

2020-10-12 Thread GitBox



CarbonDataQA1 commented on pull request #3950:
URL: https://github.com/apache/carbondata/pull/3950#issuecomment-707016545


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2630/
   




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3935: [CARBONDATA-3993] Remove auto data deletion in IUD processs

2020-10-12 Thread GitBox



CarbonDataQA1 commented on pull request #3935:
URL: https://github.com/apache/carbondata/pull/3935#issuecomment-707009914


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2625/
   




[GitHub] [carbondata] QiangCai commented on pull request #3953: [CARBONDATA-4008]Fixed IN filter on date column is returning 0 results when 'carbon.push.rowfilters.for.vector' is true

2020-10-12 Thread GitBox



QiangCai commented on pull request #3953:
URL: https://github.com/apache/carbondata/pull/3953#issuecomment-707009265


   LGTM




[GitHub] [carbondata] QiangCai commented on pull request #3953: [CARBONDATA-4008]Fixed IN filter on date column is returning 0 results when 'carbon.push.rowfilters.for.vector' is true

2020-10-12 Thread GitBox



QiangCai commented on pull request #3953:
URL: https://github.com/apache/carbondata/pull/3953#issuecomment-707009405


   retest this please




[GitHub] [carbondata] akashrn5 commented on a change in pull request #3912: [CARBONDATA-3977] Global sort partitions should be determined dynamically

2020-10-12 Thread GitBox



akashrn5 commented on a change in pull request #3912:
URL: https://github.com/apache/carbondata/pull/3912#discussion_r503160924



##
File path: 
integration/spark/src/main/scala/org/apache/carbondata/spark/load/DataLoadProcessBuilderOnSpark.scala
##
@@ -227,9 +235,19 @@ object DataLoadProcessBuilderOnSpark {
 // 2. sort
 var numPartitions = CarbonDataProcessorUtil.getGlobalSortPartitions(
   
configuration.getDataLoadProperty(CarbonCommonConstants.LOAD_GLOBAL_SORT_PARTITIONS))
-if (numPartitions <= 0) {
+
+// if numPartitions user does not specify then dynamically calculate
+if (numPartitions == 0) {
+  // get the size in bytes and convert to size in MB
+  val sizeOfDataFrame = SizeEstimator.estimate(rdd)/100
+  // data frame size can not be more than Int size
+  numPartitions = sizeOfDataFrame.toInt/rdd.getNumPartitions

Review comment:
   @maheshrajus can you please explain the logic used to determine the 
partitions in the comment.
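
For readers following this thread, one way to derive a partition count from an estimated data size is sketched below. It is not the logic of the PR. The 128 MB target per partition, the object name `GlobalSortPartitionSketch` and the function signature are assumptions made only for this example.

```scala
object GlobalSortPartitionSketch {
  // Hypothetical target of roughly 128 MB of input per sort partition.
  val TargetPartitionSizeMb: Long = 128L

  def numPartitions(estimatedSizeInBytes: Long, userConfigured: Int): Int = {
    if (userConfigured > 0) {
      // A value supplied by the user always wins.
      userConfigured
    } else {
      // Convert bytes to MB before comparing against the target size.
      val sizeInMb = estimatedSizeInBytes / (1024L * 1024L)
      // Round up so a remainder still gets a partition.
      val wanted = (sizeInMb + TargetPartitionSizeMb - 1) / TargetPartitionSizeMb
      // Keep at least one partition and stay within the Int range.
      math.max(1L, math.min(wanted, Int.MaxValue.toLong)).toInt
    }
  }
}
```

Clamping to at least one partition avoids a zero result for small inputs, which integer division would otherwise produce.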






[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3969: [CARBONDATA-3932] [CARBONDATA-3903] change discovery.uri in presto guide and dml document update

2020-10-12 Thread GitBox



CarbonDataQA1 commented on pull request #3969:
URL: https://github.com/apache/carbondata/pull/3969#issuecomment-706997746


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4373/
   




[GitHub] [carbondata] ajantha-bhat edited a comment on pull request #3948: [HOTFIX] Fix random 11 testcase failure in CI

2020-10-12 Thread GitBox



ajantha-bhat edited a comment on pull request #3948:
URL: https://github.com/apache/carbondata/pull/3948#issuecomment-706992959


   @Karan980, @marchpure : please check this.  
   
   @akashrn5 , @QiangCai : Please check and merge this once new build passes. 




[GitHub] [carbondata] ajantha-bhat commented on pull request #3948: [HOTFIX] Fix random 11 testcase failure in CI

2020-10-12 Thread GitBox



ajantha-bhat commented on pull request #3948:
URL: https://github.com/apache/carbondata/pull/3948#issuecomment-706992959


   @Karan980, @akashrn5 , @QiangCai : Please check and merge this once new 
build passes. 




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3950: [CARBONDATA-3889] Enable scalastyle check for all scala test code

2020-10-12 Thread GitBox



CarbonDataQA1 commented on pull request #3950:
URL: https://github.com/apache/carbondata/pull/3950#issuecomment-706988518


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2624/
   




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

2020-10-12 Thread GitBox



CarbonDataQA1 commented on pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#issuecomment-706985772


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4372/
   




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3969: [CARBONDATA-3932] [CARBONDATA-3903] change discovery.uri in presto guide and dml document update

2020-10-12 Thread GitBox



CarbonDataQA1 commented on pull request #3969:
URL: https://github.com/apache/carbondata/pull/3969#issuecomment-706982841


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2623/
   




[GitHub] [carbondata] akkio-97 commented on pull request #3967: [CARBONDATA-4004] Issue with select after update command

2020-10-12 Thread GitBox



akkio-97 commented on pull request #3967:
URL: https://github.com/apache/carbondata/pull/3967#issuecomment-706982107


   Test cases should be written using the SDK **update** API to generate carbon
files. Currently there is an issue with that API when the files are queried
from either Spark or Presto. In any case, I have checked it on the cluster by
updating from Spark and querying from Presto, and it works fine.




[GitHub] [carbondata] Karan980 commented on a change in pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

2020-10-12 Thread GitBox



Karan980 commented on a change in pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#discussion_r503140543



##
File path: 
integration/spark/src/main/scala/org/apache/carbondata/spark/util/CommonUtil.scala
##
@@ -1011,38 +1011,46 @@ object CommonUtil {
   objectDataType: DataType): AnyRef = {
 objectDataType match {
   case _: ArrayType =>
-val arrayDataType = objectDataType.asInstanceOf[ArrayType]
-val arrayData = data.asInstanceOf[UnsafeArrayData]
-val size = arrayData.numElements()
-val childDataType = arrayDataType.elementType
-val arrayChildObjects = new Array[AnyRef](size)
-var i = 0
-while (i < size) {
-  arrayChildObjects(i) = 
convertSparkComplexTypeToCarbonObject(arrayData.get(i,
-childDataType), childDataType)
-  i = i + 1
+if (data == null) {

Review comment:
   For dateType, null data is handled in another way that does not return
null. That's why I didn't put the check after line 1011.
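
The idea of guarding only the complex-type branch, rather than the whole method, can be sketched as follows. The `DataType` hierarchy here is a simplified, hypothetical stand-in for Spark's types, and `convert` is not the real `convertSparkComplexTypeToCarbonObject`. It only shows a null check placed inside the array case so that other types keep their own null handling.

```scala
object NullSafeConversionSketch {
  // Hypothetical, simplified stand-ins for Spark's data types.
  sealed trait DataType
  case object StringType extends DataType
  case class ArrayType(elementType: DataType) extends DataType

  def convert(data: Any, dataType: DataType): AnyRef = dataType match {
    case ArrayType(elementType) =>
      if (data == null) {
        // A null array stays null instead of being cast and dereferenced.
        null
      } else {
        val elements = data.asInstanceOf[Seq[Any]]
        // Convert each child recursively, so nested arrays get the same guard.
        elements.map(convert(_, elementType)).toArray[AnyRef]
      }
    case StringType =>
      // Primitive-like values are passed through unchanged in this sketch.
      data.asInstanceOf[AnyRef]
  }
}
```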






[GitHub] [carbondata] Karan980 commented on a change in pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

2020-10-12 Thread GitBox



Karan980 commented on a change in pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#discussion_r503139533



##
File path: 
integration/spark/src/main/scala/org/apache/carbondata/spark/util/CommonUtil.scala
##
@@ -1011,38 +1011,46 @@ object CommonUtil {
   objectDataType: DataType): AnyRef = {
 objectDataType match {
   case _: ArrayType =>
-val arrayDataType = objectDataType.asInstanceOf[ArrayType]
-val arrayData = data.asInstanceOf[UnsafeArrayData]
-val size = arrayData.numElements()
-val childDataType = arrayDataType.elementType
-val arrayChildObjects = new Array[AnyRef](size)
-var i = 0
-while (i < size) {
-  arrayChildObjects(i) = 
convertSparkComplexTypeToCarbonObject(arrayData.get(i,
-childDataType), childDataType)
-  i = i + 1
+if (data == null) {

Review comment:
   For dateType, null data is handled in another way that does not return
null. That's why I didn't put the check after line 1011.






[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3950: [CARBONDATA-3889] Enable scalastyle check for all scala test code

2020-10-12 Thread GitBox



CarbonDataQA1 commented on pull request #3950:
URL: https://github.com/apache/carbondata/pull/3950#issuecomment-706977245


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4374/
   




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

2020-10-12 Thread GitBox



CarbonDataQA1 commented on pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#issuecomment-706972701


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2622/
   




[GitHub] [carbondata] vikramahuja1001 commented on pull request #3917: [CARBONDATA-3978] Clean files refactor and added support for a trash folder where all the carbondata files will be copied to afte

2020-10-12 Thread GitBox



vikramahuja1001 commented on pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#issuecomment-706971916


   @akashrn5 , the PR description is very old and the scope has changed since
then. I will update it.




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3695: [WIP] partition optimization

2020-10-12 Thread GitBox



CarbonDataQA1 commented on pull request #3695:
URL: https://github.com/apache/carbondata/pull/3695#issuecomment-706962594


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2626/
   




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3695: [WIP] partition optimization

2020-10-12 Thread GitBox



CarbonDataQA1 commented on pull request #3695:
URL: https://github.com/apache/carbondata/pull/3695#issuecomment-706960978


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4377/
   




[GitHub] [carbondata] vikramahuja1001 commented on a change in pull request #3917: [CARBONDATA-3978] Clean files refactor and added support for a trash folder where all the carbondata files will be co

2020-10-12 Thread GitBox



vikramahuja1001 commented on a change in pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#discussion_r503116909



##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonCleanFilesCommand.scala
##
@@ -108,12 +112,51 @@ case class CarbonCleanFilesCommand(
 Seq.empty
   }
 
+  def deleteStashInMetadataFolder(carbonTable: CarbonTable): Unit = {
+val tableStatusLock = CarbonLockFactory
+  .getCarbonLockObj(carbonTable.getAbsoluteTableIdentifier, 
LockUsage.TABLE_STATUS_LOCK)
+val carbonLoadModel = new CarbonLoadModel
+try {
+  if (tableStatusLock.lockWithRetries()) {
+val tableStatusFilePath = CarbonTablePath
+  .getTableStatusFilePath(carbonTable.getTablePath)
+val loadMetaDataDetails = SegmentStatusManager
+  .readTableStatusFile(tableStatusFilePath).filter(details => 
details.getSegmentStatus ==
+  SegmentStatus.SUCCESS || details.getSegmentStatus == 
SegmentStatus.LOAD_PARTIAL_SUCCESS)
+  .sortWith(_.getLoadName < _.getLoadName)
+
carbonLoadModel.setLoadMetadataDetails(loadMetaDataDetails.toList.asJava)
+  } else {
+throw new ConcurrentOperationException(carbonTable.getDatabaseName,
+  carbonTable.getTableName, "table status read", "clean files command")
+  }
+} finally {
+  tableStatusLock.unlock()
+}
+val loadMetaDataDetails = carbonLoadModel.getLoadMetadataDetails.asScala
+val segmentFileList = loadMetaDataDetails.map(f => 
CarbonTablePath.getSegmentFilesLocation(

Review comment:
   okay
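
In the hunk above, `tableStatusLock.unlock()` sits in a `finally` block that also runs when `lockWithRetries()` returned false. Whether that matters depends on the lock implementation. A generic pattern that releases the lock only after it was acquired is sketched below with `java.util.concurrent.locks.ReentrantLock`. The helper `withLock` and its default retry values are hypothetical and are not CarbonData APIs.

```scala
import java.util.concurrent.TimeUnit
import java.util.concurrent.locks.ReentrantLock

object LockWithRetriesSketch {
  // Hypothetical helper; retries and waitMillis are illustrative defaults.
  def withLock[T](lock: ReentrantLock, retries: Int = 3, waitMillis: Long = 100L)(body: => T): T = {
    var acquired = false
    var attempt = 0
    while (!acquired && attempt < retries) {
      // tryLock waits up to waitMillis and reports whether the lock was taken.
      acquired = lock.tryLock(waitMillis, TimeUnit.MILLISECONDS)
      attempt += 1
    }
    if (!acquired) {
      throw new IllegalStateException("could not acquire lock after " + retries + " attempts")
    }
    try {
      body
    } finally {
      // Reached only when the lock is held by this thread.
      lock.unlock()
    }
  }
}
```

Throwing before the `try` keeps the failure path away from `unlock()`, so a failed acquisition never attempts a release.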






[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3959: [CARBONDATA-4010] Doc changes for long strings.

2020-10-12 Thread GitBox



CarbonDataQA1 commented on pull request #3959:
URL: https://github.com/apache/carbondata/pull/3959#issuecomment-706959506


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4370/
   




[GitHub] [carbondata] vikramahuja1001 commented on a change in pull request #3917: [CARBONDATA-3978] Clean files refactor and added support for a trash folder where all the carbondata files will be co

2020-10-12 Thread GitBox



vikramahuja1001 commented on a change in pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#discussion_r503115666



##
File path: 
integration/spark/src/main/scala/org/apache/carbondata/cleanfiles/CleanFilesUtil.scala
##
@@ -0,0 +1,259 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.cleanfiles
+
+import java.util.concurrent.{Executors, ScheduledExecutorService, TimeUnit}
+
+import scala.collection.JavaConverters._
+
+import org.apache.hadoop.fs.permission.{FsAction, FsPermission}
+
+import org.apache.carbondata.common.logging.LogServiceFactory
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.datastore.filesystem.CarbonFile
+import org.apache.carbondata.core.datastore.impl.FileFactory
+import org.apache.carbondata.core.indexstore.PartitionSpec
+import org.apache.carbondata.core.locks.{CarbonLockUtil, ICarbonLock, LockUsage}
+import org.apache.carbondata.core.metadata.{AbsoluteTableIdentifier, CarbonMetadata, SegmentFileStore}
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable
+import org.apache.carbondata.core.mutate.CarbonUpdateUtil
+import org.apache.carbondata.core.statusmanager.{SegmentStatus, SegmentStatusManager}
+import org.apache.carbondata.core.util.{CarbonProperties, CarbonUtil}
+import org.apache.carbondata.core.util.path.CarbonTablePath
+
+object CleanFilesUtil {
+  private val LOGGER = LogServiceFactory.getLogService(this.getClass.getCanonicalName)
+
+  /**
+   * The method deletes all data if forceTableCLean  and lean garbage segment
+   * (MARKED_FOR_DELETE state) if forceTableCLean 
+   *
+   * @param dbName : Database name
+   * @param tableName  : Table name
+   * @param tablePath  : Table path
+   * @param carbonTable: CarbonTable Object  in case of force clean
+   * @param forceTableClean:  for force clean it will delete all data
+   *it will clean garbage segment (MARKED_FOR_DELETE state)
+   * @param currentTablePartitions : Hive Partitions  details
+   */
+  def cleanFiles(
+  dbName: String,
+  tableName: String,
+  tablePath: String,
+  carbonTable: CarbonTable,
+  forceTableClean: Boolean,
+  currentTablePartitions: Option[Seq[PartitionSpec]] = None,
+  truncateTable: Boolean = false): Unit = {
+var carbonCleanFilesLock: ICarbonLock = null
+val absoluteTableIdentifier = if (forceTableClean) {
+  AbsoluteTableIdentifier.from(tablePath, dbName, tableName, tableName)
+} else {
+  carbonTable.getAbsoluteTableIdentifier
+}
+try {
+  val errorMsg = "Clean files request is failed for " +
+s"$dbName.$tableName" +
+". Not able to acquire the clean files lock due to another clean files " +
+"operation is running in the background."
+  // in case of force clean the lock is not required
+  if (forceTableClean) {
+FileFactory.deleteAllCarbonFilesOfDir(
+  FileFactory.getCarbonFile(absoluteTableIdentifier.getTablePath))
+  } else {
+carbonCleanFilesLock =
+  CarbonLockUtil
+.getLockObject(absoluteTableIdentifier, LockUsage.CLEAN_FILES_LOCK, errorMsg)
+if (truncateTable) {
+  SegmentStatusManager.truncateTable(carbonTable)
+}
+SegmentStatusManager.deleteLoadsAndUpdateMetadata(
+  carbonTable, true, currentTablePartitions.map(_.asJava).orNull)
+CarbonUpdateUtil.cleanUpDeltaFiles(carbonTable, true)
+currentTablePartitions match {
+  case Some(partitions) =>
+SegmentFileStore.cleanSegments(
+  carbonTable,
+  currentTablePartitions.map(_.asJava).orNull,
+  true)
+  case _ =>
+}
+  }
+} finally {
+  if (currentTablePartitions.equals(None)) {
+cleanUpPartitionFoldersRecursively(carbonTable, List.empty[PartitionSpec])
+  } else {
+cleanUpPartitionFoldersRecursively(carbonTable, currentTablePartitions.get.toList)
+  }
+
+

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3959: [CARBONDATA-4010] Doc changes for long strings.

2020-10-12 Thread GitBox



CarbonDataQA1 commented on pull request #3959:
URL: https://github.com/apache/carbondata/pull/3959#issuecomment-706958228


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2620/
   




[GitHub] [carbondata] vikramahuja1001 commented on a change in pull request #3917: [CARBONDATA-3978] Clean files refactor and added support for a trash folder where all the carbondata files will be co

2020-10-12 Thread GitBox



vikramahuja1001 commented on a change in pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#discussion_r503115176



##
File path: 
core/src/main/java/org/apache/carbondata/core/util/DeleteLoadFolders.java
##
@@ -192,11 +204,33 @@ private static boolean checkIfLoadCanBeDeleted(LoadMetadataDetails oneLoad,
   }
 
   private static boolean checkIfLoadCanBeDeletedPhysically(LoadMetadataDetails oneLoad,
-  boolean isForceDelete) {
+  boolean isForceDelete, AbsoluteTableIdentifier absoluteTableIdentifier) {
 // Check if the segment is added externally and path is set then do not delete it
 if ((SegmentStatus.MARKED_FOR_DELETE == oneLoad.getSegmentStatus()
-|| SegmentStatus.COMPACTED == oneLoad.getSegmentStatus()) && (oneLoad.getPath() == null
+|| SegmentStatus.COMPACTED == oneLoad.getSegmentStatus() || SegmentStatus
+.INSERT_IN_PROGRESS == oneLoad.getSegmentStatus()) && (oneLoad.getPath() == null

Review comment:
   In the discussion, it was decided that we would delete once we get the 
segment lock, not based on the timeout threshold.






[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3924: [CARBONDATA-3988] Allow SI creation on first dimension column

2020-10-12 Thread GitBox



CarbonDataQA1 commented on pull request #3924:
URL: https://github.com/apache/carbondata/pull/3924#issuecomment-706945830


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4369/
   




[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

2020-10-12 Thread GitBox



ajantha-bhat commented on a change in pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#discussion_r503099172



##
File path: 
integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/allqueries/InsertIntoCarbonTableTestCase.scala
##
@@ -67,6 +67,21 @@ class InsertIntoCarbonTableTestCase extends QueryTest with BeforeAndAfterAll {
 
   }
 
+  test("insert from orc-select columns with columns having null values and sort scope as global sort") {
+sql("drop table if exists TORCSource")
+sql("drop table if exists TCarbon")
+sql("create table TORCSource(name string,col array,fee int) STORED AS orc")
+sql("insert into TORCSource values('karan',null,2)")
+sql("create table TCarbon(name string, col array,fee int) STORED AS carbondata TBLPROPERTIES ('SORT_COLUMNS'='name','TABLE_BLOCKSIZE'='128','TABLE_BLOCKLET_SIZE'='128','SORT_SCOPE'='global_SORT')")
+sql("insert overwrite table TCarbon select name,col,fee from TORCSource")
+val result = sql("show segments for table TCarbon").collect()(0).get(1).toString()
+if(!"Success".equalsIgnoreCase(result)) {
+  assert(false)
+}
+sql("drop table if exists TORCSource")
+sql("drop table if exists TCarbon")
+  }
+

Review comment:
   Please handle and verify four scenarios by comparing with ORC:
   a) local sort insert with complex type null value data
   b) global sort insert with complex type null value data
   c) same as a), with `carbon.enable.bad.record.handling.for.insert` set to `true`
   d) same as b), with `carbon.enable.bad.record.handling.for.insert` set to `true`






[GitHub] [carbondata] QiangCai commented on a change in pull request #3978: [CARBONDATA-4028] Fix failed to unlock during update

2020-10-12 Thread GitBox



QiangCai commented on a change in pull request #3978:
URL: https://github.com/apache/carbondata/pull/3978#discussion_r503098160



##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/mutation/CarbonProjectForUpdateCommand.scala
##
@@ -235,14 +237,38 @@ private[sql] case class CarbonProjectForUpdateCommand(
 }
 sys.error("Update operation failed. please check logs.")
 } finally {
-  if (null != dataSet && isPersistEnabled) {
-dataSet.unpersist()
+  if (updateLock.unlock()) {
+LOGGER.info(s"updateLock unlocked successfully after update operation $tableName")
+  } else {
+LOGGER.error(s"Unable to unlock updateLock for table $tableName after table updation");

Review comment:
   ```suggestion
   LOGGER.error(s"Unable to unlock updateLock for table $tableName after table update");
   ```






[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

2020-10-12 Thread GitBox



ajantha-bhat commented on a change in pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#discussion_r503097655



##
File path: 
integration/spark/src/main/scala/org/apache/carbondata/spark/util/CommonUtil.scala
##
@@ -1011,38 +1011,46 @@ object CommonUtil {
   objectDataType: DataType): AnyRef = {
 objectDataType match {
   case _: ArrayType =>
-val arrayDataType = objectDataType.asInstanceOf[ArrayType]
-val arrayData = data.asInstanceOf[UnsafeArrayData]
-val size = arrayData.numElements()
-val childDataType = arrayDataType.elementType
-val arrayChildObjects = new Array[AnyRef](size)
-var i = 0
-while (i < size) {
-  arrayChildObjects(i) = convertSparkComplexTypeToCarbonObject(arrayData.get(i,
-childDataType), childDataType)
-  i = i + 1
+if (data == null) {

Review comment:
   After line 1011, add a check: if the data type is array, struct, or map 
and the data is null, return null. This avoids changing many lines.
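
   A minimal sketch of the early-return guard suggested above, written in Java so that it stays self-contained. The `Kind` enum and the `convert` method are hypothetical stand-ins for Spark's `DataType` and for `convertSparkComplexTypeToCarbonObject`; they are assumptions for illustration, not the actual signatures in `CommonUtil`.

```java
import java.util.Arrays;
import java.util.List;

final class ComplexTypeNullGuardSketch {
  // Hypothetical stand-in for Spark's DataType hierarchy.
  enum Kind { ARRAY, STRUCT, MAP, PRIMITIVE }

  static boolean isComplex(Kind kind) {
    return kind == Kind.ARRAY || kind == Kind.STRUCT || kind == Kind.MAP;
  }

  // Hypothetical converter: a single guard placed before the type match,
  // so the per-type branches below stay unchanged.
  static Object convert(Object data, Kind kind) {
    if (data == null && isComplex(kind)) {
      return null;
    }
    switch (kind) {
      case ARRAY:
        // The array branch can now assume non-null input.
        List<?> elements = (List<?>) data;
        return elements.toArray();
      default:
        // STRUCT, MAP, and PRIMITIVE are passed through in this sketch.
        return data;
    }
  }

  public static void main(String[] args) {
    System.out.println(convert(null, Kind.ARRAY));
    System.out.println(
        Arrays.toString((Object[]) convert(Arrays.asList("a", "b"), Kind.ARRAY)));
  }
}
```

   With the guard in one place, none of the array, struct, or map branches needs its own null check, which is what keeps the diff small.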






[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

2020-10-12 Thread GitBox



ajantha-bhat commented on a change in pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#discussion_r503096192



##
File path: 
integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/allqueries/InsertIntoCarbonTableTestCase.scala
##
@@ -67,6 +67,21 @@ class InsertIntoCarbonTableTestCase extends QueryTest with BeforeAndAfterAll {
 
   }
 
+  test("insert from orc-select columns with columns having null values and sort scope as global sort") {
+sql("drop table if exists TORCSource")
+sql("drop table if exists TCarbon")
+sql("create table TORCSource(name string,col array,fee int) STORED AS orc")
+sql("insert into TORCSource values('karan',null,2)")
+sql("create table TCarbon(name string, col array,fee int) STORED AS carbondata TBLPROPERTIES ('SORT_COLUMNS'='name','TABLE_BLOCKSIZE'='128','TABLE_BLOCKLET_SIZE'='128','SORT_SCOPE'='global_SORT')")
+sql("insert overwrite table TCarbon select name,col,fee from TORCSource")
+val result = sql("show segments for table TCarbon").collect()(0).get(1).toString()
+if(!"Success".equalsIgnoreCase(result)) {

Review comment:
   Please run a select query and compare the ORC and Carbon table results.
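
   A minimal sketch of such a comparison, in plain Java so that it runs without Spark. Rows are modelled as `List<Object>` and compared without regard to order; `sameRows` and `row` are hypothetical helpers, not part of the CarbonData test utilities.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class ResultComparisonSketch {
  // Hypothetical helper: builds one row; null cells are allowed.
  static List<Object> row(Object... cells) {
    return Arrays.asList(cells);
  }

  // Order-insensitive comparison: every expected row must be matched by
  // exactly one actual row, so duplicates are counted correctly.
  static boolean sameRows(List<List<Object>> expected, List<List<Object>> actual) {
    if (expected.size() != actual.size()) {
      return false;
    }
    List<List<Object>> remaining = new ArrayList<>(actual);
    for (List<Object> expectedRow : expected) {
      if (!remaining.remove(expectedRow)) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // Stand-ins for "select name,col,fee" on the ORC and Carbon tables.
    List<List<Object>> orcRows = Collections.singletonList(row("karan", null, 2));
    List<List<Object>> carbonRows = Collections.singletonList(row("karan", null, 2));
    System.out.println(sameRows(orcRows, carbonRows));
  }
}
```

   Comparing actual rows catches cases where the load reports success but the null complex value is stored differently, which a check on the segment status alone would miss.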






[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3695: [WIP] partition optimization

2020-10-12 Thread GitBox



CarbonDataQA1 commented on pull request #3695:
URL: https://github.com/apache/carbondata/pull/3695#issuecomment-706935530


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4371/
   




[GitHub] [carbondata] QiangCai commented on pull request #3949: [TEMP] [WIP] Fix issue in concurrent SI global sort

2020-10-12 Thread GitBox



QiangCai commented on pull request #3949:
URL: https://github.com/apache/carbondata/pull/3949#issuecomment-706934495


   Why do we need the table property "isPositionIDRequested"?




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3924: [CARBONDATA-3988] Allow SI creation on first dimension column

2020-10-12 Thread GitBox



CarbonDataQA1 commented on pull request #3924:
URL: https://github.com/apache/carbondata/pull/3924#issuecomment-706932861


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2619/
   




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3948: [WIP] Analyze random 11 testcase failure in CI

2020-10-12 Thread GitBox



CarbonDataQA1 commented on pull request #3948:
URL: https://github.com/apache/carbondata/pull/3948#issuecomment-706928763


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4368/
   




[jira] [Resolved] (CARBONDATA-4023) Create MV failed on table with geospatial index

2020-10-12 Thread Indhumathi Muthu Murugesh (Jira)



 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Indhumathi Muthu Murugesh resolved CARBONDATA-4023.
---
Fix Version/s: 2.1.0
   Resolution: Fixed

> Create MV failed on table with geospatial index
> ---
>
> Key: CARBONDATA-4023
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4023
> Project: CarbonData
>  Issue Type: Bug
>Reporter: SHREELEKHYA GAMPA
>Priority: Minor
> Fix For: 2.1.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Creating an MV failed on a table with a geospatial index when using carbonsession.
> It failed with java.lang.ClassNotFoundException: 
> org.apache.carbondata.geo.geohashindex




[jira] [Resolved] (CARBONDATA-3763) wrong insert result during insert stage command

2020-10-12 Thread Ajantha Bhat (Jira)



 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajantha Bhat resolved CARBONDATA-3763.
--
Fix Version/s: 2.0.0
   Resolution: Fixed

> wrong insert result during insert stage command
> ---
>
> Key: CARBONDATA-3763
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3763
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ajantha Bhat
>Assignee: Ajantha Bhat
>Priority: Major
> Fix For: 2.0.0
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Problem:
> For insertStageCommand, Spark reuses the internalRow because we transform 
> twice: RDD[InternalRow] -> dataframe -> logical Plan -> RDD[InternalRow]. 
> As a result, the same data is inserted into other rows.
>  
> Solution: Copy the internalRow after the last transform.
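
The effect described in this issue can be reproduced without Spark. The following is a minimal sketch in Java, assuming a hypothetical `MutableRow` class in place of Spark's `InternalRow` and a hypothetical iterator that overwrites one shared row instance on every call; it illustrates the failure mode only and is not the CarbonData fix itself.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

final class RowReuseSketch {
  // Hypothetical mutable row, standing in for Spark's InternalRow.
  static final class MutableRow {
    int value;

    MutableRow copy() {
      MutableRow snapshot = new MutableRow();
      snapshot.value = this.value;
      return snapshot;
    }
  }

  // Hands out the SAME row object each time, overwriting it in place.
  static Iterator<MutableRow> reusingIterator(final int[] values) {
    final MutableRow shared = new MutableRow();
    return new Iterator<MutableRow>() {
      private int index = 0;

      @Override
      public boolean hasNext() {
        return index < values.length;
      }

      @Override
      public MutableRow next() {
        if (!hasNext()) {
          throw new NoSuchElementException();
        }
        shared.value = values[index++];
        return shared;
      }
    };
  }

  // Buffers all rows first, then reads them back, as a sort or persist would.
  static List<Integer> buffer(Iterator<MutableRow> rows, boolean copyEachRow) {
    List<MutableRow> buffered = new ArrayList<>();
    while (rows.hasNext()) {
      MutableRow current = rows.next();
      buffered.add(copyEachRow ? current.copy() : current);
    }
    List<Integer> result = new ArrayList<>();
    for (MutableRow bufferedRow : buffered) {
      result.add(bufferedRow.value);
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(buffer(reusingIterator(new int[] {1, 2, 3}), false)); // [3, 3, 3]
    System.out.println(buffer(reusingIterator(new int[] {1, 2, 3}), true));  // [1, 2, 3]
  }
}
```

Without the copy, every buffered reference points at the one shared object, so all entries show the last value written; copying each row before buffering keeps the values distinct.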




[GitHub] [carbondata] asfgit closed pull request #3966: [CARBONDATA-4023] Create MV failed on table with geospatial index using carbonsession.

2020-10-12 Thread GitBox



asfgit closed pull request #3966:
URL: https://github.com/apache/carbondata/pull/3966


   




[jira] [Resolved] (CARBONDATA-3786) presto carbon reader should use tablePath from hive catalog

2020-10-12 Thread Ajantha Bhat (Jira)



 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajantha Bhat resolved CARBONDATA-3786.
--
Fix Version/s: 2.1.0
   Resolution: Fixed

https://github.com/apache/carbondata/pull/3731

> presto carbon reader should use tablePath from hive catalog 
> 
>
> Key: CARBONDATA-3786
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3786
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ajantha Bhat
>Assignee: Ajantha Bhat
>Priority: Minor
> Fix For: 2.1.0
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> h3. Problem:
> In upgrade scenarios from 1.6 to 2.0, when spark.sql.warehouse is not 
> configured, the Hive storage location is not correct. So, the presto carbon 
> integration should use tablePath from hive storage instead of location.
> h3. Solution:
> Use tablePath instead of location from the hive metastore table.




[jira] [Resolved] (CARBONDATA-3788) Fix insert failure during global sort with huge data in new insert flow

2020-10-12 Thread Ajantha Bhat (Jira)



 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajantha Bhat resolved CARBONDATA-3788.
--
Fix Version/s: 2.1.0
   Resolution: Fixed

https://github.com/apache/carbondata/pull/3732

> Fix insert failure during global sort with huge data in new insert flow 
> 
>
> Key: CARBONDATA-3788
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3788
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ajantha Bhat
>Assignee: Ajantha Bhat
>Priority: Major
> Fix For: 2.1.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Spark reuses the internalRow in the global sort partition flow with huge data, 
> because the RDD of InternalRow is persisted for global sort.
>  
> Need to make a copy of the internalRow and work on it before the last 
> transform for the global sort partition flow.




[jira] [Resolved] (CARBONDATA-3843) Fix Merge index is not created for normal segment on streaming table

2020-10-12 Thread Ajantha Bhat (Jira)



 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajantha Bhat resolved CARBONDATA-3843.
--
Fix Version/s: 2.1.0
   Resolution: Fixed

https://github.com/apache/carbondata/pull/3785

> Fix Merge index is not created for normal segment on streaming table
> 
>
> Key: CARBONDATA-3843
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3843
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ajantha Bhat
>Assignee: Ajantha Bhat
>Priority: Major
> Fix For: 2.1.0
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Problem:
> Merge index is not created for normal segments on a streaming table.
>  
> Solution:
> For a streaming table, allow merge index creation for all kinds of segments 
> other than the streaming segment (Row_V1).



