[jira] [Closed] (CARBONDATA-4012) Documentations issues.

2020-10-22 Thread Prasanna Ravichandran (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanna Ravichandran closed CARBONDATA-4012.
-

Details of the complex-type features have been added to the open-source documentation and verified.

> Documentations issues.
> --
>
> Key: CARBONDATA-4012
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4012
> Project: CarbonData
> Issue Type: Bug
> Reporter: Prasanna Ravichandran
> Priority: Minor
> Fix For: 2.1.0
>
>
> Reading Array and Struct of all primitive types on Presto from Spark 
> Carbon tables is supported. Details of this feature have to be added at the 
> open-source link below:
> [https://github.com/apache/carbondata/blob/master/docs/prestosql-guide.md]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3995: [WIP] Fix data load failure issue with legacy store

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3995:
URL: https://github.com/apache/carbondata/pull/3995#issuecomment-714927630


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2896/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3995: [WIP] Fix data load failure issue with legacy store

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3995:
URL: https://github.com/apache/carbondata/pull/3995#issuecomment-714927328


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4652/
   







[GitHub] [carbondata] ajantha-bhat commented on pull request #3972: [WIP]Launch same number of task as select query for insert into select and ctas cases when target table is of no_sort

2020-10-22 Thread GitBox


ajantha-bhat commented on pull request #3972:
URL: https://github.com/apache/carbondata/pull/3972#issuecomment-714925420


   @QiangCai, @VenuReddy2103: Not just line 95. In the file 
`TestSIWithSecondaryIndex`, look for every `checkAnswerWithoutSort` and replace 
it with `checkAnswer`, as it can cause random failures depending on which query 
task finishes first.
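
To illustrate why an order-sensitive check can fail at random, here is a minimal Java sketch. It assumes, as the comment above implies, that one helper compares rows in arrival order while the other tolerates any order. The class and method names are hypothetical and are not the test suite's actual `checkAnswer` or `checkAnswerWithoutSort` implementations.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustrative only: these helpers are hypothetical and are not the
// checkAnswer / checkAnswerWithoutSort implementations from the test suite.
class OrderInsensitiveCheckSketch {

    // Order-sensitive comparison: fails if rows arrive in a different order.
    static boolean sameRowsInOrder(List<String> actual, List<String> expected) {
        return actual.equals(expected);
    }

    // Order-insensitive comparison: sort copies of both sides before comparing.
    static boolean sameRowsAnyOrder(List<String> actual, List<String> expected) {
        List<String> a = new ArrayList<>(actual);
        List<String> e = new ArrayList<>(expected);
        Collections.sort(a);
        Collections.sort(e);
        return a.equals(e);
    }

    public static void main(String[] args) {
        List<String> expected = Arrays.asList("1,india", "2,china");
        // Same rows, but the task producing row 2 happened to finish first.
        List<String> actual = Arrays.asList("2,china", "1,india");
        System.out.println(sameRowsInOrder(actual, expected));
        System.out.println(sameRowsAnyOrder(actual, expected));
    }
}
```

With parallel tasks and no ORDER BY in the query, the arrival order of rows is not guaranteed, so only the order-insensitive comparison is stable.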







[GitHub] [carbondata] ajantha-bhat edited a comment on pull request #3987: [CARBONDATA-4039] Support Local dictionary for Presto complex datatypes

2020-10-22 Thread GitBox


ajantha-bhat edited a comment on pull request #3987:
URL: https://github.com/apache/carbondata/pull/3987#issuecomment-714924009


   Please rebase, compile and push.
   And I hope you have compiled both prestodb and prestosql locally.







[GitHub] [carbondata] ajantha-bhat commented on pull request #3987: [CARBONDATA-4039] Support Local dictionary for Presto complex datatypes

2020-10-22 Thread GitBox


ajantha-bhat commented on pull request #3987:
URL: https://github.com/apache/carbondata/pull/3987#issuecomment-714924009


   please rebase, compile and push







[jira] [Resolved] (CARBONDATA-3994) Skip Order by for map task if it is sort column and use limit pushdown for array_contains filter

2020-10-22 Thread Kunal Kapoor (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kunal Kapoor resolved CARBONDATA-3994.
--
Fix Version/s: 2.1.0
   Resolution: Fixed

> Skip Order by for map task if it is sort column and use limit pushdown for 
> array_contains filter
> 
>
> Key: CARBONDATA-3994
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3994
> Project: CarbonData
> Issue Type: Bug
> Reporter: Ajantha Bhat
> Assignee: Ajantha Bhat
> Priority: Major
> Fix For: 2.1.0
>
>  Time Spent: 6h
>  Remaining Estimate: 0h
>
> When the ORDER BY column is a sort column, the output of every map task is 
> already sorted, so there is no need to sort the data again.
> Hence the ordering is skipped at the map task by changing the plan node from 
> {{TakeOrderedAndProject}} --> {{CarbonTakeOrderedAndProjectExec}}.
> Also, in this scenario the limit is collected at the map task, and 
> Array_contains() uses this limit value during row scan filtering to stop the 
> scan once the limit is reached.
> A carbon property has also been added to control this:
> {{carbon.mapOrderPushDown._.column}}
> Note: later this can be improved so that other filters also use the limit value.
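
The limit pushdown described above can be pictured with a minimal Java sketch. It is only an illustration under stated assumptions: rows are modelled as `int[]` arrays, and `scanWithLimit` and `arrayContains` are hypothetical names, not CarbonData's scanner or filter API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of limit pushdown into a row scan filter.
// This is not CarbonData's actual scanner code.
class LimitPushdownSketch {

    // Returns up to `limit` rows whose array column contains `target`,
    // stopping the scan as soon as the limit is reached.
    static List<int[]> scanWithLimit(List<int[]> rows, int target, int limit) {
        List<int[]> matched = new ArrayList<>();
        if (limit <= 0) {
            return matched;
        }
        for (int[] row : rows) {
            if (arrayContains(row, target)) {
                matched.add(row);
                if (matched.size() >= limit) {
                    break; // limit reached: remaining rows are never read
                }
            }
        }
        return matched;
    }

    // Linear membership test, standing in for an array_contains() filter.
    static boolean arrayContains(int[] values, int target) {
        for (int v : values) {
            if (v == target) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<int[]> rows = Arrays.asList(
            new int[]{1, 2}, new int[]{3, 4}, new int[]{2, 5}, new int[]{2, 6});
        // Only the first two matching rows are collected; the last row is skipped.
        System.out.println(scanWithLimit(rows, 2, 2).size());
    }
}
```

Because each map task's output is already sorted on the ORDER BY column in this scenario, stopping early at the limit does not change which rows a later merge step would pick.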





[GitHub] [carbondata] asfgit closed pull request #3932: [CARBONDATA-3994] Skip Order by for map task if it is a first sort column and use limit pushdown for array_contains filter

2020-10-22 Thread GitBox


asfgit closed pull request #3932:
URL: https://github.com/apache/carbondata/pull/3932


   







[GitHub] [carbondata] kunal642 commented on pull request #3932: [CARBONDATA-3994] Skip Order by for map task if it is a first sort column and use limit pushdown for array_contains filter

2020-10-22 Thread GitBox


kunal642 commented on pull request #3932:
URL: https://github.com/apache/carbondata/pull/3932#issuecomment-714918693


   LGTM







[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3994: [CARBONDATA-4040] Fix data mismatch incase of compaction failure and retry success

2020-10-22 Thread GitBox


ajantha-bhat commented on a change in pull request #3994:
URL: https://github.com/apache/carbondata/pull/3994#discussion_r510596257



##
File path: 
core/src/main/java/org/apache/carbondata/core/metadata/SegmentFileStore.java
##
@@ -398,27 +398,29 @@ public static void 
mergeIndexAndWriteSegmentFile(CarbonTable carbonTable, String
* @throws IOException
*/
   public static String writeSegmentFile(CarbonTable carbonTable, String 
segmentId, String UUID,

Review comment:
   Agreed, let me check and modify.









[GitHub] [carbondata] ajantha-bhat commented on pull request #3932: [CARBONDATA-3994] Skip Order by for map task if it is a first sort column and use limit pushdown for array_contains filter

2020-10-22 Thread GitBox


ajantha-bhat commented on pull request #3932:
URL: https://github.com/apache/carbondata/pull/3932#issuecomment-714906800


   @kunal642 : PR is ready, please check







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3974: [Carbondata-3999] Fix permission issue of indexServerTmp directory

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3974:
URL: https://github.com/apache/carbondata/pull/3974#issuecomment-714895968


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2895/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3974: [Carbondata-3999] Fix permission issue of indexServerTmp directory

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3974:
URL: https://github.com/apache/carbondata/pull/3974#issuecomment-714892804


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4651/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3977: [CARBONDATA-4027] Fix the wrong modifiedtime of loading files in inse…

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3977:
URL: https://github.com/apache/carbondata/pull/3977#issuecomment-714891383


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4650/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3977: [CARBONDATA-4027] Fix the wrong modifiedtime of loading files in inse…

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3977:
URL: https://github.com/apache/carbondata/pull/3977#issuecomment-714891281


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2894/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3982: [CARBONDATA-4032] Fix drop partition command clean data issue

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3982:
URL: https://github.com/apache/carbondata/pull/3982#issuecomment-714891090


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2893/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3982: [CARBONDATA-4032] Fix drop partition command clean data issue

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3982:
URL: https://github.com/apache/carbondata/pull/3982#issuecomment-714889033


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4649/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3945: [CARBONDATA-3991]Fix the set modified time function on S3 and Alluxio…

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3945:
URL: https://github.com/apache/carbondata/pull/3945#issuecomment-714886597


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2892/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3945: [CARBONDATA-3991]Fix the set modified time function on S3 and Alluxio…

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3945:
URL: https://github.com/apache/carbondata/pull/3945#issuecomment-714884825


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4648/
   







[GitHub] [carbondata] QiangCai commented on a change in pull request #3935: [CARBONDATA-3993] Remove auto data deletion in IUD processs

2020-10-22 Thread GitBox


QiangCai commented on a change in pull request #3935:
URL: https://github.com/apache/carbondata/pull/3935#discussion_r510566102



##
File path: 
processing/src/main/java/org/apache/carbondata/processing/util/CarbonLoaderUtil.java
##
@@ -323,10 +320,6 @@ public static boolean 
recordNewLoadMetadata(LoadMetadataDetails newMetaEntry,
 for (LoadMetadataDetails entry : listOfLoadFolderDetails) {

Review comment:
   Let's discuss insertOverwrite behavior.
   Should we delete old data immediately?
   









[GitHub] [carbondata] QiangCai commented on a change in pull request #3935: [CARBONDATA-3993] Remove auto data deletion in IUD processs

2020-10-22 Thread GitBox


QiangCai commented on a change in pull request #3935:
URL: https://github.com/apache/carbondata/pull/3935#discussion_r510564217



##
File path: 
core/src/main/java/org/apache/carbondata/core/util/DeleteLoadFolders.java
##
@@ -227,34 +224,9 @@ public static boolean deleteLoadFoldersFromFileSystem(
 if (details != null && details.length != 0) {
   for (LoadMetadataDetails oneLoad : details) {
 if (checkIfLoadCanBeDeleted(oneLoad, isForceDelete)) {
-  ICarbonLock segmentLock = 
CarbonLockFactory.getCarbonLockObj(absoluteTableIdentifier,

Review comment:
   After this PR, the method deleteLoadFoldersFromFileSystem will be invoked by 
clean files only, right?
   If yes, why do we remove this code?









[GitHub] [carbondata] QiangCai commented on pull request #3916: [CARBONDATA-3935]Support partition table transactional write in presto

2020-10-22 Thread GitBox


QiangCai commented on pull request #3916:
URL: https://github.com/apache/carbondata/pull/3916#issuecomment-714868177


   please do rebase







[GitHub] [carbondata] QiangCai commented on a change in pull request #3912: [CARBONDATA-3977] Global sort partitions should be determined dynamically

2020-10-22 Thread GitBox


QiangCai commented on a change in pull request #3912:
URL: https://github.com/apache/carbondata/pull/3912#discussion_r510558896



##
File path: 
integration/spark/src/main/scala/org/apache/carbondata/spark/load/DataLoadProcessBuilderOnSpark.scala
##
@@ -143,10 +143,18 @@ object DataLoadProcessBuilderOnSpark {
 
 var numPartitions = CarbonDataProcessorUtil.getGlobalSortPartitions(
   
configuration.getDataLoadProperty(CarbonCommonConstants.LOAD_GLOBAL_SORT_PARTITIONS))
-if (numPartitions <= 0) {
-  numPartitions = convertRDD.partitions.length
+
+// if numPartitions user does not specify and not specified in config then 
dynamically calculate
+if (numPartitions == 0) {
+  // get the size in bytes and convert to size in MB
+  val sizeOfDataFrame = SizeEstimator.estimate(inputRDD)/100
+  // data frame size can not be more than Int size
+  numPartitions = sizeOfDataFrame.toInt/inputRDD.getNumPartitions

Review comment:
   It should be numberOfPartitions = totalSize/partitionSize
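
A minimal Java sketch of the formula in the review comment above, assuming sizes are given in bytes and that the division should round up so a trailing partial partition is still counted. The class name, method name, and the 128 MiB target used in `main` are illustrative assumptions, not values taken from the PR.

```java
// Hypothetical sketch of numberOfPartitions = totalSize / partitionSize.
// Names and the example target size are assumptions for illustration.
class PartitionCountSketch {

    // Ceiling division of total size by target partition size,
    // clamped to the range [1, Integer.MAX_VALUE].
    static int numPartitions(long totalSizeBytes, long targetPartitionSizeBytes) {
        if (totalSizeBytes < 0 || targetPartitionSizeBytes <= 0) {
            throw new IllegalArgumentException("sizes must be non-negative and the target positive");
        }
        long n = totalSizeBytes / targetPartitionSizeBytes;
        if (totalSizeBytes % targetPartitionSizeBytes != 0) {
            n += 1; // count the trailing partial partition
        }
        return (int) Math.max(1L, Math.min(n, (long) Integer.MAX_VALUE));
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        // 1 GiB of data split into 128 MiB partitions
        System.out.println(numPartitions(1024 * mib, 128 * mib));
    }
}
```

Dividing by the desired partition size, rather than by the number of input partitions, ties the result to the data volume instead of to how the input happened to be split.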









[GitHub] [carbondata] QiangCai edited a comment on pull request #3972: [WIP]Launch same number of task as select query for insert into select and ctas cases when target table is of no_sort

2020-10-22 Thread GitBox


QiangCai edited a comment on pull request #3972:
URL: https://github.com/apache/carbondata/pull/3972#issuecomment-714861784


   TestSIWithSecondaryIndex
   1. change line 92, add order by
   `
   checkAnswerWithoutSort(sql("select id, country from table1_index order by id"),
     Seq(Row("1", "india"), Row("2", "china")))
   `
   
   2. change line 115, add order by







[GitHub] [carbondata] QiangCai commented on pull request #3972: [WIP]Launch same number of task as select query for insert into select and ctas cases when target table is of no_sort

2020-10-22 Thread GitBox


QiangCai commented on pull request #3972:
URL: https://github.com/apache/carbondata/pull/3972#issuecomment-714861784


   please fix TestSIWithSecondaryIndex line 92:
   `
   checkAnswerWithoutSort(sql("select id, country from table1_index order by id"),
     Seq(Row("1", "india"), Row("2", "china")))
   `







[GitHub] [carbondata] QiangCai commented on pull request #3977: [CARBONDATA-4027] Fix the wrong modifiedtime of loading files in inse…

2020-10-22 Thread GitBox


QiangCai commented on pull request #3977:
URL: https://github.com/apache/carbondata/pull/3977#issuecomment-714860865


   retest this please







[GitHub] [carbondata] QiangCai commented on pull request #3974: [Carbondata-3999] Fix permission issue of indexServerTmp directory

2020-10-22 Thread GitBox


QiangCai commented on pull request #3974:
URL: https://github.com/apache/carbondata/pull/3974#issuecomment-714860972


   retest this please







[GitHub] [carbondata] QiangCai commented on pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

2020-10-22 Thread GitBox


QiangCai commented on pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#issuecomment-714860572


   please fix scala style issue







[GitHub] [carbondata] QiangCai commented on pull request #3982: [CARBONDATA-4032] Fix drop partition command clean data issue

2020-10-22 Thread GitBox


QiangCai commented on pull request #3982:
URL: https://github.com/apache/carbondata/pull/3982#issuecomment-714860028


   retest this please







[GitHub] [carbondata] QiangCai commented on pull request #3945: [CARBONDATA-3991]Fix the set modified time function on S3 and Alluxio…

2020-10-22 Thread GitBox


QiangCai commented on pull request #3945:
URL: https://github.com/apache/carbondata/pull/3945#issuecomment-714855465


   retest this please







[GitHub] [carbondata] QiangCai commented on a change in pull request #3994: [CARBONDATA-4040] Fix data mismatch incase of compaction failure and retry success

2020-10-22 Thread GitBox


QiangCai commented on a change in pull request #3994:
URL: https://github.com/apache/carbondata/pull/3994#discussion_r510548190



##
File path: 
core/src/main/java/org/apache/carbondata/core/metadata/SegmentFileStore.java
##
@@ -398,27 +398,29 @@ public static void 
mergeIndexAndWriteSegmentFile(CarbonTable carbonTable, String
* @throws IOException
*/
   public static String writeSegmentFile(CarbonTable carbonTable, String 
segmentId, String UUID,

Review comment:
   for the update, it will have more than one timestamp, right?









[GitHub] [carbondata] QiangCai commented on a change in pull request #3994: [CARBONDATA-4040] Fix data mismatch incase of compaction failure and retry success

2020-10-22 Thread GitBox


QiangCai commented on a change in pull request #3994:
URL: https://github.com/apache/carbondata/pull/3994#discussion_r510548190



##
File path: 
core/src/main/java/org/apache/carbondata/core/metadata/SegmentFileStore.java
##
@@ -398,27 +398,29 @@ public static void 
mergeIndexAndWriteSegmentFile(CarbonTable carbonTable, String
* @throws IOException
*/
   public static String writeSegmentFile(CarbonTable carbonTable, String 
segmentId, String UUID,

Review comment:
   for the update, it will have more than one timestamp, right

##
File path: 
core/src/main/java/org/apache/carbondata/core/metadata/SegmentFileStore.java
##
@@ -398,27 +398,29 @@ public static void 
mergeIndexAndWriteSegmentFile(CarbonTable carbonTable, String
* @throws IOException
*/
   public static String writeSegmentFile(CarbonTable carbonTable, String 
segmentId, String UUID,

Review comment:
   for the update, it will have more than one timestamp, right?









[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3972: [WIP]Launch same number of task as select query for insert into select and ctas cases when target table is of no_sort

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3972:
URL: https://github.com/apache/carbondata/pull/3972#issuecomment-714797045


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2891/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3972: [WIP]Launch same number of task as select query for insert into select and ctas cases when target table is of no_sort

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3972:
URL: https://github.com/apache/carbondata/pull/3972#issuecomment-714794831


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4647/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3988: [CARBONDATA-4037] Improve the table status and segment file writing

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3988:
URL: https://github.com/apache/carbondata/pull/3988#issuecomment-714738027


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2890/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3988: [CARBONDATA-4037] Improve the table status and segment file writing

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3988:
URL: https://github.com/apache/carbondata/pull/3988#issuecomment-714736288


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4646/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3987: [CARBONDATA-4039] Support Local dictionary for Presto complex datatypes

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3987:
URL: https://github.com/apache/carbondata/pull/3987#issuecomment-714733234


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2888/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3970:
URL: https://github.com/apache/carbondata/pull/3970#issuecomment-714733235


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4643/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3987: [CARBONDATA-4039] Support Local dictionary for Presto complex datatypes

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3987:
URL: https://github.com/apache/carbondata/pull/3987#issuecomment-714721854


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4644/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3994: [CARBONDATA-4040] Fix data mismatch incase of compaction failure and retry success

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3994:
URL: https://github.com/apache/carbondata/pull/3994#issuecomment-714711458


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2884/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#issuecomment-714709707


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2885/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3970:
URL: https://github.com/apache/carbondata/pull/3970#issuecomment-714709360


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2887/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#issuecomment-714704138


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4641/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3995: [WIP] Fix data load failure issue with legacy store

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3995:
URL: https://github.com/apache/carbondata/pull/3995#issuecomment-714703567


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2886/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3932: [CARBONDATA-3994] Skip Order by for map task if it is a first sort column and use limit pushdown for array_contains filter

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3932:
URL: https://github.com/apache/carbondata/pull/3932#issuecomment-714702328


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4639/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3917: [CARBONDATA-3978] Clean Files Refactor and support for trash folder in carbondata

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#issuecomment-714702324


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4637/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#issuecomment-714673040


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4645/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#issuecomment-714672363


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2889/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3994: [CARBONDATA-4040] Fix data mismatch incase of compaction failure and retry success

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3994:
URL: https://github.com/apache/carbondata/pull/3994#issuecomment-714671495


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4640/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3917: [CARBONDATA-3978] Clean Files Refactor and support for trash folder in carbondata

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#issuecomment-714670301


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2881/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3932: [CARBONDATA-3994] Skip Order by for map task if it is a first sort column and use limit pushdown for array_contains filter

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3932:
URL: https://github.com/apache/carbondata/pull/3932#issuecomment-714668792


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2882/
   







[GitHub] [carbondata] brijoobopanna commented on pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.

2020-10-22 Thread GitBox


brijoobopanna commented on pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#issuecomment-714660442


   retest this please
   







[GitHub] [carbondata] brijoobopanna commented on pull request #3987: [CARBONDATA-4039] Support Local dictionary for Presto complex datatypes

2020-10-22 Thread GitBox


brijoobopanna commented on pull request #3987:
URL: https://github.com/apache/carbondata/pull/3987#issuecomment-714658897


   retest this please
   







[jira] [Commented] (CARBONDATA-3917) The rows of data loading is not accurate, more rows has been loaded

2020-10-22 Thread Akshay (Jira)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-3917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17219223#comment-17219223
 ] 

Akshay commented on CARBONDATA-3917:


Please provide further details for this issue.

> The rows of data loading is not accurate, more rows has been loaded
> ---
>
> Key: CARBONDATA-3917
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3917
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 2.0.0
>Reporter: Taoli
>Priority: Blocker
>
> 2020-07-18 18:46:23,856 | INFO | [Executor task launch worker for task 28380] 
> | Total rows processed in step Data Writer: 1277745 | 
> org.apache.carbondata.processing.loading.AbstractDataLoadProcessorStep.close(AbstractDataLoadProcessorStep.java:138)
> 2020-07-18 18:46:23,857 | INFO | [Executor task launch worker for task 28380] 
> | Total rows processed in step Sort Processor: 1189959 | 
> org.apache.carbondata.processing.loading.AbstractDataLoadProcessorStep.close(AbstractDataLoadProcessorStep.java:138)
> 2020-07-18 18:46:23,856 | DEBUG | 
> [LocalFolderDeletionPool:detail_cdr_s1mme_18461_1595087183856] | 
> PrivilegedAction as:omm (auth:SIMPLE) 
> from:org.apache.carbondata.core.util.CarbonUtil.deleteFoldersAndFiles(CarbonUtil.java:298)
>  | 
> org.apache.hadoop.security.UserGroupInformation.logPrivilegedAction(UserGroupInformation.java:1756)
> 2020-07-18 18:46:23,857 | INFO | [Executor task launch worker for task 28380] 
> | Total rows processed in step Data Converter: 1189959 | 
> org.apache.carbondata.processing.loading.AbstractDataLoadProcessorStep.close(AbstractDataLoadProcessorStep.java:138)
> 2020-07-18 18:46:23,857 | INFO | [Executor task launch worker for task 28380] 
> | Total rows processed in step Input Processor: 1189959 | 
> org.apache.carbondata.processing.loading.AbstractDataLoadProcessorStep.close(AbstractDataLoadProcessorStep.java:138)
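
The mismatch in the quoted log can be checked mechanically. Below is a minimal sketch, assuming the log keeps the "Total rows processed in step <name>: <count>" wording shown above. RowCountCheck, parseStepCounts and surplus are hypothetical names used only for illustration; they are not part of CarbonData.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not a CarbonData class.
final class RowCountCheck {
  // Matches the per-step summary fragment seen in the quoted log.
  private static final Pattern STEP_ROWS =
      Pattern.compile("Total rows processed in step ([^:]+): (\\d+)");

  // Collects step name -> row count, in the order the steps appear in the log.
  static Map<String, Long> parseStepCounts(List<String> logLines) {
    Map<String, Long> counts = new LinkedHashMap<>();
    for (String line : logLines) {
      Matcher m = STEP_ROWS.matcher(line);
      if (m.find()) {
        counts.put(m.group(1).trim(), Long.parseLong(m.group(2)));
      }
    }
    return counts;
  }

  // Rows reported by step 'to' minus rows reported by step 'from'.
  static long surplus(Map<String, Long> counts, String from, String to) {
    return counts.get(to) - counts.get(from);
  }

  public static void main(String[] args) {
    // Fragments copied from the log quoted in the issue.
    List<String> log = Arrays.asList(
        "| Total rows processed in step Data Writer: 1277745 |",
        "| Total rows processed in step Sort Processor: 1189959 |",
        "| Total rows processed in step Data Converter: 1189959 |",
        "| Total rows processed in step Input Processor: 1189959 |");
    Map<String, Long> counts = parseStepCounts(log);
    System.out.println(
        "Data Writer surplus over Input Processor: "
            + surplus(counts, "Input Processor", "Data Writer"));
  }
}
```

For the four values in the quoted log, the Data Writer step reports 1277745 - 1189959 = 87786 more rows than the Input Processor, Data Converter and Sort Processor steps.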





[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3995: [WIP] Fix data load failure issue with legacy store

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3995:
URL: https://github.com/apache/carbondata/pull/3995#issuecomment-714647878


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4642/
   







[GitHub] [carbondata] Karan980 commented on pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK

2020-10-22 Thread GitBox


Karan980 commented on pull request #3970:
URL: https://github.com/apache/carbondata/pull/3970#issuecomment-714647261


   retest this please







[GitHub] [carbondata] Indhumathi27 opened a new pull request #3995: [WIP] Fix data load failure issue with legacy store

2020-10-22 Thread GitBox


Indhumathi27 opened a new pull request #3995:
URL: https://github.com/apache/carbondata/pull/3995


### Why is this PR needed?


### What changes were proposed in this PR?
   
   
### Does this PR introduce any user interface change?
- No
- Yes. (please explain the change and update document)
   
### Is any new testcase added?
- No
- Yes
   
   
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3932: [CARBONDATA-3994] Skip Order by for map task if it is a first sort column and use limit pushdown for array_contains filter

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3932:
URL: https://github.com/apache/carbondata/pull/3932#issuecomment-714617328


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4635/
   







[GitHub] [carbondata] ajantha-bhat commented on pull request #3994: [CARBONDATA-4040] Fix data mismatch incase of compaction failure and retry success

2020-10-22 Thread GitBox


ajantha-bhat commented on pull request #3994:
URL: https://github.com/apache/carbondata/pull/3994#issuecomment-714616941


   retest this please







[jira] [Reopened] (CARBONDATA-4039) Support Local dictionary for presto complex datatypes

2020-10-22 Thread Akshay (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akshay reopened CARBONDATA-4039:


> Support Local dictionary for presto complex datatypes
> -
>
> Key: CARBONDATA-4039
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4039
> Project: CarbonData
>  Issue Type: New Feature
>  Components: core, presto-integration
>Reporter: Akshay
>Priority: Major
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> Support Local dictionary for presto complex datatypes - 
> Presto complex datatypes - array and struct only.
> [https://github.com/apache/carbondata/pull/3987]





[jira] [Closed] (CARBONDATA-4004) Wrong result in Presto select query after executing update

2020-10-22 Thread Akshay (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akshay closed CARBONDATA-4004.
--

> Wrong result in Presto select query after executing update
> --
>
> Key: CARBONDATA-4004
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4004
> Project: CarbonData
>  Issue Type: Bug
>  Components: core, presto-integration
>Reporter: Akshay
>Priority: Major
> Fix For: 2.1.0
>
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> Presto select query after update operation returns different number of rows.





[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3970:
URL: https://github.com/apache/carbondata/pull/3970#issuecomment-714616484


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4638/
   







[jira] [Closed] (CARBONDATA-4039) Support Local dictionary for presto complex datatypes

2020-10-22 Thread Akshay (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akshay closed CARBONDATA-4039.
--
Resolution: Fixed

> Support Local dictionary for presto complex datatypes
> -
>
> Key: CARBONDATA-4039
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4039
> Project: CarbonData
>  Issue Type: New Feature
>  Components: core, presto-integration
>Reporter: Akshay
>Priority: Major
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> Support Local dictionary for presto complex datatypes - 
> Presto complex datatypes - array and struct only.
> [https://github.com/apache/carbondata/pull/3987]





[GitHub] [carbondata] Karan980 commented on pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK

2020-10-22 Thread GitBox


Karan980 commented on pull request #3970:
URL: https://github.com/apache/carbondata/pull/3970#issuecomment-714611999


   retest this please







[GitHub] [carbondata] ajantha-bhat commented on pull request #3932: [CARBONDATA-3994] Skip Order by for map task if it is a first sort column and use limit pushdown for array_contains filter

2020-10-22 Thread GitBox


ajantha-bhat commented on pull request #3932:
URL: https://github.com/apache/carbondata/pull/3932#issuecomment-714611460


   retest this please







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3935: [CARBONDATA-3993] Remove auto data deletion in IUD processs

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3935:
URL: https://github.com/apache/carbondata/pull/3935#issuecomment-714608884


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4634/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3984: [CARBONDATA-4035]Fix MV query issue with aggregation on decimal column

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3984:
URL: https://github.com/apache/carbondata/pull/3984#issuecomment-714585851


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4633/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3994: [CARBONDATA-4040] Fix data mismatch incase of compaction failure and retry success

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3994:
URL: https://github.com/apache/carbondata/pull/3994#issuecomment-714569205


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4632/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3935: [CARBONDATA-3993] Remove auto data deletion in IUD processs

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3935:
URL: https://github.com/apache/carbondata/pull/3935#issuecomment-714562618


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2878/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#issuecomment-714562870


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4636/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3932: [CARBONDATA-3994] Skip Order by for map task if it is a first sort column and use limit pushdown for array_contains filter

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3932:
URL: https://github.com/apache/carbondata/pull/3932#issuecomment-714560511


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2879/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3984: [CARBONDATA-4035]Fix MV query issue with aggregation on decimal column

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3984:
URL: https://github.com/apache/carbondata/pull/3984#issuecomment-714559131


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2877/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3994: [CARBONDATA-4040] Fix data mismatch incase of compaction failure and retry success

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3994:
URL: https://github.com/apache/carbondata/pull/3994#issuecomment-714556030


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2876/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3932: [CARBONDATA-3994] Skip Order by for map task if it is a first sort column and use limit pushdown for array_contains filter

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3932:
URL: https://github.com/apache/carbondata/pull/3932#issuecomment-714546867


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4630/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3986: [CARBONDATA-4034] Improve the time-consuming of Horizontal Compaction for update

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3986:
URL: https://github.com/apache/carbondata/pull/3986#issuecomment-714520638


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4629/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#issuecomment-714504266


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2880/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3988: [CARBONDATA-4037] Improve the table status and segment file writing

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3988:
URL: https://github.com/apache/carbondata/pull/3988#issuecomment-714500597


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2874/
   







[jira] [Updated] (CARBONDATA-4041) carbondata-processing's apache-spark versions and vulnerabilities

2020-10-22 Thread openlookeng (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

openlookeng updated CARBONDATA-4041:

Affects Version/s: 2.0.1

> carbondata-processing's apache-spark versions and vulnerabilities
> -
>
> Key: CARBONDATA-4041
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4041
> Project: CarbonData
>  Issue Type: Improvement
>  Components: other
>Affects Versions: 2.0.1
>Reporter: openlookeng
>Priority: Blocker
>
> carbondata-processing depends on the spark-unsafe 2.4.5 component, which has 
> the vulnerability *CVE-2020-9480*. Does the team plan to update it?





[jira] [Updated] (CARBONDATA-4041) carbondata-processing's apache-spark versions and vulnerabilities

2020-10-22 Thread openlookeng (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

openlookeng updated CARBONDATA-4041:

Component/s: other

> carbondata-processing's apache-spark versions and vulnerabilities
> -
>
> Key: CARBONDATA-4041
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4041
> Project: CarbonData
>  Issue Type: Improvement
>  Components: other
>Reporter: openlookeng
>Priority: Blocker
>
> carbondata-processing depends on the spark-unsafe 2.4.5 component, which has 
> the vulnerability *CVE-2020-9480*. Does the team plan to update it?





[jira] [Updated] (CARBONDATA-4041) carbondata-processing's apache-spark versions and vulnerabilities

2020-10-22 Thread openlookeng (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

openlookeng updated CARBONDATA-4041:

Description: carbondata-processing depends on the spark-unsafe 2.4.5 
component, which has the vulnerability *CVE-2020-9480*. Does the team plan to 
update it?  (was: the vulnerabilities is *CVE-2020-9480*)

> carbondata-processing's apache-spark versions and vulnerabilities
> -
>
> Key: CARBONDATA-4041
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4041
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: openlookeng
>Priority: Blocker
>
> carbondata-processing depends on the spark-unsafe 2.4.5 component, which has 
> the vulnerability *CVE-2020-9480*. Does the team plan to update it?





[jira] [Updated] (CARBONDATA-4041) carbondata-processing's apache-spark versions and vulnerabilities

2020-10-22 Thread openlookeng (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

openlookeng updated CARBONDATA-4041:

Summary: carbondata-processing's apache-spark versions and vulnerabilities  
(was: apache-spark versions and vulnerabilities)

> carbondata-processing's apache-spark versions and vulnerabilities
> -
>
> Key: CARBONDATA-4041
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4041
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: openlookeng
>Priority: Blocker
>
> the vulnerabilities is *CVE-2020-9480*





[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#issuecomment-714494286


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4631/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3988: [CARBONDATA-4037] Improve the table status and segment file writing

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3988:
URL: https://github.com/apache/carbondata/pull/3988#issuecomment-714492288


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4628/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3986: [CARBONDATA-4034] Improve the time-consuming of Horizontal Compaction for update

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3986:
URL: https://github.com/apache/carbondata/pull/3986#issuecomment-714490521


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2873/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3974: [Carbondata-3999] Fix permission issue of indexServerTmp directory

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3974:
URL: https://github.com/apache/carbondata/pull/3974#issuecomment-714489321


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2872/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3917: [CARBONDATA-3978] Clean Files Refactor and support for trash folder in carbondata

2020-10-22 Thread GitBox


CarbonDataQA1 commented on pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#issuecomment-714482013


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4627/
   







[jira] [Commented] (CARBONDATA-3953) Dead lock when doing dataframe persist and loading

2020-10-22 Thread Akshay (Jira)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-3953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17218983#comment-17218983
 ] 

Akshay commented on CARBONDATA-3953:


Please provide a proper test case that shows how the DataFrame is created.

> Dead lock when doing dataframe persist and loading
> --
>
> Key: CARBONDATA-3953
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3953
> Project: CarbonData
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: ChenKai
>Priority: Major
> Attachments: image-2020-08-18-15-59-46-108.png, 
> image-2020-08-18-16-03-33-370.png
>
>
> Thread-1
>  !image-2020-08-18-15-59-46-108.png! 
> Thread-2
>  !image-2020-08-18-16-03-33-370.png! 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [carbondata] akashrn5 commented on a change in pull request #3986: [CARBONDATA-4034] Improve the time-consuming of Horizontal Compaction for update

2020-10-22 Thread GitBox


akashrn5 commented on a change in pull request #3986:
URL: https://github.com/apache/carbondata/pull/3986#discussion_r510123816



##
File path: 
processing/src/main/java/org/apache/carbondata/processing/merger/CarbonDataMergerUtil.java
##
@@ -1138,73 +1126,36 @@ private static Boolean 
checkUpdateDeltaFilesInSeg(Segment seg,
   }
 
   /**
-   * Check is the segment passed qualifies for IUD delete delta compaction or 
not i.e.
-   * if the number of delete delta files present in the segment is more than
-   * numberDeltaFilesThreshold.
+   * Check whether the segment passed qualifies for IUD delete delta 
compaction or not,
+   * i.e., if the number of delete delta files present in the segment is more 
than
+   * numberDeltaFilesThreshold, this segment will be selected.
*
-   * @param seg
-   * @param segmentUpdateStatusManager
-   * @param numberDeltaFilesThreshold
-   * @return
+   * @param seg segment to be qualified
+   * @param segmentUpdateStatusManager segments & blocks details management
+   * @param numberDeltaFilesThreshold threshold of delete delta files
+   * @return block list of the segment
*/
-  private static boolean checkDeleteDeltaFilesInSeg(Segment seg,
+  private static List checkDeleteDeltaFilesInSeg(Segment seg,
   SegmentUpdateStatusManager segmentUpdateStatusManager, int 
numberDeltaFilesThreshold) {
 
+List blockLists = new ArrayList<>();
 Set uniqueBlocks = new HashSet();
 List blockNameList =
 segmentUpdateStatusManager.getBlockNameFromSegment(seg.getSegmentNo());
-
-for (final String blockName : blockNameList) {
-
-  CarbonFile[] deleteDeltaFiles =
+for (String blockName : blockNameList) {
+  List deleteDeltaFiles =
   segmentUpdateStatusManager.getDeleteDeltaFilesList(seg, blockName);
-  if (null != deleteDeltaFiles) {
-// The Delete Delta files may have Spill over blocks. Will consider 
multiple spill over
-// blocks as one. Currently DeleteDeltaFiles array contains Delete 
Delta Block name which
-// lies within Delete Delta Start TimeStamp and End TimeStamp. In 
order to eliminate
-// Spill Over Blocks will choose files with unique taskID.
-for (CarbonFile blocks : deleteDeltaFiles) {
-  // Get Task ID and the Timestamp from the Block name for e.g.
-  // part-0-3-1481084721319.carbondata => "3-1481084721319"
-  String task = 
CarbonTablePath.DataFileUtil.getTaskNo(blocks.getName());
-  String timestamp =
-  
CarbonTablePath.DataFileUtil.getTimeStampFromDeleteDeltaFile(blocks.getName());
-  String taskAndTimeStamp = task + "-" + timestamp;
+  if (null != deleteDeltaFiles && deleteDeltaFiles.size() > 
numberDeltaFilesThreshold) {
+for (String file : deleteDeltaFiles) {

Review comment:
   I commented earlier asking to rename this variable and to remove the spaces 
before and after the colon; please check.
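
For readers following the diff above, the selection rule described in the new Javadoc (a block qualifies when its delete delta file count exceeds the threshold) can be sketched in isolation. This is only an illustration: the class `DeleteDeltaThresholdSketch`, the method `blocksOverThreshold`, and the plain `Map` input are hypothetical stand-ins, not CarbonData APIs, and the sketch assumes the returned list holds block names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the check in the diff; no CarbonData classes are used.
final class DeleteDeltaThresholdSketch {

  /**
   * Returns the names of the blocks whose delete delta file count is strictly
   * greater than the threshold, mirroring the "size() > threshold" test in the diff.
   */
  static List<String> blocksOverThreshold(
      Map<String, List<String>> deltaFilesByBlock, int numberDeltaFilesThreshold) {
    List<String> qualifying = new ArrayList<>();
    for (Map.Entry<String, List<String>> entry : deltaFilesByBlock.entrySet()) {
      List<String> deltaFiles = entry.getValue();
      // Blocks without any delta file list are skipped, as in the null check of the diff.
      if (deltaFiles != null && deltaFiles.size() > numberDeltaFilesThreshold) {
        qualifying.add(entry.getKey());
      }
    }
    return qualifying;
  }

  public static void main(String[] args) {
    Map<String, List<String>> files = new LinkedHashMap<>();
    files.put("block-a", Arrays.asList("d1", "d2", "d3"));
    files.put("block-b", Arrays.asList("d1"));
    // With a threshold of 2, only block-a (3 delta files) qualifies.
    System.out.println(blocksOverThreshold(files, 2));
  }
}
```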









[GitHub] [carbondata] ajantha-bhat commented on pull request #3934: [WIP] Support Global Unique Id for SegmentNo

2020-10-22 Thread GitBox


ajantha-bhat commented on pull request #3934:
URL: https://github.com/apache/carbondata/pull/3934#issuecomment-714461537


   @QiangCai: I have fixed the list files issue in #3994, as it was urgent.
   
   @marchpure: If you can make this work with a UUID, we can still have this 
feature. If not, it is better to close this PR, since the actual issue is fixed.







[GitHub] [carbondata] QiangCai commented on pull request #3935: [CARBONDATA-3993] Remove auto data deletion in IUD processs

2020-10-22 Thread GitBox


QiangCai commented on pull request #3935:
URL: https://github.com/apache/carbondata/pull/3935#issuecomment-714451012


   retest this please







[GitHub] [carbondata] QiangCai commented on pull request #3981: [CARBONDATA-4031] Incorrect query result after Update/Delete and Inse…

2020-10-22 Thread GitBox


QiangCai commented on pull request #3981:
URL: https://github.com/apache/carbondata/pull/3981#issuecomment-714450676


   please do rebase







[GitHub] [carbondata] akashrn5 commented on a change in pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.

2020-10-22 Thread GitBox


akashrn5 commented on a change in pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#discussion_r510102591



##
File path: 
integration/hive/src/main/java/org/apache/carbondata/hive/MapredCarbonOutputFormat.java
##
@@ -92,6 +95,14 @@ public void checkOutputSpecs(FileSystem fileSystem, JobConf 
jobConf) throws IOEx
 }
 String tablePath = 
FileFactory.getCarbonFile(carbonLoadModel.getTablePath()).getAbsolutePath();
 TaskAttemptID taskAttemptID = 
TaskAttemptID.forName(jc.get("mapred.task.id"));
+// taskAttemptID will be null when the insert job is fired from presto. 
Presto send the JobConf
+// and since presto does not use the MR framework for execution, the 
mapred.task.id will be
+// null, so prepare a new ID.
+if (taskAttemptID == null) {
+  SimpleDateFormat formatter = new SimpleDateFormat("MMddHHmm");
+  String jobTrackerId = formatter.format(new Date());
+  taskAttemptID = new TaskAttemptID(jobTrackerId, 0, TaskType.MAP, 0, 0);

Review comment:
   > OK. If this task number is used in the file name, then in the case of a 
non-transactional concurrent write, two files can have the same file name, 
leading to many issues. So I suggested a UUID. You can check again.
   
   I set the taskID on the load model only if mapred.task.id is present and 
taskAttemptID is not null. If it is null, I don't set the taskID on the load 
model; when we call super.getRecordWriter, CarbonTableOutputFormat sets up the 
load model based on DEFAULT_TASK_NO. Please have a look; transactional tables 
also shouldn't be a problem.
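
The concern quoted above (two concurrent writers ending up with the same name) can be illustrated with a small standalone sketch. It is not how CarbonData derives task or file names; `FallbackTaskIdSketch`, `timestampId`, and `uuidId` are hypothetical helpers, and only the pattern string "MMddHHmm" is taken from the diff.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.UUID;

// Hypothetical illustration only; no Hadoop or CarbonData classes are used.
final class FallbackTaskIdSketch {

  // Minute-resolution id, using the same pattern string as the diff.
  static String timestampId(Date now) {
    return new SimpleDateFormat("MMddHHmm").format(now);
  }

  // Random id, in the spirit of the reviewer's UUID suggestion.
  static String uuidId() {
    return UUID.randomUUID().toString();
  }

  public static void main(String[] args) {
    Date now = new Date();
    // Two writers that start at the same instant derive the same timestamp id ...
    System.out.println(timestampId(now).equals(timestampId(now)));
    // ... whereas two random UUIDs are different for all practical purposes.
    System.out.println(uuidId().equals(uuidId()));
  }
}
```

Whether a collision can actually occur depends on whether this id ends up in the file name, which is exactly the point under discussion in the review thread.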









[jira] [Updated] (CARBONDATA-4041) apache-spark versions and vulnerabilities

2020-10-22 Thread openlookeng (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

openlookeng updated CARBONDATA-4041:

Priority: Blocker  (was: Major)

> apache-spark versions and vulnerabilities
> -
>
> Key: CARBONDATA-4041
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4041
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: openlookeng
>Priority: Blocker
>
> The vulnerability is *CVE-2020-9480*.





[jira] [Created] (CARBONDATA-4041) apache-spark versions and vulnerabilities

2020-10-22 Thread openlookeng (Jira)
openlookeng created CARBONDATA-4041:
---

 Summary: apache-spark versions and vulnerabilities
 Key: CARBONDATA-4041
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4041
 Project: CarbonData
  Issue Type: Improvement
Reporter: openlookeng


The vulnerability is *CVE-2020-9480*.





[GitHub] [carbondata] akashrn5 commented on a change in pull request #3984: [CARBONDATA-4035]Fix MV query issue with aggregation on decimal column

2020-10-22 Thread GitBox


akashrn5 commented on a change in pull request #3984:
URL: https://github.com/apache/carbondata/pull/3984#discussion_r510104944



##
File path: 
integration/spark/src/test/scala/org/apache/carbondata/view/rewrite/MVCreateTestCase.scala
##
@@ -1471,6 +1471,16 @@ class MVCreateTestCase extends QueryTest with 
BeforeAndAfterAll {
 sql("drop table if exists t2")
   }
 
+  test("test sum aggregations on decimal columns") {
+sql("drop table if exists sum_agg_decimal")

Review comment:
   done









[GitHub] [carbondata] akashrn5 commented on a change in pull request #3984: [CARBONDATA-4035]Fix MV query issue with aggregation on decimal column

2020-10-22 Thread GitBox


akashrn5 commented on a change in pull request #3984:
URL: https://github.com/apache/carbondata/pull/3984#discussion_r510104859



##
File path: 
integration/spark/src/test/scala/org/apache/carbondata/view/rewrite/MVCreateTestCase.scala
##
@@ -1471,6 +1471,16 @@ class MVCreateTestCase extends QueryTest with 
BeforeAndAfterAll {
 sql("drop table if exists t2")
   }
 
+  test("test sum aggregations on decimal columns") {
+sql("drop table if exists sum_agg_decimal")
+sql("create table sum_agg_decimal(salary1 decimal(7,2),salary2 
decimal(7,2),salary3 decimal(7,2),salary4 decimal(7,2),empname string) stored 
as carbondata")
+sql("drop materialized view if exists decimal_mv")
+sql("create materialized view decimal_mv as select empname, sum(salary1 - 
salary2) from sum_agg_decimal group by empname")
+sql("explain select empname, sum( salary1 - salary2) from sum_agg_decimal 
group by empname").show(false)

Review comment:
   done









[GitHub] [carbondata] ajantha-bhat commented on pull request #3994: [CARBONDATA-4040] Fix data mismatch incase of compaction failure and retry success

2020-10-22 Thread GitBox


ajantha-bhat commented on pull request #3994:
URL: https://github.com/apache/carbondata/pull/3994#issuecomment-714445721


   @QiangCai : please check this.







[GitHub] [carbondata] ajantha-bhat opened a new pull request #3994: [CARBONDATA-4040] Fix data mismatch incase of compaction failure and retry success

2020-10-22 Thread GitBox


ajantha-bhat opened a new pull request #3994:
URL: https://github.com/apache/carbondata/pull/3994


### Why is this PR needed?
For compaction, we don't register an in-progress segment, so compaction can 
fail when it is unable to get the table status lock. At that point, the partial 
compaction segment needs to be cleaned up. If cleaning up the partial segment 
fails, either because the lock cannot be acquired or because of IO issues, and 
the user retries the compaction, carbon uses the same segment id. So, while 
writing the segment file for the new compaction, list only the files mapping to 
the current compaction, not all the files, which include stale files.

### What changes were proposed in this PR?
   While writing the segment file, consider only the index files in the segment 
folder that belong to the current load.
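
A minimal sketch of the idea, for illustration only: it assumes index files carry the load timestamp in their name as `<task>_<loadTimestamp>.carbonindex`, which is a naming assumption made for this example, and `CurrentLoadIndexFilterSketch` and `indexFilesForLoad` are hypothetical names rather than CarbonData APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; the file naming scheme below is an assumption.
final class CurrentLoadIndexFilterSketch {

  /**
   * Keeps only the index files whose name ends with the given load timestamp,
   * so stale files left behind by a failed earlier attempt are ignored.
   */
  static List<String> indexFilesForLoad(List<String> segmentFolderFiles, String loadTimestamp) {
    String suffix = "_" + loadTimestamp + ".carbonindex";
    List<String> current = new ArrayList<>();
    for (String name : segmentFolderFiles) {
      if (name.endsWith(suffix)) {
        current.add(name);
      }
    }
    return current;
  }

  public static void main(String[] args) {
    List<String> folder = Arrays.asList(
        "0_100.carbonindex",   // stale: written by the failed compaction attempt
        "0_200.carbonindex");  // current: written by the retried compaction
    System.out.println(indexFilesForLoad(folder, "200"));
  }
}
```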
   
### Does this PR introduce any user interface change?
- No
   
### Is any new testcase added?
- No [As it happens randomly in a concurrent scenario, it was verified manually]
   
   
   
   







[jira] [Created] (CARBONDATA-4040) Data mismatch incase of compaction failure and retry success

2020-10-22 Thread Ajantha Bhat (Jira)
Ajantha Bhat created CARBONDATA-4040:


 Summary: Data mismatch incase of compaction failure and retry 
success
 Key: CARBONDATA-4040
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4040
 Project: CarbonData
  Issue Type: Bug
Reporter: Ajantha Bhat
Assignee: Ajantha Bhat


For compaction, we don't register an in-progress segment, so compaction can 
fail when it is unable to get the table status lock. At that point, the partial 
compaction segment needs to be cleaned up. If cleaning up the partial segment 
fails, either because the lock cannot be acquired or because of IO issues, and 
the user retries the compaction, carbon uses the same segment id. So, while 
writing the segment file for the new compaction, list only the files mapping to 
the current compaction, not all the files, which include stale files.





[GitHub] [carbondata] akashrn5 commented on a change in pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.

2020-10-22 Thread GitBox


akashrn5 commented on a change in pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#discussion_r510102591



##
File path: 
integration/hive/src/main/java/org/apache/carbondata/hive/MapredCarbonOutputFormat.java
##
@@ -92,6 +95,14 @@ public void checkOutputSpecs(FileSystem fileSystem, JobConf 
jobConf) throws IOEx
 }
 String tablePath = 
FileFactory.getCarbonFile(carbonLoadModel.getTablePath()).getAbsolutePath();
 TaskAttemptID taskAttemptID = 
TaskAttemptID.forName(jc.get("mapred.task.id"));
+// taskAttemptID will be null when the insert job is fired from presto. 
Presto send the JobConf
+// and since presto does not use the MR framework for execution, the 
mapred.task.id will be
+// null, so prepare a new ID.
+if (taskAttemptID == null) {
+  SimpleDateFormat formatter = new SimpleDateFormat("MMddHHmm");
+  String jobTrackerId = formatter.format(new Date());
+  taskAttemptID = new TaskAttemptID(jobTrackerId, 0, TaskType.MAP, 0, 0);

Review comment:
   > OK. If this task number is used in the file name, then in the case of a 
non-transactional concurrent write, two files can have the same file name, 
leading to many issues. So I suggested a UUID. You can check again.
   
   I set the taskID on the load model only if mapred.task.id is present and 
taskAttemptID is not null. If it is null, I don't set the taskID on the load 
model; when we call super.getRecordWriter, CarbonTableOutputFormat sets up the 
load model based on DEFAULT_TASK_NO. Please have a look.








