[GitHub] carbondata issue #1788: [CARBONDATA-1592] Added analysis exception to handle...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1788 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2826/ ---
[GitHub] carbondata pull request #1787: [CARBONDATA-2017] Fix input path checking whe...
Github user SangeetaGulia commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1787#discussion_r160882178

--- Diff: integration/spark-common/src/main/scala/org/apache/spark/util/FileUtils.scala ---
@@ -73,7 +73,8 @@ object FileUtils {
     val stringBuild = new StringBuilder()
     val filePaths = inputPath.split(",")
     for (i <- 0 until filePaths.size) {
-      val fileType = FileFactory.getFileType(filePaths(i))
+      val filePath = CarbonUtil.checkAndAppendHDFSUrl(filePaths(i))
--- End diff --

@jackylk I have verified this. It works fine with S3 as well. We will now be able to use the carbon property **carbon.ddl.base.hdfs.url** to provide the base URL for S3 too. ---
[GitHub] carbondata issue #71: [CARBONDATA-155] Code refactor to avoid the Type Casti...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/71 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1461/ ---
[GitHub] carbondata pull request #1792: [CARBONDATA-2018][DataLoad] Optimization in r...
GitHub user xuchuanyin opened a pull request: https://github.com/apache/carbondata/pull/1792

[CARBONDATA-2018][DataLoad] Optimization in reading/writing for sort temp row

Pack the no-sort fields in the row as a byte array during merge sort to save CPU consumption. I've tested it in my cluster and seen about an 8% performance gain (74MB/s/Node -> 81MB/s/Node) in data loading. Please note that global_sort will not benefit from this feature since there are no sort temp files in that procedure.

Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily:
- [x] Any interfaces changed? `Some internally used interfaces have been changed`
- [x] Any backward compatibility impacted? `No`
- [x] Document update required? `No`
- [x] Testing done
  - Whether new unit test cases have been added or why no new tests are required? `No`
  - How it is tested? Please attach test report. `Tested in a 3-node cluster with real business data`
  - Is it a performance related change? Please attach the performance test report. `Yes, I've tested it in my cluster and seen about an 8% performance gain (74MB/s/Node -> 81MB/s/Node) in data loading.`
  - Any additional information to help reviewers in testing this change. `No`
- [x] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. `Unrelated`

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xuchuanyin/carbondata opt_sort_temp_serializeation

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1792.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1792

commit 1cf4efbd5f3065cb996fa4d6a133df68f2cca585
Author: xuchuanyin
Date: 2018-01-10T12:39:02Z

    pack no sort fields

    pack the no-sort fields in the row as a byte array during merge sort to save CPU consumption ---
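The optimization above lends itself to a short illustration. Below is a minimal, self-contained sketch of the idea, not the PR's actual code: the fields that do not participate in sort comparisons are serialized once into a single opaque byte array, so the merge-sort phase never parses them field by field. `SortTempRowPacker`, `pack`, `sortColumnCount`, and the type-tag encoding are all hypothetical names chosen for this sketch.

```scala
import java.io.{ByteArrayOutputStream, DataOutputStream}

object SortTempRowPacker {

  // Pack all fields after the first `sortColumnCount` into one byte array.
  // During merge sort only the leading sort fields are compared; the blob
  // is carried along untouched and unpacked once at the end of the sort.
  def pack(row: Array[AnyRef], sortColumnCount: Int): Array[AnyRef] = {
    val buffer = new ByteArrayOutputStream()
    val out = new DataOutputStream(buffer)
    for (i <- sortColumnCount until row.length) {
      row(i) match {
        case v: java.lang.Integer => out.writeByte(0); out.writeInt(v)
        case v: java.lang.Long    => out.writeByte(1); out.writeLong(v)
        case v: Array[Byte]       => out.writeByte(2); out.writeInt(v.length); out.write(v)
        case null                 => out.writeByte(3)
      }
    }
    out.flush()
    // keep the sort fields as-is and append the packed blob as the last element
    row.take(sortColumnCount) :+ buffer.toByteArray
  }
}
```

The saving comes from writing and reading one blob per row in the sort temp files instead of deserializing every non-sort field on each merge pass. ---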
[GitHub] carbondata issue #71: [CARBONDATA-155] Code refactor to avoid the Type Casti...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/71 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2694/ ---
[GitHub] carbondata pull request #1791: [CARBONDATA-2010] Block streaming on main tab...
GitHub user QiangCai opened a pull request: https://github.com/apache/carbondata/pull/1791

[CARBONDATA-2010] Block streaming on main table of preaggregate datamap

If the table has a 'preaggregate' DataMap, it does not support streaming for now.

- [x] Any interfaces changed? no
- [x] Any backward compatibility impacted? no
- [x] Document update required? yes, I will add this limitation into the documentation
- [x] Testing done
  - Whether new unit test cases have been added or why no new tests are required? added new UT
  - How it is tested? Please attach test report. CI runs the UT
  - Is it a performance related change? Please attach the performance test report. no
  - Any additional information to help reviewers in testing this change. added code comments
- [x] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. small changes

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/QiangCai/carbondata agg_block_streaming

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1791.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1791

commit 6ccd1a060351ed7cbe1bc653623210c7e7e234f5
Author: QiangCai
Date: 2018-01-11T07:04:22Z

    block streaming on main table of preaggregate datamap ---
[GitHub] carbondata issue #1104: [CARBONDATA-1239] Add validation for set command par...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1104 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1460/ ---
[jira] [Created] (CARBONDATA-2018) Optimization in reading/writing for sort temp row during data loading
xuchuanyin created CARBONDATA-2018: -- Summary: Optimization in reading/writing for sort temp row during data loading Key: CARBONDATA-2018 URL: https://issues.apache.org/jira/browse/CARBONDATA-2018 Project: CarbonData Issue Type: Improvement Components: data-load Affects Versions: 1.3.0 Reporter: xuchuanyin Assignee: xuchuanyin Fix For: 1.3.0

# SCENARIO

Currently in carbondata data loading, during the sort process step, records are sorted partially and spilled to disk. Carbondata then reads these records back and does a merge sort. Since the sort step is CPU-intensive, we can optimize the serialization/deserialization of these rows while writing/reading them and reduce the CPU consumed in parsing the rows. This should enhance data loading performance.

# RESOLVE

We can pick up the un-sorted fields in the row, pack them as a byte array, and skip parsing them.

# RESULT

I've tested it in my cluster and seen about an 8% performance gain (74MB/s/Node -> 81MB/s/Node).

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] carbondata issue #1788: [CARBONDATA-1592] Added analysis exception to handle...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1788 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1457/ ---
[GitHub] carbondata issue #1788: [CARBONDATA-1592] Added analysis exception to handle...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1788 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2691/ ---
[GitHub] carbondata issue #1766: [WIP]enable hive metastore and test
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1766 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1459/ ---
[GitHub] carbondata issue #1766: [WIP]enable hive metastore and test
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1766 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2692/ ---
[GitHub] carbondata issue #1724: [CARBONDATA-1940][PreAgg] Fixed bug for creation of ...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1724 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2825/ ---
[GitHub] carbondata issue #1781: [CARBONDATA-2012] Add support to load pre-aggregate ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1781 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1456/ ---
[GitHub] carbondata issue #1770: [CARBONDATA-1994] Remove CarbonInputFormat
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1770 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1455/ ---
[GitHub] carbondata issue #1770: [CARBONDATA-1994] Remove CarbonInputFormat
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1770 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2690/ ---
[GitHub] carbondata issue #1781: [CARBONDATA-2012] Add support to load pre-aggregate ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1781 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2689/ ---
[jira] [Closed] (CARBONDATA-1758) Carbon1.3.0- No Inverted Index : Select column with is null for no_inverted_index column throws java.lang.ArrayIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/CARBONDATA-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat closed CARBONDATA-1758. --- Resolution: Fixed

Defect is closed as fixed. It's working fine in the latest Carbon 1.3.0 build.

> Carbon1.3.0- No Inverted Index : Select column with is null for no_inverted_index column throws java.lang.ArrayIndexOutOfBoundsException
>
> Key: CARBONDATA-1758
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1758
> Project: CarbonData
> Issue Type: Bug
> Components: data-query
> Affects Versions: 1.3.0
> Environment: 3 node cluster
> Reporter: Chetan Bhat
> Labels: Functional
>
> Steps :
> In Beeline the user executes the queries in sequence.
> CREATE TABLE uniqdata_DI_int (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 int) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES('DICTIONARY_INCLUDE'='cust_id','NO_INVERTED_INDEX'='cust_id');
> LOAD DATA INPATH 'hdfs://hacluster/chetan/3000_UniqData.csv' into table uniqdata_DI_int OPTIONS('DELIMITER'=',', 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
> Select count(CUST_ID) from uniqdata_DI_int;
> Select count(CUST_ID)*10 as multiple from uniqdata_DI_int;
> Select avg(CUST_ID) as average from uniqdata_DI_int;
> Select floor(CUST_ID) as average from uniqdata_DI_int;
> Select ceil(CUST_ID) as average from uniqdata_DI_int;
> Select ceiling(CUST_ID) as average from uniqdata_DI_int;
> Select CUST_ID*integer_column1 as multiple from uniqdata_DI_int;
> Select CUST_ID from uniqdata_DI_int where CUST_ID is null;
> *Issue : Select column with is null for no_inverted_index column throws java.lang.ArrayIndexOutOfBoundsException*
> 0: jdbc:hive2://10.18.98.34:23040> Select CUST_ID from uniqdata_DI_int where CUST_ID is null;
> Error: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 79.0 failed 4 times, most recent failure: Lost task 0.3 in stage 79.0 (TID 123, BLR114278, executor 18): org.apache.spark.util.TaskCompletionListenerException: java.util.concurrent.ExecutionException: java.lang.ArrayIndexOutOfBoundsException: 0
> at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:105)
> at org.apache.spark.scheduler.Task.run(Task.scala:112)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Driver stacktrace: (state=,code=0)
> Expected : Select column with is null for no_inverted_index column should be successful, displaying the correct result set.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] carbondata issue #1774: [CARBONDATA-2001] Unable to Save DataFrame As Carbon...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1774 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2824/ ---
[GitHub] carbondata issue #1774: [CARBONDATA-2001] Unable to Save DataFrame As Carbon...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1774 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1454/ ---
[GitHub] carbondata issue #1724: [CARBONDATA-1940][PreAgg] Fixed bug for creation of ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1724 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2688/ ---
[GitHub] carbondata issue #1724: [CARBONDATA-1940][PreAgg] Fixed bug for creation of ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1724 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1453/ ---
[GitHub] carbondata issue #1774: [CARBONDATA-2001] Unable to Save DataFrame As Carbon...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1774 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2687/ ---
[GitHub] carbondata issue #1781: [CARBONDATA-2012] Add support to load pre-aggregate ...
Github user kunal642 commented on the issue: https://github.com/apache/carbondata/pull/1781 retest this please ---
[jira] [Commented] (CARBONDATA-1758) Carbon1.3.0- No Inverted Index : Select column with is null for no_inverted_index column throws java.lang.ArrayIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/CARBONDATA-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16321705#comment-16321705 ] Akash R Nilugal commented on CARBONDATA-1758: -

I have also executed these; the queries are working fine.

> Carbon1.3.0- No Inverted Index : Select column with is null for no_inverted_index column throws java.lang.ArrayIndexOutOfBoundsException
>
> Key: CARBONDATA-1758
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1758
> Project: CarbonData
> Issue Type: Bug
> Components: data-query
> Affects Versions: 1.3.0
> Environment: 3 node cluster
> Reporter: Chetan Bhat
> Labels: Functional
>
> Steps :
> In Beeline the user executes the queries in sequence.
> CREATE TABLE uniqdata_DI_int (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 int) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES('DICTIONARY_INCLUDE'='cust_id','NO_INVERTED_INDEX'='cust_id');
> LOAD DATA INPATH 'hdfs://hacluster/chetan/3000_UniqData.csv' into table uniqdata_DI_int OPTIONS('DELIMITER'=',', 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
> Select count(CUST_ID) from uniqdata_DI_int;
> Select count(CUST_ID)*10 as multiple from uniqdata_DI_int;
> Select avg(CUST_ID) as average from uniqdata_DI_int;
> Select floor(CUST_ID) as average from uniqdata_DI_int;
> Select ceil(CUST_ID) as average from uniqdata_DI_int;
> Select ceiling(CUST_ID) as average from uniqdata_DI_int;
> Select CUST_ID*integer_column1 as multiple from uniqdata_DI_int;
> Select CUST_ID from uniqdata_DI_int where CUST_ID is null;
> *Issue : Select column with is null for no_inverted_index column throws java.lang.ArrayIndexOutOfBoundsException*
> 0: jdbc:hive2://10.18.98.34:23040> Select CUST_ID from uniqdata_DI_int where CUST_ID is null;
> Error: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 79.0 failed 4 times, most recent failure: Lost task 0.3 in stage 79.0 (TID 123, BLR114278, executor 18): org.apache.spark.util.TaskCompletionListenerException: java.util.concurrent.ExecutionException: java.lang.ArrayIndexOutOfBoundsException: 0
> at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:105)
> at org.apache.spark.scheduler.Task.run(Task.scala:112)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Driver stacktrace: (state=,code=0)
> Expected : Select column with is null for no_inverted_index column should be successful, displaying the correct result set.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] carbondata issue #1770: [CARBONDATA-1994] Remove CarbonInputFormat
Github user jackylk commented on the issue: https://github.com/apache/carbondata/pull/1770 retest this please ---
[GitHub] carbondata pull request #1774: [CARBONDATA-2001] Unable to Save DataFrame As...
Github user anubhav100 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1774#discussion_r160864606

--- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/CarbonDataFrameWriter.scala ---
@@ -167,13 +167,19 @@ class CarbonDataFrameWriter(sqlContext: SQLContext, val dataFrame: DataFrame) {
     val carbonSchema = schema.map { field =>
       s"${ field.name } ${ convertToCarbonType(field.dataType) }"
     }
+    val isStreaming = if (options.isStreaming) Some("true") else None
+
     val property = Map(
       "SORT_COLUMNS" -> options.sortColumns,
       "DICTIONARY_INCLUDE" -> options.dictionaryInclude,
       "DICTIONARY_EXCLUDE" -> options.dictionaryExclude,
-      "TABLE_BLOCKSIZE" -> options.tableBlockSize
-    ).filter(_._2.isDefined).map(p => s"'${p._1}' = '${p._2.get}'").mkString(",")
+      "TABLE_BLOCKSIZE" -> options.tableBlockSize,
+      "STREAMING" -> isStreaming
+    ).filter(_._2.isDefined).
--- End diff --

done ---
[GitHub] carbondata pull request #1774: [CARBONDATA-2001] Unable to Save DataFrame As...
Github user anubhav100 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1774#discussion_r160864615

--- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/CarbonDataFrameWriter.scala ---
@@ -167,13 +167,19 @@ class CarbonDataFrameWriter(sqlContext: SQLContext, val dataFrame: DataFrame) {
     val carbonSchema = schema.map { field =>
       s"${ field.name } ${ convertToCarbonType(field.dataType) }"
     }
+    val isStreaming = if (options.isStreaming) Some("true") else None
+
     val property = Map(
       "SORT_COLUMNS" -> options.sortColumns,
       "DICTIONARY_INCLUDE" -> options.dictionaryInclude,
       "DICTIONARY_EXCLUDE" -> options.dictionaryExclude,
-      "TABLE_BLOCKSIZE" -> options.tableBlockSize
-    ).filter(_._2.isDefined).map(p => s"'${p._1}' = '${p._2.get}'").mkString(",")
+      "TABLE_BLOCKSIZE" -> options.tableBlockSize,
+      "STREAMING" -> isStreaming
+    ).filter(_._2.isDefined).
--- End diff --

@jackylk The reason I did not use options.isStreaming directly is that it returns a Boolean, so I converted the Boolean to an Option. ---
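For readers following this thread, the pattern under discussion is building the TBLPROPERTIES string from a Map of Option values, so that undefined options simply drop out; a Boolean has to be lifted into an Option to fit that pipeline. A self-contained sketch of the same pattern; the option values here are illustrative, not taken from the PR:

```scala
// Options that were never set stay None and are filtered out below.
val sortColumns: Option[String] = Some("c1,c2")
val tableBlockSize: Option[String] = None          // undefined -> dropped
val isStreamingFlag: Boolean = true
// Lift the Boolean into an Option so it composes with the other options.
val isStreaming: Option[String] = if (isStreamingFlag) Some("true") else None

val property = Map(
  "SORT_COLUMNS" -> sortColumns,
  "TABLE_BLOCKSIZE" -> tableBlockSize,
  "STREAMING" -> isStreaming
).filter(_._2.isDefined)
  .map(p => s"'${p._1}' = '${p._2.get}'")
  .mkString(",")

// property == "'SORT_COLUMNS' = 'c1,c2','STREAMING' = 'true'"
```

---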
[GitHub] carbondata issue #1724: [CARBONDATA-1940][PreAgg] Fixed bug for creation of ...
Github user kunal642 commented on the issue: https://github.com/apache/carbondata/pull/1724 @geetikagupta16 Can you squash the commits? ---
[GitHub] carbondata issue #1770: [CARBONDATA-1994] Remove CarbonInputFormat
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1770 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1452/ ---
[GitHub] carbondata issue #1770: [CARBONDATA-1994] Remove CarbonInputFormat
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1770 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2686/ ---
[GitHub] carbondata issue #1770: [CARBONDATA-1994] Remove CarbonInputFormat
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1770 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2823/ ---
[GitHub] carbondata issue #1790: [CARBONDATA-2009][Documentation] Document Refresh co...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1790 Can one of the admins verify this patch? ---
[GitHub] carbondata issue #1790: [CARBONDATA-2009][Documentation] Document Refresh co...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1790 Can one of the admins verify this patch? ---
[GitHub] carbondata issue #1790: [CARBONDATA-2009][Documentation] Document Refresh co...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1790 Can one of the admins verify this patch? ---
[GitHub] carbondata pull request #1790: [CARBONDATA-2009][Documentation] Document Ref...
GitHub user arshadmohammad opened a pull request: https://github.com/apache/carbondata/pull/1790

[CARBONDATA-2009][Documentation] Document Refresh command constraint

Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily:
- [ ] Any interfaces changed? NO
- [ ] Any backward compatibility impacted? NO
- [ ] Document update required? YES
- [ ] Testing done
  - Whether new unit test cases have been added or why no new tests are required?
  - How it is tested? Please attach test report.
  - Is it a performance related change? Please attach the performance test report.
  - Any additional information to help reviewers in testing this change. NA (this PR contains only documentation changes)
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. NA

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/arshadmohammad/carbondata CARBONDATA-2009

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1790.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1790

commit d53fee295ef8860f8c12a7d38c653015f7809f05
Author: Mohammad Arshad
Date: 2018-01-11T01:40:12Z

    [CARBONDATA-2009][Documentation] Document Refresh command constraint ---
[GitHub] carbondata pull request #1775: [CARBONDATA-1993] Removed unused carbon prope...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/1775 ---
[GitHub] carbondata issue #1775: [CARBONDATA-1993] Removed unused carbon properties h...
Github user jackylk commented on the issue: https://github.com/apache/carbondata/pull/1775 LGTM ---
[GitHub] carbondata pull request #1787: [CARBONDATA-2017] Fix input path checking whe...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1787#discussion_r160845775

--- Diff: integration/spark-common/src/main/scala/org/apache/spark/util/FileUtils.scala ---
@@ -73,7 +73,8 @@ object FileUtils {
     val stringBuild = new StringBuilder()
     val filePaths = inputPath.split(",")
     for (i <- 0 until filePaths.size) {
-      val fileType = FileFactory.getFileType(filePaths(i))
+      val filePath = CarbonUtil.checkAndAppendHDFSUrl(filePaths(i))
--- End diff --

This is only for HDFS, right? How about support for other storage systems like S3? @SangeetaGulia Can you have a look at this? I think this may impact #1584, which you are working on. ---
[GitHub] carbondata issue #1781: [CARBONDATA-2012] Add support to load pre-aggregate ...
Github user jackylk commented on the issue: https://github.com/apache/carbondata/pull/1781 Can you add more description? Why is the current loading flow not transactional? ---
[GitHub] carbondata pull request #1774: [CARBONDATA-2001] Unable to Save DataFrame As...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1774#discussion_r160845350

--- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/CarbonDataFrameWriter.scala ---
@@ -167,13 +167,19 @@ class CarbonDataFrameWriter(sqlContext: SQLContext, val dataFrame: DataFrame) {
     val carbonSchema = schema.map { field =>
       s"${ field.name } ${ convertToCarbonType(field.dataType) }"
     }
+    val isStreaming = if (options.isStreaming) Some("true") else None
+
     val property = Map(
       "SORT_COLUMNS" -> options.sortColumns,
       "DICTIONARY_INCLUDE" -> options.dictionaryInclude,
       "DICTIONARY_EXCLUDE" -> options.dictionaryExclude,
-      "TABLE_BLOCKSIZE" -> options.tableBlockSize
-    ).filter(_._2.isDefined).map(p => s"'${p._1}' = '${p._2.get}'").mkString(",")
+      "TABLE_BLOCKSIZE" -> options.tableBlockSize,
+      "STREAMING" -> isStreaming
--- End diff --

why not use `options.isStreaming` directly? ---
[GitHub] carbondata pull request #1774: [CARBONDATA-2001] Unable to Save DataFrame As...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1774#discussion_r160845207

--- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/CarbonDataFrameWriter.scala ---
@@ -167,13 +167,19 @@ class CarbonDataFrameWriter(sqlContext: SQLContext, val dataFrame: DataFrame) {
     val carbonSchema = schema.map { field =>
       s"${ field.name } ${ convertToCarbonType(field.dataType) }"
     }
+    val isStreaming = if (options.isStreaming) Some("true") else None
+
     val property = Map(
       "SORT_COLUMNS" -> options.sortColumns,
       "DICTIONARY_INCLUDE" -> options.dictionaryInclude,
       "DICTIONARY_EXCLUDE" -> options.dictionaryExclude,
-      "TABLE_BLOCKSIZE" -> options.tableBlockSize
-    ).filter(_._2.isDefined).map(p => s"'${p._1}' = '${p._2.get}'").mkString(",")
+      "TABLE_BLOCKSIZE" -> options.tableBlockSize,
+      "STREAMING" -> isStreaming
+    ).filter(_._2.isDefined).
--- End diff --

move `.` to the next line ---
[jira] [Resolved] (CARBONDATA-2011) CarbonStreamingQueryListener throwing ClassCastException
[ https://issues.apache.org/jira/browse/CARBONDATA-2011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li resolved CARBONDATA-2011. -- Resolution: Fixed Fix Version/s: 1.3.0

> CarbonStreamingQueryListener throwing ClassCastException
>
> Key: CARBONDATA-2011
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2011
> Project: CarbonData
> Issue Type: Bug
> Reporter: QiangCai
> Assignee: QiangCai
> Fix For: 1.3.0
>
> Time Spent: 1h 50m
> Remaining Estimate: 0h
>
> java.lang.ClassCastException: org.apache.spark.sql.execution.streaming.StreamingQueryWrapper cannot be cast to org.apache.spark.sql.execution.streaming.StreamExecution

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
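For context: this ClassCastException typically arises when a StreamingQuery handle is cast straight to StreamExecution, since in Spark 2.x the handle handed to user code is a StreamingQueryWrapper around the StreamExecution. The sketch below shows the usual unwrapping pattern; it is not necessarily the fix that went into #1779, and the `streamingQuery` accessor on the wrapper is an assumption about Spark's internal API that should be verified against your Spark version.

```scala
import org.apache.spark.sql.streaming.StreamingQuery
import org.apache.spark.sql.execution.streaming.{StreamExecution, StreamingQueryWrapper}

// Unwrap defensively instead of casting straight to StreamExecution.
// `wrapper.streamingQuery` is assumed to expose the wrapped StreamExecution
// (Spark internal API, subject to change between versions).
def toStreamExecution(query: StreamingQuery): StreamExecution = query match {
  case wrapper: StreamingQueryWrapper => wrapper.streamingQuery
  case exec: StreamExecution          => exec
  case other =>
    throw new IllegalArgumentException(s"Unexpected query type: ${other.getClass}")
}
```

---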
[GitHub] carbondata pull request #1779: [CARBONDATA-2011] Fix ClassCastException in C...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/1779 ---
[GitHub] carbondata issue #1779: [CARBONDATA-2011] Fix ClassCastException in CarbonSt...
Github user jackylk commented on the issue: https://github.com/apache/carbondata/pull/1779 LGTM ---
[GitHub] carbondata issue #1104: [CARBONDATA-1239] Add validation for set command par...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1104 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2822/ ---
[GitHub] carbondata issue #1789: [WIP] Fix avoid reading of all block information in ...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1789 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2821/ ---
[GitHub] carbondata issue #1789: [WIP] Fix avoid reading of all block information in ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1789 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2684/ ---
[GitHub] carbondata issue #1789: [WIP] Fix avoid reading of all block information in ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1789 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1451/ ---
[GitHub] carbondata issue #1104: [CARBONDATA-1239] Add validation for set command par...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1104 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2685/ ---
[GitHub] carbondata issue #1104: [CARBONDATA-1239] Add validation for set command par...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1104 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1450/ ---
[GitHub] carbondata issue #1782: [WIP] Changes for creating carbon index merge file f...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1782 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2820/ ---
[GitHub] carbondata issue #1782: [WIP] Changes for creating carbon index merge file f...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1782 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2683/ ---
[GitHub] carbondata issue #1782: [WIP] Changes for creating carbon index merge file f...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1782 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1449/ ---
[GitHub] carbondata issue #1788: [WIP][CARBONDATA-1592] Added analysis exception to h...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1788 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2819/ ---
[GitHub] carbondata pull request #1789: [WIP] Fix avoid reading of all block informat...
GitHub user ravipesala opened a pull request: https://github.com/apache/carbondata/pull/1789

[WIP] Fix avoid reading of all block information in driver for old stores.

Problem: For old stores prior to the 1.2 version there is no blocklet information stored in the carbonindex file, so the new code needs to read all carbondata file footers inside the driver to get the blocklet information. That makes first-time queries slower. As observed, a count(*) query was taking 2 seconds on the old version, and after the upgrade it takes a very long time.

Solution: If no blocklet information is available in the carbonindex file, then don't read the carbondata file footers on the driver side. Instead, read the carbondata files in the executors to get the blocklet information.

Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily:
- [X] Any interfaces changed?
- [X] Any backward compatibility impacted?
- [X] Document update required?
- [X] Testing done
- [X] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ravipesala/incubator-carbondata datamap-pld-store

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1789.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1789

commit f0aaf4d8d0e4761227bc1adff29d33798c88bd12
Author: ravipesala
Date: 2018-01-10T15:35:48Z

    Fix avoid reading of all block information in driver for old stores. ---
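The driver/executor split described above is a general Spark pattern, sketched minimally below. This is not CarbonData's actual implementation: `BlockletInfo` and `readFooter` are hypothetical stand-ins for the footer-reading logic; only the idea of distributing the file paths and reading footers inside the executors is the point.

```scala
import org.apache.spark.sql.SparkSession

object FooterReadSketch {

  // Hypothetical stand-in for the blocklet detail recovered from a footer.
  case class BlockletInfo(file: String, numBlocklets: Int)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("footer-read-sketch").getOrCreate()

    // Stand-in for parsing a carbondata file footer.
    def readFooter(path: String): BlockletInfo = BlockletInfo(path, numBlocklets = 1)

    val dataFilePaths = Seq("/store/part-0.carbondata", "/store/part-1.carbondata")

    // The driver only distributes file paths; the (potentially expensive)
    // footer reads run in parallel on the executors instead of serially
    // in the driver.
    val blockletInfos = spark.sparkContext
      .parallelize(dataFilePaths, numSlices = dataFilePaths.size)
      .map(path => readFooter(path))
      .collect()

    blockletInfos.foreach(println)
    spark.stop()
  }
}
```

---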
[GitHub] carbondata issue #1788: [WIP][CARBONDATA-1592] Added analysis exception to h...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1788 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1448/ ---
[GitHub] carbondata issue #1788: [WIP][CARBONDATA-1592] Added analysis exception to h...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1788 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2682/ ---
[GitHub] carbondata issue #1787: [CARBONDATA-2017] Fix input path checking when loadi...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1787 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2818/ ---
[GitHub] carbondata issue #1787: [CARBONDATA-2017] Fix input path checking when loadi...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1787 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1447/ ---
[GitHub] carbondata issue #1787: [CARBONDATA-2017] Fix input path checking when loadi...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1787 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2681/ ---
[GitHub] carbondata pull request #1788: [WIP][CARBONDATA-1592] Added analysis excepti...
GitHub user ManoharVanam opened a pull request: https://github.com/apache/carbondata/pull/1788

[WIP][CARBONDATA-1592] Added analysis exception to handle event exceptions

Description: Added an analysis exception case to handle event listener exceptions.

Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily:
- [ ] Any interfaces changed?
- [ ] Any backward compatibility impacted?
- [ ] Document update required?
- [ ] Testing done
  - Whether new unit test cases have been added or why no new tests are required?
  - How it is tested? Please attach test report.
  - Is it a performance related change? Please attach the performance test report.
  - Any additional information to help reviewers in testing this change.
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ManoharVanam/incubator-carbondata defect6

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1788.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1788

commit 4c05b66100784bb5aab9d5209305e37f0753316b
Author: Manohar
Date: 2018-01-10T14:37:33Z

    Added analysis exception to handle event exceptions ---
[GitHub] carbondata issue #1783: [CARBONDATA-2013] executing alter query results that...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1783 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2817/ ---
[GitHub] carbondata pull request #1787: [CARBONDATA-2017] Fix input path checking whe...
GitHub user kevinjmh opened a pull request: https://github.com/apache/carbondata/pull/1787

[CARBONDATA-2017] Fix input path checking when loading data from multiple paths

Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily:
- [ ] Any interfaces changed?
- [ ] Any backward compatibility impacted?
- [ ] Document update required?
- [ ] Testing done
  - Whether new unit test cases have been added or why no new tests are required?
  - How it is tested? Please attach test report.
  - Is it a performance related change? Please attach the performance test report.
  - Any additional information to help reviewers in testing this change.
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kevinjmh/carbondata load_multi_path

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1787.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1787 ---
[jira] [Created] (CARBONDATA-2017) Error occurs when loading multiple files
jiangmanhua created CARBONDATA-2017: --- Summary: Error occurs when loading multiple files Key: CARBONDATA-2017 URL: https://issues.apache.org/jira/browse/CARBONDATA-2017 Project: CarbonData Issue Type: Bug Reporter: jiangmanhua Priority: Minor

Problem: Carbon supports loading from multiple file paths at once, but we find that Carbon will throw an exception like "The input file does not exist" when loading multiple files on HDFS. For example:

ex1: LOAD DATA INPATH '/data/source.csv,/data/source2.csv' INTO TABLE test_table
ex2: LOAD DATA INPATH 'hdfs://ha/data/source.csv,hdfs://ha/data/source2.csv' INTO TABLE test_table

ex1 will throw an exception saying that source2.csv does not exist. ex2 will execute normally.

Solution: We found that Carbon takes the PATH as a whole and checks its prefix before splitting it into multiple paths. The problem is solved by doing the prefix-checking job for each path after splitting PATH into multiple paths.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
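A minimal sketch of the fix described above: prefix-check each path after splitting, not the comma-joined string as a whole. `appendBaseUrlIfNeeded` is a hypothetical helper standing in for CarbonUtil.checkAndAppendHDFSUrl (which consults the carbon.ddl.base.hdfs.url property); the prefix test here is simplified for illustration.

```scala
object InputPathCheck {

  // Hypothetical stand-in for CarbonUtil.checkAndAppendHDFSUrl: prepend the
  // configured base URL only when the path is not already fully qualified.
  def appendBaseUrlIfNeeded(path: String, baseUrl: String): String =
    if (path.contains("://")) path // already a fully qualified URL
    else baseUrl + path

  // Split first, then prefix-check each path individually (the bug was
  // checking the prefix of the whole comma-joined string).
  def resolveInputPaths(inputPath: String, baseUrl: String): Seq[String] =
    inputPath.split(",").map(_.trim).map(appendBaseUrlIfNeeded(_, baseUrl)).toSeq
}

// InputPathCheck.resolveInputPaths("/data/source.csv,/data/source2.csv", "hdfs://ha")
// => Seq("hdfs://ha/data/source.csv", "hdfs://ha/data/source2.csv")
```

---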
[GitHub] carbondata issue #1783: [CARBONDATA-2013] executing alter query results that...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1783 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2680/ ---
[GitHub] carbondata issue #1783: [CARBONDATA-2013] executing alter query results that...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1783 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1446/ ---
[GitHub] carbondata issue #1786: [CARBONDATA-1988] Fixed bug to remove empty partitio...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1786 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2816/ ---
[jira] [Updated] (CARBONDATA-2009) REFRESH TABLE Limitation When HiveMetaStore is used
[ https://issues.apache.org/jira/browse/CARBONDATA-2009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Shahid Khan updated CARBONDATA-2009: - Description: Refresh table command will not register the carbon table if the old table is stored in the CarbonHiveMetastore (was: Refresh table when spark.carbon.hive.schema.store is set to true, i.e. when the hive metastore is used.)

> REFRESH TABLE Limitation When HiveMetaStore is used
>
> Key: CARBONDATA-2009
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2009
> Project: CarbonData
> Issue Type: Sub-task
> Reporter: Mohammad Shahid Khan
> Assignee: Mohammad Arshad
> Priority: Minor
> Fix For: 1.3.0
>
> Refresh table command will not register the carbon table if the old table is stored in the CarbonHiveMetastore

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] carbondata issue #1786: [CARBONDATA-1988] Fixed bug to remove empty partitio...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1786 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2815/ ---
[jira] [Assigned] (CARBONDATA-2016) Exception displays while implementing compaction with alter query
[ https://issues.apache.org/jira/browse/CARBONDATA-2016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] anubhav tarar reassigned CARBONDATA-2016: - Assignee: anubhav tarar

> Exception displays while implementing compaction with alter query
>
> Key: CARBONDATA-2016
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2016
> Project: CarbonData
> Issue Type: Bug
> Components: data-load
> Affects Versions: 1.3.0
> Environment: spark 2.1
> Reporter: Vandana Yadav
> Assignee: anubhav tarar
> Priority: Minor
>
> Exception displays while implementing compaction with alter query.
> Steps to reproduce:
> 1) Create a table:
> CREATE TABLE CUSTOMER1 ( C_CUSTKEY INT , C_NAME STRING , C_ADDRESS STRING , C_NATIONKEY INT , C_PHONE STRING , C_ACCTBAL DECIMAL(15,2) , C_MKTSEGMENT STRING , C_COMMENT STRING) stored by 'carbondata';
> 2) Insert data into the table:
> a) insert into customer1 values(1,'vandana','noida',1,'123456789',45987.78,'hello','comment')
> b) insert into customer1 values(2,'vandana','noida',2,'123456789',487.78,'hello','comment')
> c) insert into customer1 values(3,'geetika','delhi',3,'123456789',487897.78,'hello','comment')
> d) insert into customer1 values(4,'sangeeta','delhi',3,'123456789',48789.78,'hello','comment')
> 3) Perform alter table query:
> alter table customer1 add columns (intfield int) TBLPROPERTIES ('DEFAULT.VALUE.intfield'='10');
> 4) show segments for displaying segments before compaction
> show segments for table customer1;
> output:
> +--------------------+----------+---------------------------+---------------------------+------------+--------------+
> | SegmentSequenceId  | Status   | Load Start Time           | Load End Time             | Merged To  | File Format  |
> +--------------------+----------+---------------------------+---------------------------+------------+--------------+
> | 3                  | Success  | 2018-01-10 16:16:53.611   | 2018-01-10 16:16:54.99    | NA         | COLUMNAR_V3  |
> | 2                  | Success  | 2018-01-10 16:16:46.878   | 2018-01-10 16:16:47.75    | NA         | COLUMNAR_V3  |
> | 1                  | Success  | 2018-01-10 16:16:38.096   | 2018-01-10 16:16:38.972   | NA         | COLUMNAR_V3  |
> | 0                  | Success  | 2018-01-10 16:16:31.979   | 2018-01-10 16:16:33.293   | NA         | COLUMNAR_V3  |
> +--------------------+----------+---------------------------+---------------------------+------------+--------------+
> 4 rows selected (0.029 seconds)
> 5) alter table query for compaction:
> alter table customer1 compact 'minor';
> Expected Result: Table should be compacted successfully.
> Actual Result:
> Error: org.apache.spark.sql.AnalysisException: Compaction failed. Please check logs for more info. Exception in compaction Compaction Failure in Merger Rdd.; (state=,code=0)
> thriftserver logs:
> 18/01/10 16:17:12 ERROR CompactionResultSortProcessor: [Executor task launch worker-36][partitionID:customer1;queryID:15798380253871] Compaction failed: java.lang.Long cannot be cast to java.lang.Integer
> java.lang.ClassCastException: java.lang.Long cannot be cast to java.lang.Integer
> at org.apache.carbondata.processing.sort.sortdata.SortDataRows.writeDataToFile(SortDataRows.java:273)
> at org.apache.carbondata.processing.sort.sortdata.SortDataRows.startSorting(SortDataRows.java:214)
> at org.apache.carbondata.processing.merger.CompactionResultSortProcessor.processResult(CompactionResultSortProcessor.java:226)
> at org.apache.carbondata.processing.merger.CompactionResultSortProcessor.execute(CompactionResultSortProcessor.java:159)
> at org.apache.carbondata.spark.rdd.CarbonMergerRDD$$anon$1.<init>(CarbonMergerRDD.scala:234)
> at org.apache.carbondata.spark.rdd.CarbonMergerRDD.internalCompute(CarbonMergerRDD.scala:81)
> at org.apache.carbondata.spark.rdd.CarbonRDD.compute(CarbonRDD.scala:60)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
> at org.apache.spark.scheduler.Task.run(Task.scala:99)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> 18/01/10 16:17:12 INFO UnsafeMemoryManager: [Executor task launch worker-36][partitionID:customer1;queryID:15798380253871] Total memory used after task 15798371335347 is 5313 Current tasks running now are :
[GitHub] carbondata issue #1785: [CARBONDATA-2015] Restricted maximum length of bytes...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1785 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2814/ ---
[GitHub] carbondata issue #1786: [CARBONDATA-1988] Fixed bug to remove empty partitio...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1786 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2679/ ---
[GitHub] carbondata issue #1786: [CARBONDATA-1988] Fixed bug to remove empty partitio...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1786 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1445/ ---
[jira] [Resolved] (CARBONDATA-1957) create datamap query fails on table having dictionary_include
[ https://issues.apache.org/jira/browse/CARBONDATA-1957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Geetika Gupta resolved CARBONDATA-1957. --- Resolution: Fixed

> create datamap query fails on table having dictionary_include
>
> Key: CARBONDATA-1957
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1957
> Project: CarbonData
> Issue Type: Bug
> Components: data-query
> Affects Versions: 1.3.0
> Environment: spark2.1
> Reporter: Geetika Gupta
> Fix For: 1.3.0
>
> Attachments: 2000_UniqData.csv
>
> I created a datamap using the following command:
> create datamap uniqdata_agg on table uniqdata using 'preaggregate' as select cust_id, cust_name,avg(decimal_column1) from uniqdata group by cust_id,cust_name;
> It throws the following error:
> Error: java.lang.Exception: DataLoad failure: (state=,code=0)
> Steps to reproduce:
> CREATE TABLE uniqdata(CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 int) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES('DICTIONARY_INCLUDE'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1')
> Load command:
> LOAD DATA INPATH 'HDFS_URL/BabuStore/Data/uniqdata/2000_UniqData.csv' into table uniqdata OPTIONS('DELIMITER'=',', 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1')
> Create datamap command:
> create datamap uniqdata_agg on table uniqdata using 'preaggregate' as select cust_id, cust_name,avg(decimal_column1) from uniqdata group by cust_id,cust_name;
> The above command throws the following exception:
> Error: java.lang.Exception: DataLoad failure: (state=,code=0)
> Here are the logs:
> 18/01/02 11:46:58 ERROR ParallelReadMergeSorterImpl: SafeParallelSorterPool:uniqdata_uniqdata_agg
> java.lang.IllegalArgumentException: requirement failed: Decimal precision 2922 exceeds max precision 38
> at scala.Predef$.require(Predef.scala:224)
> at org.apache.spark.sql.types.Decimal.set(Decimal.scala:113)
> at org.apache.spark.sql.types.Decimal$.apply(Decimal.scala:426)
> at org.apache.spark.sql.types.Decimal.apply(Decimal.scala)
> at org.apache.spark.sql.catalyst.expressions.UnsafeRow.getDecimal(UnsafeRow.java:409)
> at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificSafeProjection.apply_0$(Unknown Source)
> at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificSafeProjection.apply(Unknown Source)
> at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
> at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
> at org.apache.carbondata.spark.rdd.LazyRddIterator.next(NewCarbonDataLoadRDD.scala:514)
> at org.apache.carbondata.spark.rdd.LazyRddIterator.next(NewCarbonDataLoadRDD.scala:477)
> at org.apache.carbondata.processing.loading.steps.InputProcessorStepImpl$InputProcessorIterator.getBatch(InputProcessorStepImpl.java:239)
> at org.apache.carbondata.processing.loading.steps.InputProcessorStepImpl$InputProcessorIterator.next(InputProcessorStepImpl.java:200)
> at org.apache.carbondata.processing.loading.steps.InputProcessorStepImpl$InputProcessorIterator.next(InputProcessorStepImpl.java:129)
> at org.apache.carbondata.processing.loading.steps.DataConverterProcessorStepImpl$1.next(DataConverterProcessorStepImpl.java:97)
> at org.apache.carbondata.processing.loading.steps.DataConverterProcessorStepImpl$1.next(DataConverterProcessorStepImpl.java:83)
> at org.apache.carbondata.processing.loading.sort.impl.ParallelReadMergeSorterImpl$SortIteratorThread.run(ParallelReadMergeSorterImpl.java:218)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> 18/01/02 11:46:58 ERROR ForwardDictionaryCache: SafeParallelSorterPool:uniqdata_uniqdata_agg Error loading the dictionary: null
> 18/01/02 11:46:58 ERROR ForwardDictionaryCache: SafeParallelSorterPool:uniqdata_uniqdata_agg Error loading the dictionary: null
> 18/01/02 11:46:58 ERROR ForwardDictionaryCache: SafeParallelSorterPool:uniqdata_uniqdata_agg Error loading the dictionary: null
> 18/01/02 11:46:58 ERROR
[jira] [Commented] (CARBONDATA-1957) create datamap query fails on table having dictionary_include
[ https://issues.apache.org/jira/browse/CARBONDATA-1957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16320135#comment-16320135 ] Geetika Gupta commented on CARBONDATA-1957: --- This bug has been resolved by this PR: https://github.com/apache/carbondata/pull/1742

> create datamap query fails on table having dictionary_include
>
> Key: CARBONDATA-1957
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1957
> Project: CarbonData
> Issue Type: Bug
> Components: data-query
> Affects Versions: 1.3.0
> Environment: spark2.1
> Reporter: Geetika Gupta
> Fix For: 1.3.0
>
> Attachments: 2000_UniqData.csv
>
> I created a datamap using the following command:
> create datamap uniqdata_agg on table uniqdata using 'preaggregate' as select cust_id, cust_name,avg(decimal_column1) from uniqdata group by cust_id,cust_name;
> It throws the following error:
> Error: java.lang.Exception: DataLoad failure: (state=,code=0)
> Steps to reproduce:
> CREATE TABLE uniqdata(CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 int) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES('DICTIONARY_INCLUDE'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1')
> Load command:
> LOAD DATA INPATH 'HDFS_URL/BabuStore/Data/uniqdata/2000_UniqData.csv' into table uniqdata OPTIONS('DELIMITER'=',', 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1')
> Create datamap command:
> create datamap uniqdata_agg on table uniqdata using 'preaggregate' as select cust_id, cust_name,avg(decimal_column1) from uniqdata group by cust_id,cust_name;
> The above command throws the following exception:
> Error: java.lang.Exception: DataLoad failure: (state=,code=0)
> Here are the logs:
> 18/01/02 11:46:58 ERROR ParallelReadMergeSorterImpl: SafeParallelSorterPool:uniqdata_uniqdata_agg
> java.lang.IllegalArgumentException: requirement failed: Decimal precision 2922 exceeds max precision 38
> at scala.Predef$.require(Predef.scala:224)
> at org.apache.spark.sql.types.Decimal.set(Decimal.scala:113)
> at org.apache.spark.sql.types.Decimal$.apply(Decimal.scala:426)
> at org.apache.spark.sql.types.Decimal.apply(Decimal.scala)
> at org.apache.spark.sql.catalyst.expressions.UnsafeRow.getDecimal(UnsafeRow.java:409)
> at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificSafeProjection.apply_0$(Unknown Source)
> at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificSafeProjection.apply(Unknown Source)
> at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
> at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
> at org.apache.carbondata.spark.rdd.LazyRddIterator.next(NewCarbonDataLoadRDD.scala:514)
> at org.apache.carbondata.spark.rdd.LazyRddIterator.next(NewCarbonDataLoadRDD.scala:477)
> at org.apache.carbondata.processing.loading.steps.InputProcessorStepImpl$InputProcessorIterator.getBatch(InputProcessorStepImpl.java:239)
> at org.apache.carbondata.processing.loading.steps.InputProcessorStepImpl$InputProcessorIterator.next(InputProcessorStepImpl.java:200)
> at org.apache.carbondata.processing.loading.steps.InputProcessorStepImpl$InputProcessorIterator.next(InputProcessorStepImpl.java:129)
> at org.apache.carbondata.processing.loading.steps.DataConverterProcessorStepImpl$1.next(DataConverterProcessorStepImpl.java:97)
> at org.apache.carbondata.processing.loading.steps.DataConverterProcessorStepImpl$1.next(DataConverterProcessorStepImpl.java:83)
> at org.apache.carbondata.processing.loading.sort.impl.ParallelReadMergeSorterImpl$SortIteratorThread.run(ParallelReadMergeSorterImpl.java:218)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> 18/01/02 11:46:58 ERROR ForwardDictionaryCache: SafeParallelSorterPool:uniqdata_uniqdata_agg Error loading the dictionary: null
> 18/01/02 11:46:58 ERROR ForwardDictionaryCache: SafeParallelSorterPool:uniqdata_uniqdata_agg Error loading the dictionary: null
> 18/01/02 11:46:58 ERROR ForwardDictionaryCache:
[GitHub] carbondata issue #1785: [CARBONDATA-2015] Restricted maximum length of bytes...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1785 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1443/ ---
[GitHub] carbondata issue #1785: [CARBONDATA-2015] Restricted maximum length of bytes...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1785 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2677/ ---
[GitHub] carbondata issue #1584: [CARBONDATA-1827] Added S3 Implementation
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1584 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2813/ ---
[GitHub] carbondata issue #1786: [CARBONDATA-1988] Fixed bug to remove empty partitio...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1786 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1444/ ---
[GitHub] carbondata issue #1786: [CARBONDATA-1988] Fixed bug to remove empty partitio...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1786 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2678/ ---
[GitHub] carbondata issue #1724: [CARBONDATA-1940][PreAgg] Fixed bug for creation of ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1724 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1442/ ---
[GitHub] carbondata issue #1724: [CARBONDATA-1940][PreAgg] Fixed bug for creation of ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1724 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2676/ ---
[GitHub] carbondata issue #1584: [CARBONDATA-1827] Added S3 Implementation
Github user SangeetaGulia commented on the issue: https://github.com/apache/carbondata/pull/1584 @jackylk we have made the changes as per your review comments. Can you please check? ---
[jira] [Updated] (CARBONDATA-2016) Exception displays while implementing compaction with alter query
[ https://issues.apache.org/jira/browse/CARBONDATA-2016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vandana Yadav updated CARBONDATA-2016: Description: Exception displays while implementing compaction with alter query.
Steps to reproduce:
1) Create a table:
CREATE TABLE CUSTOMER1 (C_CUSTKEY INT, C_NAME STRING, C_ADDRESS STRING, C_NATIONKEY INT, C_PHONE STRING, C_ACCTBAL DECIMAL(15,2), C_MKTSEGMENT STRING, C_COMMENT STRING) stored by 'carbondata';
2) Insert data into the table:
a) insert into customer1 values(1,'vandana','noida',1,'123456789',45987.78,'hello','comment')
b) insert into customer1 values(2,'vandana','noida',2,'123456789',487.78,'hello','comment')
c) insert into customer1 values(3,'geetika','delhi',3,'123456789',487897.78,'hello','comment')
d) insert into customer1 values(4,'sangeeta','delhi',3,'123456789',48789.78,'hello','comment')
3) Perform an alter table query:
alter table customer1 add columns (intfield int) TBLPROPERTIES ('DEFAULT.VALUE.intfield'='10');
4) Show segments before compaction:
show segments for table customer1;
output:
+--------------------+----------+---------------------------+---------------------------+------------+--------------+
| SegmentSequenceId  | Status   | Load Start Time           | Load End Time             | Merged To  | File Format  |
+--------------------+----------+---------------------------+---------------------------+------------+--------------+
| 3                  | Success  | 2018-01-10 16:16:53.611   | 2018-01-10 16:16:54.99    | NA         | COLUMNAR_V3  |
| 2                  | Success  | 2018-01-10 16:16:46.878   | 2018-01-10 16:16:47.75    | NA         | COLUMNAR_V3  |
| 1                  | Success  | 2018-01-10 16:16:38.096   | 2018-01-10 16:16:38.972   | NA         | COLUMNAR_V3  |
| 0                  | Success  | 2018-01-10 16:16:31.979   | 2018-01-10 16:16:33.293   | NA         | COLUMNAR_V3  |
+--------------------+----------+---------------------------+---------------------------+------------+--------------+
4 rows selected (0.029 seconds)
5) Alter table query for compaction:
alter table customer1 compact 'minor';
Expected Result: The table should be compacted successfully.
Actual Result:
Error: org.apache.spark.sql.AnalysisException: Compaction failed. Please check logs for more info. Exception in compaction Compaction Failure in Merger Rdd.; (state=,code=0)
thriftserver logs:
18/01/10 16:17:12 ERROR CompactionResultSortProcessor: [Executor task launch worker-36][partitionID:customer1;queryID:15798380253871] Compaction failed: java.lang.Long cannot be cast to java.lang.Integer
java.lang.ClassCastException: java.lang.Long cannot be cast to java.lang.Integer
  at org.apache.carbondata.processing.sort.sortdata.SortDataRows.writeDataToFile(SortDataRows.java:273)
  at org.apache.carbondata.processing.sort.sortdata.SortDataRows.startSorting(SortDataRows.java:214)
  at org.apache.carbondata.processing.merger.CompactionResultSortProcessor.processResult(CompactionResultSortProcessor.java:226)
  at org.apache.carbondata.processing.merger.CompactionResultSortProcessor.execute(CompactionResultSortProcessor.java:159)
  at org.apache.carbondata.spark.rdd.CarbonMergerRDD$$anon$1.<init>(CarbonMergerRDD.scala:234)
  at org.apache.carbondata.spark.rdd.CarbonMergerRDD.internalCompute(CarbonMergerRDD.scala:81)
  at org.apache.carbondata.spark.rdd.CarbonRDD.compute(CarbonRDD.scala:60)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
  at org.apache.spark.scheduler.Task.run(Task.scala:99)
  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  at java.lang.Thread.run(Thread.java:745)
18/01/10 16:17:12 INFO UnsafeMemoryManager: [Executor task launch worker-36][partitionID:customer1;queryID:15798380253871] Total memory used after task 15798371335347 is 5313 Current tasks running now are : [6856382704941, 14621295743743, 14461639534151, 4378916027096, 15798216567589]
18/01/10 16:17:12 INFO UnsafeMemoryManager: [Executor task launch worker-36][partitionID:customer1;queryID:15798380253871] Total memory used after task 15798371335347 is 5313 Current tasks running now are : [6856382704941, 14621295743743, 14461639534151, 4378916027096, 15798216567589]
18/01/10 16:17:12 INFO UnsafeMemoryManager: [Executor task launch worker-36][partitionID:customer1;queryID:15798380253871] Total memory used after task 15798371335347 is 5313 Current tasks running now are : [6856382704941, 14621295743743, 14461639534151, 4378916027096,
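The first line of the trace points at a raw cast in SortDataRows.writeDataToFile: the log suggests the value for the newly added int column reaches the sort step boxed as a java.lang.Long. A minimal, hypothetical reproduction of that failure mode in plain Scala (not CarbonData's actual code):
```scala
// A checked cast from a boxed java.lang.Long to java.lang.Integer fails at
// runtime with exactly the exception shown in the compaction log.
val boxed: AnyRef = java.lang.Long.valueOf(10) // e.g. the added column's default value 10
val intField = boxed.asInstanceOf[java.lang.Integer]
// => java.lang.ClassCastException:
//    java.lang.Long cannot be cast to java.lang.Integer
```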
[GitHub] carbondata issue #1584: [CARBONDATA-1827] Added S3 Implementation
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1584 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1441/ ---
[GitHub] carbondata pull request #1786: [CARBONDATA-1988] Fixed bug to remove empty p...
GitHub user geetikagupta16 opened a pull request: https://github.com/apache/carbondata/pull/1786 [CARBONDATA-1988] Fixed bug to remove empty partition directory for drop partition command
Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily:
- [ ] Any interfaces changed?
- [ ] Any backward compatibility impacted?
- [ ] Document update required?
- [ ] Testing done
      Please provide details on
      - Whether new unit test cases have been added or why no new tests are required?
      - How it is tested? Please attach test report.
      - Is it a performance related change? Please attach the performance test report.
      - Any additional information to help reviewers in testing this change.
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/geetikagupta16/incubator-carbondata CARBONDATA-1988
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/carbondata/pull/1786.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
This closes #1786
commit 41263d54d69a492e77275d4c375d330430cbebc3
Author: Geetika Gupta
Date: 2018-01-10T10:53:55Z
Refactored code to remove partition directory for drop partition command ---
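The PR's stated goal is that dropping a partition also removes the now-empty partition directory. A minimal sketch of that kind of cleanup, written against the plain Hadoop FileSystem API rather than CarbonData's internal file abstraction (the helper name and structure are illustrative, not the PR's actual code):
```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

// Remove a partition directory only once it no longer contains any files.
// The non-recursive delete is a safety net: it fails rather than wiping
// a directory that unexpectedly still has content.
def deletePartitionDirIfEmpty(partitionDir: String, conf: Configuration): Boolean = {
  val path = new Path(partitionDir)
  val fs: FileSystem = path.getFileSystem(conf)
  if (fs.exists(path) && fs.listStatus(path).isEmpty) {
    fs.delete(path, false) // non-recursive: only succeeds for an empty directory
  } else {
    false
  }
}
```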
[GitHub] carbondata issue #1584: [CARBONDATA-1827] Added S3 Implementation
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1584 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2675/ ---
[jira] [Created] (CARBONDATA-2016) Exception displays while implementing compaction with alter query
Vandana Yadav created CARBONDATA-2016:
Summary: Exception displays while implementing compaction with alter query
Key: CARBONDATA-2016
URL: https://issues.apache.org/jira/browse/CARBONDATA-2016
Project: CarbonData
Issue Type: Bug
Components: data-load
Affects Versions: 1.3.0
Environment: spark 2.1
Reporter: Vandana Yadav
Priority: Minor
Exception displays while implementing compaction with alter query.
Steps to reproduce:
1) Create a table:
CREATE TABLE CUSTOMER1 (C_CUSTKEY INT, C_NAME STRING, C_ADDRESS STRING, C_NATIONKEY INT, C_PHONE STRING, C_ACCTBAL DECIMAL(15,2), C_MKTSEGMENT STRING, C_COMMENT STRING) stored by 'carbondata';
2) Insert data into the table:
a) insert into customer1 values(1,'vandana','noida',1,'123456789',45987.78,'hello','comment')
b) insert into customer1 values(2,'vandana','noida',2,'123456789',487.78,'hello','comment')
c) insert into customer1 values(3,'geetika','delhi',3,'123456789',487897.78,'hello','comment')
d) insert into customer1 values(4,'sangeeta','delhi',3,'123456789',48789.78,'hello','comment')
3) Perform an alter table query:
alter table customer1 add columns (intfield int) TBLPROPERTIES ('DEFAULT.VALUE.intfield'='10');
4) Show segments before compaction:
show segments for table customer1;
output:
+--------------------+----------+---------------------------+---------------------------+------------+--------------+
| SegmentSequenceId  | Status   | Load Start Time           | Load End Time             | Merged To  | File Format  |
+--------------------+----------+---------------------------+---------------------------+------------+--------------+
| 3                  | Success  | 2018-01-10 16:16:53.611   | 2018-01-10 16:16:54.99    | NA         | COLUMNAR_V3  |
| 2                  | Success  | 2018-01-10 16:16:46.878   | 2018-01-10 16:16:47.75    | NA         | COLUMNAR_V3  |
| 1                  | Success  | 2018-01-10 16:16:38.096   | 2018-01-10 16:16:38.972   | NA         | COLUMNAR_V3  |
| 0                  | Success  | 2018-01-10 16:16:31.979   | 2018-01-10 16:16:33.293   | NA         | COLUMNAR_V3  |
+--------------------+----------+---------------------------+---------------------------+------------+--------------+
4 rows selected (0.029 seconds)
5) Alter table query for compaction:
alter table customer1 compact 'minor';
Expected Result: The table should be compacted successfully.
Actual Result:
Error: org.apache.spark.sql.AnalysisException: Compaction failed. Please check logs for more info. Exception in compaction Compaction Failure in Merger Rdd.; (state=,code=0)
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] carbondata issue #1784: [CARBONDATA-1965]removed sort_scope from dynamic con...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1784 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2812/ ---
[jira] [Updated] (CARBONDATA-2015) Restricted maximum length of bytes per column
[ https://issues.apache.org/jira/browse/CARBONDATA-2015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhatchayani updated CARBONDATA-2015: Description: Validation of the number of bytes per column is added. We have limited the number of characters per column to 32,000. However, a single Unicode character can take 3 bytes in UTF-8, so a column value of 30,000 such characters occupies 90,000 bytes, which exceeds the signed short range (32,767) and causes the load to fail.
> Restricted maximum length of bytes per column
> ---------------------------------------------
>
> Key: CARBONDATA-2015
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2015
> Project: CarbonData
> Issue Type: Bug
> Reporter: dhatchayani
> Assignee: dhatchayani
> Priority: Minor
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Validation of the number of bytes per column is added.
> We have limited the number of characters per column to 32,000.
> However, a single Unicode character can take 3 bytes in UTF-8, so a column value of 30,000 such characters occupies 90,000 bytes, which exceeds the signed short range (32,767) and causes the load to fail.
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
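A minimal sketch of the kind of validation described, assuming UTF-8 encoding; the helper name is hypothetical, not CarbonData's actual API:
```scala
import java.nio.charset.StandardCharsets

// Hypothetical check: flag a column value whose UTF-8 byte length exceeds
// the signed short range used for the column's length field.
def exceedsColumnByteLimit(value: String): Boolean =
  value.getBytes(StandardCharsets.UTF_8).length > Short.MaxValue // 32,767

exceedsColumnByteLimit("a" * 30000)  // false: 30,000 single-byte characters fit
exceedsColumnByteLimit("不" * 30000) // true: 3 bytes each, 90,000 bytes in total
```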
[GitHub] carbondata pull request #1785: [CARBONDATA-2015] Restricted maximum length o...
GitHub user dhatchayani opened a pull request: https://github.com/apache/carbondata/pull/1785 [CARBONDATA-2015] Restricted maximum length of bytes per column
Validation of the number of bytes per column is added. We have limited the number of characters per column to 32,000. However, a single Unicode character can take 3 bytes in UTF-8, so a column value of 30,000 such characters occupies 90,000 bytes, which exceeds the signed short range (32,767) and causes the load to fail.
- [ ] Any interfaces changed?
- [ ] Any backward compatibility impacted?
- [ ] Document update required?
- [x] Testing done
      UT Added
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/dhatchayani/incubator-carbondata 32000_bytes
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/carbondata/pull/1785.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
This closes #1785
commit e380a1d6b2ffae8611f6045e9f63d2ca6e710652
Author: dhatchayani
Date: 2018-01-10T10:59:14Z
[CARBONDATA-2015] Restricted maximum length of bytes per column ---
[GitHub] carbondata issue #1784: [CARBONDATA-1965]removed sort_scope from dynamic con...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1784 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2674/ ---
[GitHub] carbondata issue #1784: [CARBONDATA-1965]removed sort_scope from dynamic con...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1784 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1440/ ---
[GitHub] carbondata issue #1724: [CARBONDATA-1940][PreAgg] Fixed bug for creation of ...
Github user kumarvishal09 commented on the issue: https://github.com/apache/carbondata/pull/1724 retest this please ---
[jira] [Created] (CARBONDATA-2015) Restricted maximum length of bytes per column
dhatchayani created CARBONDATA-2015:
Summary: Restricted maximum length of bytes per column
Key: CARBONDATA-2015
URL: https://issues.apache.org/jira/browse/CARBONDATA-2015
Project: CarbonData
Issue Type: Bug
Reporter: dhatchayani
Assignee: dhatchayani
Priority: Minor
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (CARBONDATA-2014) update table status for load failure only after first entry
[ https://issues.apache.org/jira/browse/CARBONDATA-2014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal reassigned CARBONDATA-2014: Assignee: Akash R Nilugal
> update table status for load failure only after first entry
> ------------------------------------------------------------
>
> Key: CARBONDATA-2014
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2014
> Project: CarbonData
> Issue Type: Bug
> Reporter: Akash R Nilugal
> Assignee: Akash R Nilugal
> Priority: Minor
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Update the table status for a load failure only after the first entry has been made. Before calling the table status update for the failure, check whether the table is a Hive partition table, in the same way this is checked while writing the in-progress status to the table status file.
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] carbondata pull request #1784: [CARBONDATA-1965]removed sort_scope from dyna...
GitHub user vandana7 opened a pull request: https://github.com/apache/carbondata/pull/1784 [CARBONDATA-1965] Removed sort_scope from dynamic configuration in CarbonData using SET/RESET, as it is not configurable via SET
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/vandana7/incubator-carbondata remove_scope_set
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/carbondata/pull/1784.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
This closes #1784
commit 05175b284ff7c58dec7ed11c2a3c1f914bc22697
Author: vandana
Date: 2018-01-10T10:08:40Z
Removed sort_scope from dynamic configuration in CarbonData using SET/RESET, as it is not configured by SET ---
[GitHub] carbondata issue #1751: [CARBONDATA-1971][Blocklet Prunning] Measure Null va...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1751 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2811/ ---
[jira] [Closed] (CARBONDATA-1735) Carbon1.3.0 Load: Segment created during load is not marked for delete if beeline session is closed while load is still in progress
[ https://issues.apache.org/jira/browse/CARBONDATA-1735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajeet Rai closed CARBONDATA-1735. Resolution: Fixed
This issue has been verified in the latest carbon 1.3 version and it is working fine. Hence closing the defect.
> Carbon1.3.0 Load: Segment created during load is not marked for delete if beeline session is closed while load is still in progress
> ------------------------------------------------------------------------------------------------------------------------------------
>
> Key: CARBONDATA-1735
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1735
> Project: CarbonData
> Issue Type: Bug
> Components: data-load
> Affects Versions: 1.3.0
> Environment: 3 Node ant cluster
> Reporter: Ajeet Rai
> Priority: Minor
> Labels: DFX
>
> Load: Segment created during load is not marked for delete if the beeline session is closed while the load is still in progress.
> Steps:
> 1: Create a table with dictionary include
> 2: Start a load job
> 3: Close the beeline session while global dictionary generation is still in progress
> 4: Observe that the global dictionary generation job completes but the next job is not triggered
> 5: Also observe that the table status file is not updated and the job status is still in progress
> 6: show segments will show this segment with status as in progress
> Expected behaviour: Either the job should complete, or the load should fail and the segment should be marked for delete.
-- This message was sent by Atlassian JIRA (v6.4.14#64029)