[GitHub] carbondata issue #2712: [HOTFIX] Fix streaming CI issue
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2712 Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.3/8483/ ---
[GitHub] carbondata issue #2628: [CARBONDATA-2851][CARBONDATA-2852] Support zstd as c...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2628 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/411/ ---
[GitHub] carbondata issue #2712: [HOTFIX] Fix streaming CI issue
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2712 Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/413/ ---
[GitHub] carbondata issue #2711: [CARBONDATA-2929][DataMap] Add block skipped info fo...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2711 Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.3/8482/ ---
[GitHub] carbondata issue #2708: [CARBONDATA-2886] Select Filter Compatibility Fixed
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2708 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/246/ ---
[GitHub] carbondata issue #2711: [CARBONDATA-2929][DataMap] Add block skipped info fo...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2711 Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/412/ ---
[GitHub] carbondata issue #2702: [CARBONDATA-2924] Fix parsing issue for map as a nes...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2702 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/245/ ---
[GitHub] carbondata issue #2708: [CARBONDATA-2886] Select Filter Compatibility Fixed
Github user brijoobopanna commented on the issue: https://github.com/apache/carbondata/pull/2708 retest this please ---
[GitHub] carbondata issue #2695: [CARBONDATA-2919] Support ingest from Kafka in Strea...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2695 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/410/ ---
[GitHub] carbondata issue #2628: [CARBONDATA-2851][CARBONDATA-2852] Support zstd as c...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2628 Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.3/8481/ ---
[GitHub] carbondata issue #2712: [HOTFIX] Fix streaming CI issue
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2712 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/244/ ---
[GitHub] carbondata issue #2711: [CARBONDATA-2929][DataMap] Add block skipped info fo...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2711 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/243/ ---
[GitHub] carbondata pull request #2712: [HOTFIX] Fix streaming CI issue
GitHub user QiangCai opened a pull request: https://github.com/apache/carbondata/pull/2712

[HOTFIX] Fix streaming CI issue

Fix streaming CI issue.

- [x] Any interfaces changed? no
- [x] Any backward compatibility impacted? no
- [x] Document update required? no
- [x] Testing done
  Please provide details on
  - Whether new unit test cases have been added or why no new tests are required?
  - How it is tested? Please attach test report.
  - Is it a performance related change? Please attach the performance test report.
  - Any additional information to help reviewers in testing this change.
- [x] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. small changes

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/QiangCai/carbondata fix_streaming_issue

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2712.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #2712

commit 31ba78cb30279e0ab7e37f28ab078cf86bcee4c9
Author: QiangCai
Date: 2018-09-12T04:00:27Z

    fix streaming ci issue

---
[GitHub] carbondata pull request #2711: [CARBONDATA-2929][DataMap] Add block skipped ...
GitHub user kevinjmh opened a pull request: https://github.com/apache/carbondata/pull/2711

[CARBONDATA-2929][DataMap] Add block skipped info for explain command

This PR adds block skipped info by counting the distinct file paths of the hit blocklets. It shows like below:

```
== CarbonData Profiler ==
Table Scan on test
 - total: 125 blocks, 250 blocklets
 - filter: (l_partkey <> null and l_partkey = 1006)
 - pruned by Main DataMap
    - skipped: 119 blocks, 238 blocklets
 - pruned by CG DataMap
    - name: dm
    - provider: bloomfilter
    - skipped: 6 blocks, 12 blocklets
```

```
== CarbonData Profiler ==
Table Scan on test
 - total: 125 blocks, 250 blocklets
 - filter: TEXT_MATCH('l_shipmode:AIR')
 - pruned by Main DataMap
    - skipped: 0 blocks, 0 blocklets
 - pruned by FG DataMap
    - name: dm
    - provider: lucene
    - skipped: 12 blocks, 80 blocklets
```

Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily:

- [ ] Any interfaces changed?
- [ ] Any backward compatibility impacted?
- [ ] Document update required?
- [ ] Testing done
  Please provide details on
  - Whether new unit test cases have been added or why no new tests are required?
  - How it is tested? Please attach test report.
  - Is it a performance related change? Please attach the performance test report.
  - Any additional information to help reviewers in testing this change.
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kevinjmh/carbondata explain_block_skip

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2711.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #2711

commit 0828b4d3f366b02a3f9db89e862fc9bc0b89
Author: Manhua
Date: 2018-09-12T03:29:46Z

    add block skip info

---
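The counting described in the PR above relies on one carbondata file being one block, so the number of hit blocks is the count of distinct file paths among the blocklets that survived pruning. A minimal sketch of that idea; the class and method names are illustrative, not the actual CarbonData code:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch: one carbondata file (block) contains many blocklets, so the
// number of hit blocks equals the number of distinct file paths among
// the blocklets that survived pruning; skipped = total - hit.
public class BlockSkipCounter {
    public static int skippedBlocks(int totalBlocks, List<String> hitBlockletFilePaths) {
        // deduplicate blocklet file paths to count blocks, not blocklets
        Set<String> hitBlocks = new HashSet<>(hitBlockletFilePaths);
        return totalBlocks - hitBlocks.size();
    }
}
```

For the first example output above, 125 total blocks with hit blocklets spread over 6 distinct files would report 119 blocks skipped.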
[GitHub] carbondata issue #2628: [CARBONDATA-2851][CARBONDATA-2852] Support zstd as c...
Github user jackylk commented on the issue: https://github.com/apache/carbondata/pull/2628 LGTM. The Spark 2.3 CI currently has a problem; we are fixing it. ---
[GitHub] carbondata issue #2695: [CARBONDATA-2919] Support ingest from Kafka in Strea...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2695 Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.3/8480/ ---
[GitHub] carbondata issue #2628: [CARBONDATA-2851][CARBONDATA-2852] Support zstd as c...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2628 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/242/ ---
[GitHub] carbondata issue #2695: [CARBONDATA-2919] Support ingest from Kafka in Strea...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2695 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/241/ ---
[GitHub] carbondata issue #2709: [HOTFIX] Removed scala dependency from carbon core m...
Github user xuchuanyin commented on the issue: https://github.com/apache/carbondata/pull/2709 @jackylk I got a preaggregate-related failure in Spark 2.3: http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.3/8479/ It seems that all the failed test cases contain sql("reset"). ---
[jira] [Updated] (CARBONDATA-2929) Add block skipped info for explain command
[ https://issues.apache.org/jira/browse/CARBONDATA-2929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

jiangmanhua updated CARBONDATA-2929:
Summary: Add block skipped info for explain command (was: add block skipped info for explain command)

> Add block skipped info for explain command
>
> Key: CARBONDATA-2929
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2929
> Project: CarbonData
> Issue Type: Improvement
> Reporter: jiangmanhua
> Assignee: jiangmanhua
> Priority: Major

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CARBONDATA-2929) add block skipped info for explain command
jiangmanhua created CARBONDATA-2929:

Summary: add block skipped info for explain command
Key: CARBONDATA-2929
URL: https://issues.apache.org/jira/browse/CARBONDATA-2929
Project: CarbonData
Issue Type: Improvement
Reporter: jiangmanhua
Assignee: jiangmanhua
[GitHub] carbondata issue #2628: [CARBONDATA-2851][CARBONDATA-2852] Support zstd as c...
Github user xuchuanyin commented on the issue: https://github.com/apache/carbondata/pull/2628 retest this please ---
[GitHub] carbondata issue #2706: [CARBONDATA-2927] multiple issue fixes for varchar c...
Github user xuchuanyin commented on the issue: https://github.com/apache/carbondata/pull/2706 @ajantha-bhat Hi, I think the main problem may be that you made the 'rowBuffer' static, so it is shared among different data loadings when it should not be. Besides, checking whether to increase the rowBuffer size per row per column may decrease data loading performance. So I'd like to implement this in an easier way: we can add a table property or load option for the row buffer size, keep the previous row-buffer related code as it is, and only change the initial size of the rowBuffer based on that table property or load option. @kumarvishal09 @ravipesala What do you think? ---
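The suggestion above (initialize the row buffer size from a table property or load option instead of growing it per row per column) could look roughly like this sketch. The option key `sort_rowbuffer_size_in_mb` and the class are hypothetical, not actual CarbonData identifiers:

```java
import java.nio.ByteBuffer;
import java.util.Map;

// Sketch: size the per-thread row buffer once, from a load option,
// falling back to the previous fixed 2 MB default.
public class RowBufferConfig {
    private static final int DEFAULT_MB = 2;  // previous fixed default

    // Instance field (not static): each data loading keeps its own
    // ThreadLocal buffer, so concurrent loadings do not share state.
    private final ThreadLocal<ByteBuffer> rowBuffer;

    public RowBufferConfig(Map<String, String> loadOptions) {
        // hypothetical option name; absent means the old 2 MB default
        int mb = Integer.parseInt(
            loadOptions.getOrDefault("sort_rowbuffer_size_in_mb",
                String.valueOf(DEFAULT_MB)));
        final int bytes = mb * 1024 * 1024;
        this.rowBuffer = ThreadLocal.withInitial(() -> ByteBuffer.allocate(bytes));
    }

    public ByteBuffer get() {
        return rowBuffer.get();
    }
}
```

With this shape, a table with very wide varchar columns can simply be loaded with a larger buffer option, and the hot loading path keeps no per-column size check.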
[jira] [Updated] (CARBONDATA-2928) query failed when doing merge index during load
[ https://issues.apache.org/jira/browse/CARBONDATA-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ocean updated CARBONDATA-2928:
Description: In CarbonData version 1.4.1, carbonindex files are merged on every load. But when querying through the Thrift server (about 10 QPS) while a merge index is in progress, an error occurs:

```
18/09/12 11:18:25 ERROR SparkExecuteStatementOperation: Error executing query, currentState RUNNING,
org.apache.spark.sql.catalyst.errors.package$TreeNodeException: execute, tree:
Exchange SinglePartition
+- *HashAggregate(keys=[], functions=[partial_count(1)], output=[count#1692258L])
   +- *Project
      +- *FileScan carbondata default.ae_event_cb_40e_std[] PushedFilters: [IsNotNull(eventid), IsNotNull(productid), IsNotNull(starttime_day), EqualTo(productid,534), Equa...
  at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:56)
  at org.apache.spark.sql.execution.exchange.ShuffleExchange.doExecute(ShuffleExchange.scala:115)
  at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
  at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
  at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:138)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
  at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:135)
  at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:116)
  at org.apache.spark.sql.execution.InputAdapter.inputRDDs(WholeStageCodegenExec.scala:252)
  at org.apache.spark.sql.execution.aggregate.HashAggregateExec.inputRDDs(HashAggregateExec.scala:141)
  at org.apache.spark.sql.execution.WholeStageCodegenExec.doExecute(WholeStageCodegenExec.scala:386)
  at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
  at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
  at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:138)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
  at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:135)
  at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:116)
  at org.apache.spark.sql.execution.SparkPlan.getByteArrayRdd(SparkPlan.scala:228)
  at org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:275)
  at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$collectFromPlan(Dataset.scala:2861)
  at org.apache.spark.sql.Dataset$$anonfun$collect$1.apply(Dataset.scala:2387)
  at org.apache.spark.sql.Dataset$$anonfun$collect$1.apply(Dataset.scala:2387)
  at org.apache.spark.sql.Dataset$$anonfun$55.apply(Dataset.scala:2842)
  at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:65)
  at org.apache.spark.sql.Dataset.withAction(Dataset.scala:2841)
  at org.apache.spark.sql.Dataset.collect(Dataset.scala:2387)
  at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:245)
  at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:174)
  at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:171)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:422)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
  at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:184)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Problem in loading segment blocks.
  at org.apache.carbondata.core.indexstore.BlockletDataMapIndexStore.getAll(BlockletDataMapIndexStore.java:184)
  at org.apache.carbondata.core.indexstore.blockletindex.BlockletDataMapFactory.getDataMaps(BlockletDataMapFactory.java:144)
  at org.apache.carbondata.core.datamap.TableDataMap.prune(TableDataMap.java:93)
  at org.apache.carbondata.core.datamap.dev.expr.DataMapExprWrapperImpl.prune(DataMapExprWrapperImpl.java:53)
  at org.apache.carbondata.hadoop.api.CarbonInputFormat.getPrunedBlocklets(CarbonInputFormat.java:442)
  at org.apache.carbondata.hadoop.api.CarbonInputFormat.getDataBlocksOfSegment(CarbonInputFormat.java:378)
  at
```
[jira] [Created] (CARBONDATA-2928) query failed when doing merge index during load
ocean created CARBONDATA-2928:

Summary: query failed when doing merge index during load
Key: CARBONDATA-2928
URL: https://issues.apache.org/jira/browse/CARBONDATA-2928
Project: CarbonData
Issue Type: Bug
Components: data-load
Affects Versions: 1.4.1
Reporter: ocean
Fix For: NONE

In CarbonData version 1.4.1, carbonindex files are merged on every load. But when querying through the Thrift server while a merge index is in progress, an error occurs:

```
18/09/12 11:18:25 ERROR SparkExecuteStatementOperation: Error executing query, currentState RUNNING,
org.apache.spark.sql.catalyst.errors.package$TreeNodeException: execute, tree:
Exchange SinglePartition
+- *HashAggregate(keys=[], functions=[partial_count(1)], output=[count#1692258L])
   +- *Project
      +- *FileScan carbondata default.ae_event_cb_40e_std[] PushedFilters: [IsNotNull(eventid), IsNotNull(productid), IsNotNull(starttime_day), EqualTo(productid,534), Equa...
  at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:56)
  at org.apache.spark.sql.execution.exchange.ShuffleExchange.doExecute(ShuffleExchange.scala:115)
  at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
  at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
  at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:138)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
  at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:135)
  at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:116)
  at org.apache.spark.sql.execution.InputAdapter.inputRDDs(WholeStageCodegenExec.scala:252)
  at org.apache.spark.sql.execution.aggregate.HashAggregateExec.inputRDDs(HashAggregateExec.scala:141)
  at org.apache.spark.sql.execution.WholeStageCodegenExec.doExecute(WholeStageCodegenExec.scala:386)
  at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
  at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
  at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:138)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
  at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:135)
  at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:116)
  at org.apache.spark.sql.execution.SparkPlan.getByteArrayRdd(SparkPlan.scala:228)
  at org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:275)
  at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$collectFromPlan(Dataset.scala:2861)
  at org.apache.spark.sql.Dataset$$anonfun$collect$1.apply(Dataset.scala:2387)
  at org.apache.spark.sql.Dataset$$anonfun$collect$1.apply(Dataset.scala:2387)
  at org.apache.spark.sql.Dataset$$anonfun$55.apply(Dataset.scala:2842)
  at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:65)
  at org.apache.spark.sql.Dataset.withAction(Dataset.scala:2841)
  at org.apache.spark.sql.Dataset.collect(Dataset.scala:2387)
  at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:245)
  at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:174)
  at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:171)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:422)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
  at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:184)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Problem in loading segment blocks.
  at org.apache.carbondata.core.indexstore.BlockletDataMapIndexStore.getAll(BlockletDataMapIndexStore.java:184)
  at org.apache.carbondata.core.indexstore.blockletindex.BlockletDataMapFactory.getDataMaps(BlockletDataMapFactory.java:144)
  at org.apache.carbondata.core.datamap.TableDataMap.prune(TableDataMap.java:93)
  at org.apache.carbondata.core.datamap.dev.expr.DataMapExprWrapperImpl.prune(DataMapExprWrapperImpl.java:53)
  at
```
[GitHub] carbondata pull request #2706: [CARBONDATA-2927] multiple issue fixes for va...
Github user xuchuanyin commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2706#discussion_r216885804

--- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/UnsafeSortDataRows.java ---
```
@@ -240,11 +249,11 @@ public void addRow(Object[] row) throws CarbonSortKeyAndGroupByException {
           throw new CarbonSortKeyAndGroupByException(ex);
         }
         rowPage.addRow(row, rowBuffer.get());
-      } catch (Exception e) {
-        LOGGER.error(
-            "exception occurred while trying to acquire a semaphore lock: " + e.getMessage());
-        throw new CarbonSortKeyAndGroupByException(e);
       }
+    } catch (Exception e) {
+      LOGGER
```
--- End diff --

Bad indent. We can move the message to the next line and keep the method call on this line.

---
[GitHub] carbondata pull request #2706: [CARBONDATA-2927] multiple issue fixes for va...
Github user xuchuanyin commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2706#discussion_r216884982

--- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/sort/SortStepRowHandler.java ---
```
@@ -570,23 +589,31 @@ public int writeRawRowAsIntermediateSortTempRowToUnsafeMemory(Object[] row,
   private void packNoSortFieldsToBytes(Object[] row, ByteBuffer rowBuffer) {
     // convert dict & no-sort
     for (int idx = 0; idx < this.dictNoSortDimCnt; idx++) {
+      // cannot exceed default 2MB, hence no need to call ensureArraySize
       rowBuffer.putInt((int) row[this.dictNoSortDimIdx[idx]]);
     }
     // convert no-dict & no-sort
     for (int idx = 0; idx < this.noDictNoSortDimCnt; idx++) {
       byte[] bytes = (byte[]) row[this.noDictNoSortDimIdx[idx]];
+      // cannot exceed default 2MB, hence no need to call ensureArraySize
```
--- End diff --

For one column it may not exceed 2MB, but what if we have lots of no-sort-no-dict columns?

---
[GitHub] carbondata pull request #2706: [CARBONDATA-2927] multiple issue fixes for va...
Github user xuchuanyin commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2706#discussion_r216884722

--- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/sort/SortStepRowHandler.java ---
```
@@ -559,7 +572,13 @@ public int writeRawRowAsIntermediateSortTempRowToUnsafeMemory(Object[] row,
     return size;
   }

-
+  private void validateUnsafeMemoryBlockSizeLimit(long unsafeRemainingLength, int size)
```
--- End diff --

Please rename the parameter 'size' for better readability; it seems to represent the requested size.

---
[GitHub] carbondata pull request #2706: [CARBONDATA-2927] multiple issue fixes for va...
Github user xuchuanyin commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2706#discussion_r216885444

--- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/UnsafeCarbonRowPage.java ---
```
@@ -59,12 +60,11 @@ public UnsafeCarbonRowPage(TableFieldStat tableFieldStat, MemoryBlock memoryBloc
     this.taskId = taskId;
     buffer = new IntPointerBuffer(this.taskId);
     this.dataBlock = memoryBlock;
-    // TODO Only using 98% of space for safe side.May be we can have different logic.
-    sizeToBeUsed = dataBlock.size() - (dataBlock.size() * 5) / 100;
+    sizeToBeUsed = dataBlock.size();
```
--- End diff --

Is the old comment outdated? Have you ensured the 'safe side' it mentioned?

---
[GitHub] carbondata pull request #2706: [CARBONDATA-2927] multiple issue fixes for va...
Github user xuchuanyin commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2706#discussion_r216885323

--- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/sort/SortStepRowHandler.java ---
```
@@ -598,26 +625,53 @@ private void packNoSortFieldsToBytes(Object[] row, ByteBuffer rowBuffer) {
       tmpValue = row[this.measureIdx[idx]];
       tmpDataType = this.dataTypes[idx];
       if (null == tmpValue) {
+        // can exceed default 2MB, hence need to call ensureArraySize
+        rowBuffer = UnsafeSortDataRows
+            .ensureArraySize(1);
         rowBuffer.put((byte) 0);
         continue;
       }
+      // can exceed default 2MB, hence need to call ensureArraySize
+      rowBuffer = UnsafeSortDataRows
+          .ensureArraySize(1);
```
--- End diff --

Bad indent; the call can be moved to the previous line. The same applies at line #642 and line #647.

---
[GitHub] carbondata pull request #2706: [CARBONDATA-2927] multiple issue fixes for va...
Github user xuchuanyin commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2706#discussion_r216884374

--- Diff: core/src/main/java/org/apache/carbondata/core/memory/UnsafeMemoryManager.java ---
```
@@ -200,7 +200,7 @@ public static MemoryBlock allocateMemoryWithRetry(long taskId, long size)
     }
     if (baseBlock == null) {
       INSTANCE.printCurrentMemoryUsage();
-      throw new MemoryException("Not enough memory");
+      throw new MemoryException("Not enough memory, increase carbon.unsafe.working.memory.in.mb");
```
--- End diff --

I think you can optimize the error message to `Not enough unsafe working memory (total: , available: , request: )`.

---
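The message suggested in this review comment could be assembled along these lines; the class and parameter names are illustrative, not the actual UnsafeMemoryManager code:

```java
// Sketch: include the totals in the out-of-memory message so users can
// decide how to tune carbon.unsafe.working.memory.in.mb from the log alone.
public class MemoryMessage {
    public static String notEnoughMemory(long totalBytes, long availableBytes,
            long requestedBytes) {
        return String.format(
            "Not enough unsafe working memory (total: %d bytes, available: %d bytes, "
                + "request: %d bytes). Consider increasing carbon.unsafe.working.memory.in.mb",
            totalBytes, availableBytes, requestedBytes);
    }
}
```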
[GitHub] carbondata pull request #2706: [CARBONDATA-2927] multiple issue fixes for va...
Github user xuchuanyin commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2706#discussion_r216885250

--- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/sort/SortStepRowHandler.java ---
```
@@ -598,26 +625,53 @@ private void packNoSortFieldsToBytes(Object[] row, ByteBuffer rowBuffer) {
       tmpValue = row[this.measureIdx[idx]];
       tmpDataType = this.dataTypes[idx];
       if (null == tmpValue) {
+        // can exceed default 2MB, hence need to call ensureArraySize
+        rowBuffer = UnsafeSortDataRows
+            .ensureArraySize(1);
```
--- End diff --

Bad indent; the call can be moved to the previous line.

---
[GitHub] carbondata pull request #2706: [CARBONDATA-2927] multiple issue fixes for va...
Github user xuchuanyin commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2706#discussion_r216886119

--- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/UnsafeSortDataRows.java ---
```
@@ -326,6 +335,19 @@ private void startFileBasedMerge() throws InterruptedException {
     dataSorterAndWriterExecutorService.awaitTermination(2, TimeUnit.DAYS);
   }

+  public static ByteBuffer ensureArraySize(int requestSize) {
```
--- End diff --

If we increase the rowBuffer at runtime, is there a way to decrease it? Or if there is no need to do so, how long will this rowBuffer last?

---
[GitHub] carbondata pull request #2706: [CARBONDATA-2927] multiple issue fixes for va...
Github user xuchuanyin commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2706#discussion_r216885202

--- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/sort/SortStepRowHandler.java ---
```
@@ -570,23 +589,31 @@ public int writeRawRowAsIntermediateSortTempRowToUnsafeMemory(Object[] row,
   private void packNoSortFieldsToBytes(Object[] row, ByteBuffer rowBuffer) {
     // convert dict & no-sort
     for (int idx = 0; idx < this.dictNoSortDimCnt; idx++) {
+      // cannot exceed default 2MB, hence no need to call ensureArraySize
       rowBuffer.putInt((int) row[this.dictNoSortDimIdx[idx]]);
     }
     // convert no-dict & no-sort
     for (int idx = 0; idx < this.noDictNoSortDimCnt; idx++) {
       byte[] bytes = (byte[]) row[this.noDictNoSortDimIdx[idx]];
+      // cannot exceed default 2MB, hence no need to call ensureArraySize
       rowBuffer.putShort((short) bytes.length);
       rowBuffer.put(bytes);
     }
     // convert varchar dims
     for (int idx = 0; idx < this.varcharDimCnt; idx++) {
       byte[] bytes = (byte[]) row[this.varcharDimIdx[idx]];
+      // can exceed default 2MB, hence need to call ensureArraySize
+      rowBuffer = UnsafeSortDataRows
```
--- End diff --

Should we call this method per row per column? In most scenarios 2MB per row is enough, so will calling the method here cause a performance decrease?

---
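One common way to keep a per-column size check cheap is to make the fast path a single comparison and grow the buffer geometrically, so reallocation is amortized. A standalone sketch of that pattern, not the actual UnsafeSortDataRows implementation:

```java
import java.nio.ByteBuffer;

// Sketch: amortized buffer growth. The fast path is one branch, so
// calling it per column costs little; capacity doubles on growth, so
// reallocation happens only O(log n) times for n bytes written.
public class GrowableRowBuffer {
    private ByteBuffer buffer = ByteBuffer.allocate(2 * 1024 * 1024); // 2 MB default

    public ByteBuffer ensureRemaining(int requestSize) {
        if (buffer.remaining() >= requestSize) {
            return buffer;                      // fast path: no allocation
        }
        int needed = buffer.position() + requestSize;
        int newCapacity = buffer.capacity();
        while (newCapacity < needed) {
            newCapacity *= 2;                   // geometric growth
        }
        ByteBuffer bigger = ByteBuffer.allocate(newCapacity);
        buffer.flip();                          // prepare written bytes for copying
        bigger.put(buffer);                     // copy existing content
        buffer = bigger;
        return buffer;
    }
}
```

With this shape the common case (rows under 2 MB) pays only one branch per call, which addresses the performance concern raised above while still handling oversized varchar rows.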
[GitHub] carbondata pull request #2706: [CARBONDATA-2927] multiple issue fixes for va...
Github user xuchuanyin commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2706#discussion_r216885637

--- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/UnsafeSortDataRows.java ---
```
@@ -72,7 +72,7 @@
   private SortParameters parameters;
   private TableFieldStat tableFieldStat;
-  private ThreadLocal rowBuffer;
+  private static ThreadLocal rowBuffer;
```
--- End diff --

I think the 'static' here may cause problems for concurrent loading. Each loading should have its own rowBuffer.

---
[GitHub] carbondata pull request #2706: [CARBONDATA-2927] multiple issue fixes for va...
Github user xuchuanyin commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2706#discussion_r216885885

--- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/UnsafeSortDataRows.java ---
```
@@ -326,6 +335,19 @@ private void startFileBasedMerge() throws InterruptedException {
     dataSorterAndWriterExecutorService.awaitTermination(2, TimeUnit.DAYS);
   }

+  public static ByteBuffer ensureArraySize(int requestSize) {
```
--- End diff --

Please add a comment saying that this method is used to increase the rowBuffer during loading.

---
[GitHub] carbondata issue #2628: [CARBONDATA-2851][CARBONDATA-2852] Support zstd as c...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2628 Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.3/8479/ ---
[GitHub] carbondata issue #2628: [CARBONDATA-2851][CARBONDATA-2852] Support zstd as c...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2628 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/409/ ---
[GitHub] carbondata issue #2628: [CARBONDATA-2851][CARBONDATA-2852] Support zstd as c...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2628 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/240/ ---
[GitHub] carbondata issue #2628: [CARBONDATA-2851][CARBONDATA-2852] Support zstd as c...
Github user xuchuanyin commented on the issue: https://github.com/apache/carbondata/pull/2628 @ravipesala @jackylk I added an optional 'compressor_name' alongside the 'compression_codec'. During processing, I use the compressor_name and set compression_codec to a deprecated value. I also added an interface to register a customized compressor, along with a test for it. For now, all the review comments are resolved. ---
[GitHub] carbondata issue #2683: [CARBONDATA-2916] Add CarbonCli tool for data summar...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2683 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/408/ ---
[GitHub] carbondata issue #2683: [CARBONDATA-2916] Add CarbonCli tool for data summar...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2683 Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.3/8478/ ---
[GitHub] carbondata issue #2695: [CARBONDATA-2919] Support ingest from Kafka in Strea...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2695 Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/407/ ---
[GitHub] carbondata issue #2683: [CARBONDATA-2916] Add CarbonCli tool for data summar...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2683 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/239/ ---
[GitHub] carbondata issue #2695: [CARBONDATA-2919] Support ingest from Kafka in Strea...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2695 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/238/ ---
[GitHub] carbondata issue #2695: [CARBONDATA-2919] Support ingest from Kafka in Strea...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2695 Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.3/8477/ ---
[GitHub] carbondata issue #2709: [HOTFIX] Removed scala dependency from carbon core m...
Github user jackylk commented on the issue: https://github.com/apache/carbondata/pull/2709 spark 2.3 has random failure? @ravipesala @QiangCai ---
[GitHub] carbondata pull request #2695: [CARBONDATA-2919] Support ingest from Kafka i...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2695#discussion_r216736902 --- Diff: examples/spark2/src/main/scala/org/apache/carbondata/examples/StreamSQLExample.scala --- @@ -0,0 +1,124 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.carbondata.examples + +import java.io.File + +import org.apache.carbondata.examples.util.ExampleUtils + +// scalastyle:off println +object StreamSQLExample { --- End diff -- Now I changed this Example to use socket stream source. ---
[GitHub] carbondata issue #2706: [CARBONDATA-2927] multiple issue fixes for varchar c...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2706 Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.3/8476/ ---
[GitHub] carbondata issue #2706: [CARBONDATA-2927] multiple issue fixes for varchar c...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2706 Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/406/ ---
[GitHub] carbondata issue #2706: [CARBONDATA-2927] multiple issue fixes for varchar c...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2706 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/237/ ---
[GitHub] carbondata issue #2706: [CARBONDATA-2927] multiple issue fixes for varchar c...
Github user ajantha-bhat commented on the issue: https://github.com/apache/carbondata/pull/2706 @kumarvishal09 @ravipesala: please do an in-depth review of this PR; the impact is significant. ---
[jira] [Created] (CARBONDATA-2927) Multiple issue fixes for varchar column and complex columns that grows more than 2MB
Ajantha Bhat created CARBONDATA-2927: Summary: Multiple issue fixes for varchar column and complex columns that grows more than 2MB Key: CARBONDATA-2927 URL: https://issues.apache.org/jira/browse/CARBONDATA-2927 Project: CarbonData Issue Type: Bug Reporter: Ajantha Bhat Assignee: Ajantha Bhat
Fixed:
1. Buffer overflow exception when varchar data length is more than 2MB (thread-local row buffer).
root cause: the thread-local buffer was hardcoded to 2MB.
solution: grow it dynamically based on the row size.
2. Reading data from a carbon file that has one varchar row of 150 MB length is very slow.
root cause: at UnsafeDMStore, ensure-memory grows by only 8KB at a time, so many malloc/free cycles happen before reaching 150MB, hence very slow performance.
solution: directly check and allocate the required size.
3. JVM crash when data size is more than 128 MB in the unsafe sort step.
root cause: UnsafeCarbonRowPage is 128MB, so if one row's data is more than 128MB we access the block beyond the allocated size, leading to a JVM crash.
solution: validate the size before access and prompt the user to increase unsafe memory (by carbon property).
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
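The first two fixes share one idea: replace fixed-step buffer growth with a single allocation sized to the actual row. A minimal Java sketch of that idea (class and sizes are illustrative, not CarbonData's actual code):

```java
import java.util.Arrays;

// Illustrative sketch: instead of growing a reusable row buffer by a fixed
// step (e.g. 8KB) until it fits -- which causes many allocate/copy cycles for
// a 150MB varchar row -- grow directly to the required size in one step.
public class GrowableRowBuffer {
    private byte[] buffer = new byte[2 * 1024 * 1024]; // old hardcoded 2MB start

    public void ensureCapacity(int requiredSize) {
        if (requiredSize > buffer.length) {
            // one reallocation sized to the actual row, not step-wise growth
            buffer = Arrays.copyOf(buffer, requiredSize);
        }
    }

    public int capacity() {
        return buffer.length;
    }
}
```

The third fix is the complementary guard: before writing a row, validate that its size fits the page and fail with a clear message instead of touching memory beyond the allocation.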
[GitHub] carbondata issue #2683: [CARBONDATA-2916] Add CarbonCli tool for data summar...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2683 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/236/ ---
[GitHub] carbondata issue #2683: [CARBONDATA-2916] Add CarbonCli tool for data summar...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2683 Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/405/ ---
[GitHub] carbondata issue #2683: [CARBONDATA-2916] Add CarbonCli tool for data summar...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2683 Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.3/8475/ ---
[GitHub] carbondata issue #2683: [CARBONDATA-2916] Add CarbonCli tool for data summar...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2683 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/235/ ---
[GitHub] carbondata pull request #2710: [2875]two different threads overwriting the s...
Github user shardul-cr7 closed the pull request at: https://github.com/apache/carbondata/pull/2710 ---
[GitHub] carbondata issue #2703: [CARBONDATA-2925]Wrong data displayed for spark file...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2703 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/403/ ---
[GitHub] carbondata issue #2703: [CARBONDATA-2925]Wrong data displayed for spark file...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2703 Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.3/8473/ ---
[GitHub] carbondata issue #2703: [CARBONDATA-2925]Wrong data displayed for spark file...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2703 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/234/ ---
[GitHub] carbondata issue #2628: [CARBONDATA-2851][CARBONDATA-2852] Support zstd as c...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2628 Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.3/8472/ ---
[GitHub] carbondata issue #2628: [CARBONDATA-2851][CARBONDATA-2852] Support zstd as c...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2628 Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/402/ ---
[jira] [Assigned] (CARBONDATA-2877) CarbonDataWriterException when loading data to carbon table with large number of rows/columns from Spark-Submit
[ https://issues.apache.org/jira/browse/CARBONDATA-2877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brijoo Bopanna reassigned CARBONDATA-2877: Assignee: Brijoo Bopanna (was: kumar vishal)
> CarbonDataWriterException when loading data to carbon table with large number of rows/columns from Spark-Submit
> Key: CARBONDATA-2877
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2877
> Project: CarbonData
> Issue Type: Bug
> Components: data-load
> Affects Versions: 1.4.1
> Environment: Spark 2.1
> Reporter: Chetan Bhat
> Assignee: Brijoo Bopanna
> Priority: Major
>
> Steps: from Spark-Submit, the user creates a table with a large number of columns (around 100) and tries to load around 3 lakh records into the table.
> Spark-submit command: spark-submit --master yarn --num-executors 3 --executor-memory 75g --driver-memory 10g --executor-cores 12 --class
> Actual issue: data loading fails with CarbonDataWriterException.
> Executor YARN UI log:
> org.apache.spark.util.TaskCompletionListenerException: org.apache.carbondata.core.datastore.exception.CarbonDataWriterException
> Previous exception in task: Error while initializing data handler:
> org.apache.carbondata.processing.loading.steps.DataWriterProcessorStepImpl.execute(DataWriterProcessorStepImpl.java:141)
> org.apache.carbondata.processing.loading.DataLoadExecutor.execute(DataLoadExecutor.java:51)
> org.apache.carbondata.spark.rdd.NewCarbonDataLoadRDD$$anon$1.<init>(NewCarbonDataLoadRDD.scala:221)
> org.apache.carbondata.spark.rdd.NewCarbonDataLoadRDD.internalCompute(NewCarbonDataLoadRDD.scala:197)
> org.apache.carbondata.spark.rdd.CarbonRDD.compute(CarbonRDD.scala:78)
> org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
> org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
> org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
> org.apache.spark.scheduler.Task.run(Task.scala:99)
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:325)
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> java.lang.Thread.run(Thread.java:748)
> at org.apache.spark.TaskContextImpl.invokeListeners(TaskContextImpl.scala:138)
> at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:116)
> at org.apache.spark.scheduler.Task.run(Task.scala:109)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:325)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
>
> Expected: the data loading should be successful from Spark-submit, similar to that in Beeline.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] carbondata issue #2710: [2875]two different threads overwriting the same car...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2710 Can one of the admins verify this patch? ---
[GitHub] carbondata pull request #2710: [2875]two different threads overwriting the s...
GitHub user shardul-cr7 opened a pull request: https://github.com/apache/carbondata/pull/2710 [2875] two different threads overwriting the same carbondata file
Problem: two different threads are overwriting the same carbondata file during creation of an external table.
Solution: the chance of two threads concurrently writing the same carbondata file is reduced by changing the timestamp attached to the .carbondata file name from milliseconds to nanoseconds, so the chance of different threads producing the same file name is reduced.
Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily: - [ ] Any interfaces changed? - [ ] Any backward compatibility impacted? - [ ] Document update required? - [x] Testing done Done manually - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. You can merge this pull request into a Git repository by running: $ git pull https://github.com/shardul-cr7/carbondata master Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/2710.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2710 commit 61520a3d0bacfbcbed5a2b5ac300f08cf9b36bb4 Author: shardul-cr7 Date: 2018-09-11T12:51:09Z chances of two different threads overwriting the same carbondatafile is reduced ---
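The change described above can be sketched in a few lines. Note the file-name pattern and class below are illustrative stand-ins, not CarbonData's actual naming code, and the fix only reduces collision probability rather than eliminating it:

```java
// Sketch: deriving the data-file timestamp from System.nanoTime() instead of
// System.currentTimeMillis(), so two threads writing within the same
// millisecond are far less likely to produce the same file name.
public class DataFileName {
    static String withMillis(long taskId) {
        return "part-" + taskId + "-" + System.currentTimeMillis() + ".carbondata";
    }

    static String withNanos(long taskId) {
        return "part-" + taskId + "-" + System.nanoTime() + ".carbondata";
    }
}
```

A nanosecond clock has a million times finer resolution, but since collisions are still possible, a fully safe design would also include the task or thread id in the name, as the `taskId` component above suggests.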
[GitHub] carbondata issue #2628: [CARBONDATA-2851][CARBONDATA-2852] Support zstd as c...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2628 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/233/ ---
[GitHub] carbondata issue #2628: [CARBONDATA-2851][CARBONDATA-2852] Support zstd as c...
Github user xuchuanyin commented on the issue: https://github.com/apache/carbondata/pull/2628 @ravipesala yeah, that's what I'm doing now. please check the commit: https://github.com/apache/carbondata/pull/2628/commits/d21fd869d442f535e4704dc06d9edc2f01984cb0 ---
[GitHub] carbondata issue #2705: [CARBONDATA-2926] fixed ArrayIndexOutOfBoundExceptio...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2705 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/401/ ---
[GitHub] carbondata issue #2705: [CARBONDATA-2926] fixed ArrayIndexOutOfBoundExceptio...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2705 Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.3/8471/ ---
[GitHub] carbondata issue #2628: [CARBONDATA-2851][CARBONDATA-2852] Support zstd as c...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2628 Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.3/8470/ ---
[GitHub] carbondata issue #2628: [CARBONDATA-2851][CARBONDATA-2852] Support zstd as c...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2628 @xuchuanyin yes, we cannot get rid of enum. But add another optional field in `ChunkCompressionMeta` to take interface name. Just ignore the enum and read only interface name. @jackylk Please give your opinion on this. ---
[GitHub] carbondata issue #2628: [CARBONDATA-2851][CARBONDATA-2852] Support zstd as c...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2628 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/400/ ---
[GitHub] carbondata issue #2703: [CARBONDATA-2925]Wrong data displayed for spark file...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2703 Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.3/8469/ ---
[GitHub] carbondata issue #2628: [CARBONDATA-2851][CARBONDATA-2852] Support zstd as c...
Github user xuchuanyin commented on the issue: https://github.com/apache/carbondata/pull/2628 @ravipesala fine, I'll rework this. The bad news is that the enum 'CompressionCodec' in thrift is 'required', so even if we do not use it, we cannot get rid of it. The good news is that for a legacy store it is always snappy, which makes things easier if we bypass this enum. ---
[jira] [Resolved] (CARBONDATA-2909) Support Multiple User reading and writing through SDK.
[ https://issues.apache.org/jira/browse/CARBONDATA-2909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravindra Pesala resolved CARBONDATA-2909. - Resolution: Fixed Assignee: Kunal Kapoor Fix Version/s: 1.5.0 > Support Multiple User reading and writing through SDK. > -- > > Key: CARBONDATA-2909 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2909 > Project: CarbonData > Issue Type: Improvement >Reporter: Kunal Kapoor >Assignee: Kunal Kapoor >Priority: Major > Fix For: 1.5.0 > > Time Spent: 16h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] carbondata pull request #2678: [CARBONDATA-2909] Multi user support for SDK ...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/2678 ---
[GitHub] carbondata issue #2678: [CARBONDATA-2909] Multi user support for SDK on S3
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2678 LGTM ---
[GitHub] carbondata issue #2705: [CARBONDATA-2926] fixed ArrayIndexOutOfBoundExceptio...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2705 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/232/ ---
[GitHub] carbondata issue #2703: [CARBONDATA-2925]Wrong data displayed for spark file...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2703 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/399/ ---
[GitHub] carbondata issue #2709: [HOTFIX] Removed scala dependency from carbon core m...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2709 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/398/ ---
[GitHub] carbondata issue #2709: [HOTFIX] Removed scala dependency from carbon core m...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2709 Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.3/8468/ ---
[GitHub] carbondata issue #2670: [CARBONDATA-2917] Support binary datatype
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2670 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/397/ ---
[GitHub] carbondata pull request #2705: [CARBONDATA-2926] fixed ArrayIndexOutOfBoundE...
Github user ajantha-bhat commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2705#discussion_r216634097 --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/TableSpec.java --- @@ -36,6 +37,14 @@ private DimensionSpec[] dimensionSpec; private MeasureSpec[] measureSpec; + // Many places we might have to access no-dictionary column spec. + // but no-dictionary column spec are not always in below order like, + // dictionary + no dictionary + complex + measure + // when sort_columns are empty, no columns are selected for sorting. + // so, spec will not be in above order. + // Hence NoDictionaryDimensionSpec will be useful and it will be subset of dimensionSpec. + private List<DimensionSpec> NoDictionaryDimensionSpec; --- End diff -- done. ---
[GitHub] carbondata issue #2628: [CARBONDATA-2851][CARBONDATA-2852] Support zstd as c...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2628 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/231/ ---
[GitHub] carbondata issue #2703: [CARBONDATA-2925]Wrong data displayed for spark file...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2703 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/230/ ---
[GitHub] carbondata issue #2670: [CARBONDATA-2917] Support binary datatype
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2670 Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.3/8467/ ---
[GitHub] carbondata issue #2628: [CARBONDATA-2851][CARBONDATA-2852] Support zstd as c...
Github user xuchuanyin commented on the issue: https://github.com/apache/carbondata/pull/2628 retest this please ---
[GitHub] carbondata issue #2678: [CARBONDATA-2909] Multi user support for SDK on S3
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2678 Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.3/8466/ ---
[GitHub] carbondata issue #2678: [CARBONDATA-2909] Multi user support for SDK on S3
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2678 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/396/ ---
[GitHub] carbondata issue #2628: [CARBONDATA-2851][CARBONDATA-2852] Support zstd as c...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2628 Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.3/8465/ ---
[GitHub] carbondata issue #2628: [CARBONDATA-2851][CARBONDATA-2852] Support zstd as c...
Github user xuchuanyin commented on the issue: https://github.com/apache/carbondata/pull/2628 @ravipesala As for the implementation, is **duplicating the info and adding another descriptor for it alongside the current enum** OK? Or do you have another suggestion for how to implement this? ---
[GitHub] carbondata issue #2628: [CARBONDATA-2851][CARBONDATA-2852] Support zstd as c...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2628 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/395/ ---
[GitHub] carbondata issue #2709: [HOTFIX] Removed scala dependency from carbon core m...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2709 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/229/ ---
[GitHub] carbondata issue #2628: [CARBONDATA-2851][CARBONDATA-2852] Support zstd as c...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2628 @xuchuanyin I feel it is necessary to save the compressor name in thrift instead of the enum. It would not be a good idea to change thrift for every new compression support, and the enum also prevents users from supplying their own custom compressor interface while creating the table. ---
[GitHub] carbondata issue #2704: [HOTFIX] Old stores cannot read with new table infer...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2704 Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.3/8464/ ---
[GitHub] carbondata pull request #2705: [CARBONDATA-2926] fixed ArrayIndexOutOfBoundE...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2705#discussion_r216620144 --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/TableSpec.java --- @@ -36,6 +37,14 @@ private DimensionSpec[] dimensionSpec; private MeasureSpec[] measureSpec; + // Many places we might have to access no-dictionary column spec. + // but no-dictionary column spec are not always in below order like, + // dictionary + no dictionary + complex + measure + // when sort_columns are empty, no columns are selected for sorting. + // so, spec will not be in above order. + // Hence NoDictionaryDimensionSpec will be useful and it will be subset of dimensionSpec. + private List<DimensionSpec> NoDictionaryDimensionSpec; --- End diff -- Better change name to `noDictionaryDimensionSpec` ---
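The code comment under review explains why a precomputed subset helps: callers cannot rely on the dictionary + no-dictionary + complex + measure ordering once sort_columns is empty. A simplified Java sketch of the idea, using stand-in classes rather than the real TableSpec types:

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for TableSpec: compute the no-dictionary subset once by
// filtering, so callers never depend on column ordering, which is not
// guaranteed when sort_columns is empty.
public class TableSpecSketch {
    static class DimensionSpec {
        final String name;
        final boolean noDictionary;
        DimensionSpec(String name, boolean noDictionary) {
            this.name = name;
            this.noDictionary = noDictionary;
        }
    }

    static List<DimensionSpec> noDictionarySubset(DimensionSpec[] dimensionSpec) {
        List<DimensionSpec> result = new ArrayList<>();
        for (DimensionSpec spec : dimensionSpec) {
            if (spec.noDictionary) {
                result.add(spec); // subset preserves the original relative order
            }
        }
        return result;
    }
}
```

Filtering once at construction time (as the PR comment suggests) trades a small amount of memory for removing an ordering assumption from every call site.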
[GitHub] carbondata issue #2607: [CARBONDATA-2818] Presto Upgrade to 0.206
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2607 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/393/ ---
[GitHub] carbondata issue #2704: [HOTFIX] Old stores cannot read with new table infer...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2704 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/394/ ---