[GitHub] carbondata issue #2991: [CARBONDATA-3043] Add build script and add test case...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2991 Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2466/ ---
[GitHub] carbondata issue #2991: [CARBONDATA-3043] Add build script and add test case...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2991 Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10504/ ---
[jira] [Created] (CARBONDATA-3239) Throwing ArrayIndexOutOfBoundsException in DataSkewRangePartitioner
QiangCai created CARBONDATA-3239:
------------------------------------

             Summary: Throwing ArrayIndexOutOfBoundsException in DataSkewRangePartitioner
                 Key: CARBONDATA-3239
                 URL: https://issues.apache.org/jira/browse/CARBONDATA-3239
             Project: CarbonData
          Issue Type: Bug
          Components: data-load
            Reporter: QiangCai

2019-01-10 15:31:21 ERROR DataLoadProcessorStepOnSpark$:367 - Data Loading failed for table carbon_range_column4
java.lang.ArrayIndexOutOfBoundsException: 1
    at org.apache.spark.DataSkewRangePartitioner$$anonfun$initialize$1.apply$mcVI$sp(DataSkewRangePartitioner.scala:223)
    at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160)
    at org.apache.spark.DataSkewRangePartitioner.initialize(DataSkewRangePartitioner.scala:222)
    at org.apache.spark.DataSkewRangePartitioner.getPartition(DataSkewRangePartitioner.scala:234)
    at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:151)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
    at org.apache.spark.scheduler.Task.run(Task.scala:108)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] carbondata issue #2991: [CARBONDATA-3043] Add build script and add test case...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2991 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2247/ ---
[GitHub] carbondata pull request #2991: [CARBONDATA-3043] Add build script and add te...
Github user BJangir commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2991#discussion_r246647432

--- Diff: docs/csdk-guide.md ---
@@ -29,6 +29,32 @@ code and without CarbonSession. In the carbon jars package, there exist a carbondata-sdk.jar, including SDK reader for C++ SDK.
+
+##Compile/Build CSDK
+CSDK supports cmake based compilation and has dependency list in CMakeLists.txt.
+ Prerequisites
+GCC >=4.8.5
+Cmake >3.13
+Make >=4.1
+
+Steps
+1. Go to CSDK folder(/opt/.../CSDK/)
+2. Create build folder . (/opt/.../CSDK/build)
+3. Run Command from build folder `cmake ../`
+4. `make`
+
+Test Cases are written in [main.cpp](https://github.com/apache/carbondata/blob/master/store/CSDK/test/main.cpp) with GoogleTest C++ Framework.
+if GoogleTest LIBRARY is not added then compilation of example code will fail. Please follow below steps to solve the same
+1. Remove test/main.cpp from SOURCE_FILES of CMakeLists.txt and compile/build again.
+2. Follow below Steps to configure GoogleTest Framework
+* Download googleTest release (CI is complied with 1.8) https://github.com/google/googletest/releases
+* Extract to folder like /opt/googletest/googletest-release-1.8.1/ and create build folder inside this like /opt/googletest/googletest-release-1.8.1/googletest/build)
--- End diff --

updated, please review again. ---
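The numbered build steps quoted in the diff can be written as a shell sketch. The `/opt/.../CSDK/` path in the doc is intentionally elided, so `CSDK_DIR` below is an assumption that must be pointed at the real CSDK source folder; the `cmake`/`make` invocations themselves are the ones the doc specifies.

```shell
# Sketch of the cmake-based CSDK build described in the quoted doc diff.
# CSDK_DIR is a placeholder for the actual CSDK source folder.
CSDK_DIR="${CSDK_DIR:-/tmp/csdk-demo}"

mkdir -p "$CSDK_DIR/build"          # step 2: create the build folder
cd "$CSDK_DIR/build"
if [ -f ../CMakeLists.txt ]; then   # only build if the sources are really there
  cmake ../                         # step 3: generate Makefiles from CMakeLists.txt
  make                              # step 4: produces the CSDK executable
else
  echo "no CMakeLists.txt in $CSDK_DIR; adjust CSDK_DIR and re-run"
fi
```

This is the standard CMake out-of-source build layout, which keeps generated Makefiles out of the source tree.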
[GitHub] carbondata pull request #2991: [CARBONDATA-3043] Add build script and add te...
Github user BJangir commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2991#discussion_r246645671

--- Diff: docs/csdk-guide.md ---
@@ -40,6 +66,7 @@ release the memory and destroy JVM.
 C++ SDK support read batch row. User can set batch by using withBatch(int batch) before build, and read batch by using readNextBatchRow().
+
--- End diff --

OK ---
[GitHub] carbondata pull request #2991: [CARBONDATA-3043] Add build script and add te...
Github user BJangir commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2991#discussion_r246644993

--- Diff: docs/csdk-guide.md ---
@@ -29,6 +29,32 @@ code and without CarbonSession. In the carbon jars package, there exist a carbondata-sdk.jar, including SDK reader for C++ SDK.
+
+##Compile/Build CSDK
--- End diff --

OK ---
[GitHub] carbondata pull request #2991: [CARBONDATA-3043] Add build script and add te...
Github user BJangir commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2991#discussion_r246644367

--- Diff: docs/csdk-guide.md ---
@@ -29,6 +29,32 @@ code and without CarbonSession. In the carbon jars package, there exist a carbondata-sdk.jar, including SDK reader for C++ SDK.
+
+# Compile/Build CSDK
--- End diff --

OK. ---
[GitHub] carbondata pull request #2991: [CARBONDATA-3043] Add build script and add te...
Github user BJangir commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2991#discussion_r246643403

--- Diff: docs/csdk-guide.md ---
@@ -29,6 +29,32 @@ code and without CarbonSession. In the carbon jars package, there exist a carbondata-sdk.jar, including SDK reader for C++ SDK.
+
+##Compile/Build CSDK
+CSDK supports cmake based compilation and has dependency list in CMakeLists.txt.
+ Prerequisites
+GCC >=4.8.5
+Cmake >3.13
+Make >=4.1
+
+Steps
+1. Go to CSDK folder(/opt/.../CSDK/)
+2. Create build folder . (/opt/.../CSDK/build)
+3. Run Command from build folder `cmake ../`
+4. `make`
--- End diff --

It is the same as before, no change. After the `make` command you will get the executable program (named CSDK), so execute it directly: `./CSDK`. If the result should be redirected to XML, use a command like `./CSDK --gtest_output="xml:${REPORT_PATH}/CSDK_Report.xml"`. ---
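The execution flow described in the reply can be sketched as a shell session. `REPORT_PATH` and the working directory are placeholders; `--gtest_output` is GoogleTest's standard flag for emitting an XML report, as quoted in the reply itself.

```shell
# Sketch of running the built CSDK test binary, as described in the reply.
REPORT_PATH="${REPORT_PATH:-/tmp/csdk-reports}"
mkdir -p "$REPORT_PATH"

if [ -x ./CSDK ]; then
  ./CSDK                                                      # plain run
  ./CSDK --gtest_output="xml:${REPORT_PATH}/CSDK_Report.xml"  # run with XML report
else
  echo "CSDK binary not found in $(pwd); build it first with cmake and make"
fi
```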
[GitHub] carbondata issue #3060: [HOTFIX] Exclude filter doesn't work in presto carbo...
Github user ajantha-bhat commented on the issue: https://github.com/apache/carbondata/pull/3060 @qiuchenjian : Test cases are there, but the problem comes only in the cluster, not in the local environment, due to a jar dependency. ---
[GitHub] carbondata issue #3054: [CARBONDATA-3232] Add example and doc for alluxio in...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3054 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2465/ ---
[GitHub] carbondata issue #3054: [CARBONDATA-3232] Add example and doc for alluxio in...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3054 Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10503/ ---
[GitHub] carbondata issue #3001: [CARBONDATA-3220] Support presto to read stream segm...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3001 Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10502/ ---
[GitHub] carbondata issue #3001: [CARBONDATA-3220] Support presto to read stream segm...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3001 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2464/ ---
[GitHub] carbondata issue #3054: [CARBONDATA-3232] Add example and doc for alluxio in...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3054 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2246/ ---
[GitHub] carbondata issue #3060: [HOTFIX] Exclude filter doesn't work in presto carbo...
Github user qiuchenjian commented on the issue: https://github.com/apache/carbondata/pull/3060 Is there a test case to test this scene (using 'exclude filter' in presto carbon)? If not, it is better to add a test case, so that others' changes will not affect this feature. ---
[GitHub] carbondata issue #3001: [CARBONDATA-3220] Support presto to read stream segm...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3001 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2245/ ---
[GitHub] carbondata issue #3001: [CARBONDATA-3220] Support presto to read stream segm...
Github user zzcclp commented on the issue: https://github.com/apache/carbondata/pull/3001 retest this please ---
[GitHub] carbondata issue #3001: [CARBONDATA-3220] Support presto to read stream segm...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3001 Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2463/ ---
[GitHub] carbondata issue #3056: [CARBONDATA-3236] Fix for JVM Crash for insert into ...
Github user xuchuanyin commented on the issue: https://github.com/apache/carbondata/pull/3056

@manishnalla1994
> Solution: Check if any other RDD is sharing the same task context. If so, don't clear the resource at that time; the other RDD which shared the context should clear the memory once the task is finished.

It seems that in #2591, for the data source table scenario, if the query and insert procedures also share the same context, they can also benefit from the implementation in #2591 without any changes. Right? ---
[GitHub] carbondata issue #3046: [CARBONDATA-3231] Fix OOM exception when dictionary ...
Github user xuchuanyin commented on the issue: https://github.com/apache/carbondata/pull/3046

We do not need to expose this threshold to the user. Instead, CarbonData can judge by itself.
Step 1: get the size of the non-dictionary-encoded page (say M) and the size of the dictionary-encoded page (say N).
Step 2: if N/M >= 1 (or N/M >= 0.9), i.e. dictionary encoding is not saving enough space, we can fall back automatically.
Parquet (and maybe ORC) behaves like this. ---
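One reading of the heuristic above (fall back when the dictionary-encoded page is not sufficiently smaller than the plain page) can be sketched in code. This is a minimal illustration; the class name, method name, and 0.9 threshold are assumptions for the example, not CarbonData's actual API.

```java
public final class LocalDictFallback {
  // Assumed threshold: fall back when the dictionary-encoded page is at
  // least 90% of the plain-encoded page size (i.e. savings below ~10%).
  private static final double RATIO_THRESHOLD = 0.9;

  /**
   * Decide whether to abandon local dictionary encoding for a page.
   *
   * @param plainPageSizeBytes size of the non-dictionary-encoded page (M)
   * @param dictPageSizeBytes  size of the dictionary-encoded page (N)
   * @return true if the page should fall back to plain encoding
   */
  public static boolean shouldFallback(long plainPageSizeBytes, long dictPageSizeBytes) {
    if (plainPageSizeBytes <= 0) {
      return false; // nothing to compare against
    }
    double ratio = (double) dictPageSizeBytes / plainPageSizeBytes; // N / M
    return ratio >= RATIO_THRESHOLD;
  }
}
```

The advantage of a ratio-based check, as the comment argues, is that the engine measures the actual benefit per page instead of asking the user to guess a cardinality threshold up front.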
[GitHub] carbondata issue #3001: [CARBONDATA-3220] Support presto to read stream segm...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3001 Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10501/ ---
[GitHub] carbondata pull request #3054: [CARBONDATA-3232] Add example and doc for all...
Github user xuchuanyin commented on a diff in the pull request: https://github.com/apache/carbondata/pull/3054#discussion_r246427391

--- Diff: examples/spark2/src/main/scala/org/apache/carbondata/examples/AlluxioExample.scala ---
@@ -28,46 +33,86 @@ import org.apache.carbondata.examples.util.ExampleUtils
 /**
  * configure alluxio:
  * 1.start alluxio
- * 2.upload the jar :"/alluxio_path/core/client/target/
- * alluxio-core-client-YOUR-VERSION-jar-with-dependencies.jar"
- * 3.Get more detail at:http://www.alluxio.org/docs/master/en/Running-Spark-on-Alluxio.html
+ * 2.Get more detail at: https://www.alluxio.org/docs/1.8/en/compute/Spark.html
  */
-
 object AlluxioExample {
-  def main(args: Array[String]) {
-    val spark = ExampleUtils.createCarbonSession("AlluxioExample")
-    exampleBody(spark)
-    spark.close()
+  def main (args: Array[String]) {
+    val carbon = ExampleUtils.createCarbonSession("AlluxioExample",
+      storePath = "alluxio://localhost:19998/carbondata")
+    exampleBody(carbon)
+    carbon.close()
   }

-  def exampleBody(spark : SparkSession): Unit = {
+  def exampleBody (spark: SparkSession): Unit = {
+    val rootPath = new File(this.getClass.getResource("/").getPath
+      + "../../../..").getCanonicalPath
     spark.sparkContext.hadoopConfiguration.set("fs.alluxio.impl", "alluxio.hadoop.FileSystem")
--- End diff --

So you need to mention this in the current document ---
[GitHub] carbondata issue #3060: [HOTFIX] Exclude filter doesn't work in presto carbo...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3060 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2462/ ---
[GitHub] carbondata issue #3060: [HOTFIX] Exclude filter doesn't work in presto carbo...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3060 Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10500/ ---
[GitHub] carbondata issue #3001: [CARBONDATA-3220] Support presto to read stream segm...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3001 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2244/ ---
[GitHub] carbondata issue #3021: [CARBONDATA-3193] Cdh5.14.2 spark2.2.0 support
Github user chenliang613 commented on the issue: https://github.com/apache/carbondata/pull/3021 @chandrasaripaka please let us know if #3026 solved your issues. ---
[GitHub] carbondata issue #3060: [HOTFIX] Exclude filter doesn't work in presto carbo...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3060 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2243/ ---
[GitHub] carbondata issue #3060: [HOTFIX] Exclude filter doesn't work in presto carbo...
Github user ajantha-bhat commented on the issue: https://github.com/apache/carbondata/pull/3060 @ravipesala : please check. ---
[GitHub] carbondata issue #3001: [CARBONDATA-3220] Support presto to read stream segm...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3001 Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10499/ ---
[GitHub] carbondata pull request #3060: [HOTFIX] Exclude filter doesn't work in prest...
GitHub user ajantha-bhat opened a pull request: https://github.com/apache/carbondata/pull/3060

[HOTFIX] Exclude filter doesn't work in presto carbon in cluster

**Problem:** the exclude filter fails in the cluster for presto carbon with an exception:
```
java.lang.NoClassDefFoundError: org/roaringbitmap/RoaringBitmap
    at org.apache.carbondata.core.scan.filter.FilterUtil.prepareExcludeFilterMembers(FilterUtil.java:826)
    at org.apache.carbondata.core.scan.filter.FilterUtil.getDimColumnFilterInfoAfterApplyingCBO(FilterUtil.java:776)
    at org.apache.carbondata.core.scan.filter.FilterUtil.getFilterListForAllValues(FilterUtil.java:884)
```
**Cause:** the RoaringBitmap jar is not added in the dependencies, hence it is not present in the presto snapshot folder.

**Solution:** include RoaringBitmap in the dependencies.

Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily:
- [ ] Any interfaces changed? NA
- [ ] Any backward compatibility impacted? NA
- [ ] Document update required? NA
- [ ] Testing done. Please find the report.
  **Before:**
  ```
  presto:default> select name from nbig where name < 'aj' limit 5;
  Query 20190109_131447_4_qhrfk failed: org/roaringbitmap/RoaringBitmap
  ```
  **After:**
  ```
  presto:default> select name from nbig where name < 'aj' limit 5;
   name
   208
   150209
   150210
   150211
   150212
  (5 rows)
  ```
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. NA

You can merge this pull request into a Git repository by running:
    $ git pull https://github.com/ajantha-bhat/carbondata issue_fix
Alternatively you can review and apply these changes as the patch at:
    https://github.com/apache/carbondata/pull/3060.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
    This closes #3060

commit b3041adb284a0aa30a64e55908e6b9904c29
Author: ajantha-bhat
Date: 2019-01-09T13:26:10Z

    Fix Roaring bit map exception in presto filter query ---
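The stated solution amounts to a dependency declaration in the presto integration module's pom.xml. A sketch of such an entry follows; the `${roaringbitmap.version}` property name is an assumption, and the version should match whatever carbondata-core already pulls in.

```xml
<!-- RoaringBitmap is needed at runtime by FilterUtil's exclude-filter path -->
<dependency>
  <groupId>org.roaringbitmap</groupId>
  <artifactId>RoaringBitmap</artifactId>
  <version>${roaringbitmap.version}</version> <!-- assumed property name -->
</dependency>
```

Declaring the jar in the module's own dependencies ensures the Maven assembly copies it into the presto plugin snapshot folder, which is what was missing in the cluster.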
[GitHub] carbondata issue #3001: [CARBONDATA-3220] Support presto to read stream segm...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3001 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2242/ ---
[jira] [Resolved] (CARBONDATA-3237) optimize presto query time for dictionary include string column
[ https://issues.apache.org/jira/browse/CARBONDATA-3237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

kumar vishal resolved CARBONDATA-3237.
--------------------------------------
    Resolution: Fixed

> optimize presto query time for dictionary include string column
>
>                 Key: CARBONDATA-3237
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-3237
>             Project: CarbonData
>          Issue Type: Bug
>            Reporter: Ajantha Bhat
>            Assignee: Ajantha Bhat
>            Priority: Major
>          Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> optimize presto query time for dictionary include string column.
>
> Problem: currently, for each query, presto carbon creates a dictionary block
> for string columns. This happens on every query and, if cardinality is high,
> it takes more time to build. This is not required; we can look up values
> using a normal dictionary lookup.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] carbondata pull request #3059: [HOTFIX][DataLoad]fix task assignment issue u...
Github user kevinjmh commented on a diff in the pull request: https://github.com/apache/carbondata/pull/3059#discussion_r246376652

--- Diff: processing/src/main/java/org/apache/carbondata/processing/util/CarbonLoaderUtil.java ---
@@ -609,6 +609,14 @@ public static Dictionary getDictionary(AbsoluteTableIdentifier absoluteTableIden
   blockAssignmentStrategy = BlockAssignmentStrategy.BLOCK_SIZE_FIRST;
 } else {
   blockAssignmentStrategy = BlockAssignmentStrategy.BLOCK_NUM_FIRST;
+  // fall back to BLOCK_NUM_FIRST strategy need to reset
+  // the average expected size for each node
+  if (blockInfos.size() > 0) {
--- End diff --

It could already be set to some value if NODE_MIN_SIZE_FIRST was used but we fall back to BLOCK_NUM_FIRST. ---
[GitHub] carbondata issue #3059: [HOTFIX][DataLoad]fix task assignment issue using NO...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3059 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2241/ ---
[GitHub] carbondata issue #3059: [HOTFIX][DataLoad]fix task assignment issue using NO...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3059 Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10497/ ---
[GitHub] carbondata issue #3054: [CARBONDATA-3232] Add example and doc for alluxio in...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3054 Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10498/ ---
[GitHub] carbondata issue #3055: [CARBONDATA-3237] Fix presto carbon issues in dictio...
Github user kumarvishal09 commented on the issue: https://github.com/apache/carbondata/pull/3055 LGTM ---
[GitHub] carbondata pull request #3055: [CARBONDATA-3237] Fix presto carbon issues in...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/3055 ---
[jira] [Resolved] (CARBONDATA-3200) No-Sort Compaction
[ https://issues.apache.org/jira/browse/CARBONDATA-3200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

kumar vishal resolved CARBONDATA-3200.
--------------------------------------
    Resolution: Fixed

> No-Sort Compaction
>
>                 Key: CARBONDATA-3200
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-3200
>             Project: CarbonData
>          Issue Type: New Feature
>          Components: core
>            Reporter: Naman Rastogi
>            Assignee: Naman Rastogi
>            Priority: Major
>          Time Spent: 14h
>  Remaining Estimate: 0h
>
> When the data is loaded with SORT_SCOPE as NO_SORT and compaction is done
> on it, the data still remains unsorted. This does not affect queries much.
> The major purpose of compaction is to pack the data better and improve
> query performance.
>
> Now, the expected behaviour of compaction is to sort the data, so that
> after compaction, query performance becomes better. The columns to sort on
> are provided by SORT_COLUMNS.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] carbondata pull request #3029: [CARBONDATA-3200] No-Sort compaction
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/3029 ---
[GitHub] carbondata issue #3029: [CARBONDATA-3200] No-Sort compaction
Github user kumarvishal09 commented on the issue: https://github.com/apache/carbondata/pull/3029 LGTM ---
[GitHub] carbondata issue #3059: [HOTFIX][DataLoad]fix task assignment issue using NO...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3059 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2460/ ---
[GitHub] carbondata issue #3029: [CARBONDATA-3200] No-Sort compaction
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3029 Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10496/ ---
[GitHub] carbondata issue #3001: [CARBONDATA-3220] Support presto to read stream segm...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3001 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2461/ ---
[GitHub] carbondata issue #3029: [CARBONDATA-3200] No-Sort compaction
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3029 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2240/ ---
[GitHub] carbondata issue #3055: [CARBONDATA-3237] Fix presto carbon issues in dictio...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3055 Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10495/ ---
[GitHub] carbondata issue #3046: [CARBONDATA-3231] Fix OOM exception when dictionary ...
Github user kumarvishal09 commented on the issue: https://github.com/apache/carbondata/pull/3046 @xuchuanyin In the near future we are planning to change the threshold (currently based on the number of entries) to a size-based local dictionary. A size-based threshold will give more control, and the current changes in the PR help with doing that. Later we just have to expose the table property in the create table command for the user to control the size threshold. Also, I didn't get the meaning of your comment: these changes are minimal now as well. ---
[GitHub] carbondata pull request #3059: [HOTFIX][DataLoad]fix task assignment issue u...
Github user qiuchenjian commented on a diff in the pull request: https://github.com/apache/carbondata/pull/3059#discussion_r246352937

--- Diff: processing/src/main/java/org/apache/carbondata/processing/util/CarbonLoaderUtil.java ---
@@ -609,6 +609,14 @@ public static Dictionary getDictionary(AbsoluteTableIdentifier absoluteTableIden
   blockAssignmentStrategy = BlockAssignmentStrategy.BLOCK_SIZE_FIRST;
 } else {
   blockAssignmentStrategy = BlockAssignmentStrategy.BLOCK_NUM_FIRST;
+  // fall back to BLOCK_NUM_FIRST strategy need to reset
+  // the average expected size for each node
+  if (blockInfos.size() > 0) {
--- End diff --

```suggestion
  if (numOfNodes > 0) {
```
If blockInfos.size() == 0, sizePerNode will be 0, so there is no need to add the if ... else ... Does numOfNodes also need to be checked for 0? ---
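The review concerns guarding the per-node average when falling back to a block-count-based strategy. A standalone sketch of the guarded arithmetic follows; `totalBlockSize`, the class, and the method name are illustrative assumptions, not CarbonLoaderUtil's actual fields.

```java
// Illustrative sketch of recomputing the average expected data size per node
// when falling back to a block-count-based assignment strategy.
public final class NodeSizeCalc {
  /**
   * @param totalBlockSize total size of all blocks to distribute, in bytes
   * @param numOfNodes     number of nodes participating in the load
   * @return expected size per node, 0 if there is nothing to distribute
   */
  public static long sizePerNode(long totalBlockSize, int numOfNodes) {
    if (numOfNodes <= 0) {
      return 0L; // no nodes: nothing to distribute, avoid division by zero
    }
    // round up so the last node is not systematically overloaded
    return (totalBlockSize + numOfNodes - 1) / numOfNodes;
  }
}
```

The guard on `numOfNodes` mirrors the reviewer's question: dividing by zero is the real hazard, while a zero block size merely yields a zero average.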
[GitHub] carbondata issue #3054: [CARBONDATA-3232] Add example and doc for alluxio in...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3054 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2239/ ---
[jira] [Resolved] (CARBONDATA-3236) JVM Crash for insert into new table from old table
[ https://issues.apache.org/jira/browse/CARBONDATA-3236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

kumar vishal resolved CARBONDATA-3236.
--------------------------------------
    Resolution: Fixed

> JVM Crash for insert into new table from old table
>
>                 Key: CARBONDATA-3236
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-3236
>             Project: CarbonData
>          Issue Type: Bug
>            Reporter: MANISH NALLA
>            Assignee: MANISH NALLA
>            Priority: Minor
>          Time Spent: 1h 40m
>  Remaining Estimate: 0h

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] carbondata pull request #3056: [CARBONDATA-3236] Fix for JVM Crash for inser...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/3056 ---
[GitHub] carbondata issue #3055: [CARBONDATA-3237] Fix presto carbon issues in dictio...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3055 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2238/ ---
[GitHub] carbondata issue #3056: [CARBONDATA-3236] Fix for JVM Crash for insert into ...
Github user kumarvishal09 commented on the issue: https://github.com/apache/carbondata/pull/3056 LGTM ---
[GitHub] carbondata issue #3054: [CARBONDATA-3232] Add example and doc for alluxio in...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3054 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2459/ ---
[GitHub] carbondata issue #3058: [WIP][CARBONDATA-3238] Solve StackOverflowError usin...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3058 Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10494/ ---
[GitHub] carbondata issue #3056: [CARBONDATA-3236] Fix for JVM Crash for insert into ...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/3056 LGTM ---
[GitHub] carbondata issue #3055: [CARBONDATA-3237] Fix presto carbon issues in dictio...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3055 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2458/ ---
[GitHub] carbondata issue #3001: [CARBONDATA-3220] Support presto to read stream segm...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/3001 LGTM ---
[GitHub] carbondata issue #3001: [CARBONDATA-3220] Support presto to read stream segm...
Github user QiangCai commented on the issue: https://github.com/apache/carbondata/pull/3001 @ravipesala reverted ---
[GitHub] carbondata issue #3037: [CARBONDATA-3190] Open example module code style che...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3037 Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10493/ ---
[GitHub] carbondata issue #3001: [CARBONDATA-3220] Support presto to read stream segm...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/3001 @QiangCai Please don't add binary files :(. You are supposed to generate the files and execute the test. ---
[GitHub] carbondata issue #3029: [CARBONDATA-3200] No-Sort compaction
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3029 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2457/ ---
[GitHub] carbondata issue #3055: [CARBONDATA-3237] Fix presto carbon issues in dictio...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/3055 LGTM ---
[GitHub] carbondata issue #3037: [CARBONDATA-3190] Open example module code style che...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3037 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2237/ ---
[GitHub] carbondata pull request #3029: [CARBONDATA-3200] No-Sort compaction
Github user NamanRastogi commented on a diff in the pull request: https://github.com/apache/carbondata/pull/3029#discussion_r246329602

--- Diff: processing/src/main/java/org/apache/carbondata/processing/merger/CarbonCompactionUtil.java ---
@@ -400,24 +417,53 @@ private static int getDimensionDefaultCardinality(CarbonDimension dimension) {
    * @param tableLastUpdatedTime
    * @return
    */
-  public static boolean checkIfAnyRestructuredBlockExists(Map segmentMapping,
-      Map> dataFileMetadataSegMapping, long tableLastUpdatedTime) {
-    boolean restructuredBlockExists = false;
-    for (Map.Entry taskMap : segmentMapping.entrySet()) {
-      String segmentId = taskMap.getKey();
+  public static boolean checkIfAnyRestructuredBlockExists(
+      Map segmentMapping,
+      Map> dataFileMetadataSegMapping,
+      long tableLastUpdatedTime) {
+
+    for (Map.Entry segmentEntry : segmentMapping.entrySet()) {
+      String segmentId = segmentEntry.getKey();
       List listMetadata = dataFileMetadataSegMapping.get(segmentId);
-      for (DataFileFooter dataFileFooter : listMetadata) {
-        // if schema modified timestamp is greater than footer stored schema timestamp,
-        // it indicates it is a restructured block
-        if (tableLastUpdatedTime > dataFileFooter.getSchemaUpdatedTimeStamp()) {
-          restructuredBlockExists = true;
-          break;
-        }
+
+      if (isRestructured(listMetadata, tableLastUpdatedTime)) {
+        return true;
       }
-      if (restructuredBlockExists) {
-        break;
+    }
+
+    return false;
+  }
+
+  public static boolean isRestructured(List listMetadata,
+      long tableLastUpdatedTime) {
+    /*
+     * TODO: only in case of add and drop this variable should be true
+     */
+    for (DataFileFooter dataFileFooter : listMetadata) {
+      // if schema modified timestamp is greater than footer stored schema timestamp,
+      // it indicates it is a restructured block
+      if (tableLastUpdatedTime > dataFileFooter.getSchemaUpdatedTimeStamp()) {
+        return true;
       }
     }
-    return restructuredBlockExists;
+    return false;
   }
+
+  public static boolean isSorted(TaskBlockInfo taskBlockInfo) throws IOException {
+    String filePath =
+        taskBlockInfo.getAllTableBlockInfoList().iterator().next().get(0).getFilePath();
+    long fileSize =
+        FileFactory.getCarbonFile(filePath, FileFactory.getFileType(filePath)).getSize();
+
+    FileReader fileReader = FileFactory.getFileHolder(FileFactory.getFileType(filePath));
+    ByteBuffer buffer =
+        fileReader.readByteBuffer(FileFactory.getUpdatedFilePath(filePath), fileSize - 8, 8);
+    fileReader.finish();
+
+    CarbonFooterReaderV3 footerReader = new CarbonFooterReaderV3(filePath, buffer.getLong());
+    FileFooter3 footer = footerReader.readFooterVersion3();
+
+    return footer.isIs_sort();
--- End diff --

Done. ---
[GitHub] carbondata pull request #3029: [CARBONDATA-3200] No-Sort compaction
Github user NamanRastogi commented on a diff in the pull request: https://github.com/apache/carbondata/pull/3029#discussion_r246329142 --- Diff: processing/src/main/java/org/apache/carbondata/processing/merger/CarbonCompactionExecutor.java --- @@ -105,10 +105,15 @@ public CarbonCompactionExecutor(Map segmentMapping, * * @return List of Carbon iterators --- End diff -- Done ---
[GitHub] carbondata issue #3001: [CARBONDATA-3220] Support presto to read stream segm...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3001 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2453/ ---
[GitHub] carbondata issue #3058: [WIP][CARBONDATA-3238] Solve StackOverflowError usin...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3058 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2456/ ---
[GitHub] carbondata issue #3029: [CARBONDATA-3200] No-Sort compaction
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/3029 LGTM ---
[GitHub] carbondata issue #3058: [WIP][CARBONDATA-3238] Solve StackOverflowError usin...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3058 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2236/ ---
[GitHub] carbondata issue #3058: [WIP][CARBONDATA-3238] Solve StackOverflowError usin...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3058 Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10490/ ---
[GitHub] carbondata issue #3037: [CARBONDATA-3190] Open example module code style che...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3037 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2455/ ---
[GitHub] carbondata issue #3001: [CARBONDATA-3220] Support presto to read stream segm...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3001 Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10491/ ---
[GitHub] carbondata pull request #3055: [CARBONDATA-3237] Fix presto carbon issues in...
Github user ajantha-bhat commented on a diff in the pull request: https://github.com/apache/carbondata/pull/3055#discussion_r246319596

--- Diff: integration/presto/src/main/java/org/apache/carbondata/presto/readers/SliceStreamReader.java ---
@@ -95,22 +105,14 @@ public SliceStreamReader(int batchSize, DataType dataType,
       dictOffsets[dictOffsets.length - 1] = size;
       dictionaryBlock = new VariableWidthBlock(dictionary.getDictionarySize(),
           Slices.wrappedBuffer(singleArrayDictValues), dictOffsets, Optional.of(nulls));
-      values = (int[]) ((CarbonColumnVectorImpl) getDictionaryVector()).getDataArray();
+      this.isLocalDict = true;
     }

-  @Override public void setBatchSize(int batchSize) {
+
--- End diff --

done

---
[GitHub] carbondata pull request #3055: [CARBONDATA-3237] Fix presto carbon issues in...
Github user ajantha-bhat commented on a diff in the pull request: https://github.com/apache/carbondata/pull/3055#discussion_r246318548

--- Diff: integration/presto/src/main/java/org/apache/carbondata/presto/readers/SliceStreamReader.java ---
@@ -142,5 +144,17 @@ public SliceStreamReader(int batchSize, DataType dataType,
   @Override public void reset() {
     builder = type.createBlockBuilder(null, batchSize);
+    this.isLocalDict = false;
+  }
+
+  @Override public void putInt(int rowId, int value) {
+    Object data = DataTypeUtil
--- End diff --

putInt() will not be called in the case of local dictionary, as setDictionary() itself fills the entire values array. Hence this change has no impact on local dictionary. Also, local dictionary UTs are present and running fine after the changes.

---
[GitHub] carbondata pull request #3059: [HOTFIX][DataLoad]fix task assignment issue u...
Github user KanakaKumar commented on a diff in the pull request: https://github.com/apache/carbondata/pull/3059#discussion_r246317841

--- Diff: processing/src/main/java/org/apache/carbondata/processing/util/CarbonLoaderUtil.java ---
@@ -609,6 +613,9 @@ public static Dictionary getDictionary(AbsoluteTableIdentifier absoluteTableIden
         blockAssignmentStrategy = BlockAssignmentStrategy.BLOCK_SIZE_FIRST;
       } else {
         blockAssignmentStrategy = BlockAssignmentStrategy.BLOCK_NUM_FIRST;
+        // fall back to BLOCK_NUM_FIRST strategy need to reset
+        // the average expected size for each node
+        sizePerNode = numberOfBlocksPerNode;
--- End diff --

assignLeftOverBlocks also needs similar if/else self-checks. I think it's OK; you can take a call on whether to refactor now or later.

---
[GitHub] carbondata pull request #3059: [HOTFIX][DataLoad]fix task assignment issue u...
Github user ndwangsen commented on a diff in the pull request: https://github.com/apache/carbondata/pull/3059#discussion_r246317595

--- Diff: processing/src/main/java/org/apache/carbondata/processing/util/CarbonLoaderUtil.java ---
@@ -575,19 +575,23 @@ public static Dictionary getDictionary(AbsoluteTableIdentifier absoluteTableIden
     }

     // calculate the average expected size for each node
-    long sizePerNode = 0;
+    long numberOfBlocksPerNode = 0;
+    if (blockInfos.size() > 0) {
+      numberOfBlocksPerNode = blockInfos.size() / numOfNodes;
+    }
+    numberOfBlocksPerNode = numberOfBlocksPerNode <= 0 ? 1 : numberOfBlocksPerNode;
+    long dataSizePerNode = 0;
     long totalFileSize = 0;
+    for (Distributable blockInfo : uniqueBlocks) {
+      totalFileSize += ((TableBlockInfo) blockInfo).getBlockLength();
+    }
+    dataSizePerNode = totalFileSize / numOfNodes;
+    long sizePerNode = 0;
     if (BlockAssignmentStrategy.BLOCK_NUM_FIRST == blockAssignmentStrategy) {
-      if (blockInfos.size() > 0) {
-        sizePerNode = blockInfos.size() / numOfNodes;
-      }
-      sizePerNode = sizePerNode <= 0 ? 1 : sizePerNode;
+      sizePerNode = numberOfBlocksPerNode;
--- End diff --

I think this modification is OK when using the BLOCK_NUM_FIRST block assignment strategy.

---
[GitHub] carbondata issue #3059: [HOTFIX][DataLoad]fix task assignment issue using NO...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3059 Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10492/ ---
[GitHub] carbondata issue #3056: [CARBONDATA-3236] Fix for JVM Crash for insert into ...
Github user manishnalla1994 commented on the issue: https://github.com/apache/carbondata/pull/3056

@xuchuanyin The datasource table uses the direct-fill flow. In the direct flow there is no intermediate buffer, so we do not use off-heap memory to store the page data (all records of a page are filled into the vector directly instead of batch-wise). In this case we can remove the freeing of unsafe memory for the query, as it is not required. For a stored-by table the handling is different: since we support both batch-wise filling and direct filling, and batch filling uses unsafe memory, we have to clear unsafe memory in that case. The same handling is not required for a datasource table. Please refer to https://github.com/apache/carbondata/pull/2591 for the stored-by handling of this issue.

---
[GitHub] carbondata issue #3059: [HOTFIX][DataLoad]fix task assignment issue using NO...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3059 Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2454/ ---
[GitHub] carbondata pull request #3059: [HOTFIX][DataLoad]fix task assignment issue u...
Github user KanakaKumar commented on a diff in the pull request: https://github.com/apache/carbondata/pull/3059#discussion_r246311819

--- Diff: processing/src/main/java/org/apache/carbondata/processing/util/CarbonLoaderUtil.java ---
@@ -575,19 +575,23 @@ public static Dictionary getDictionary(AbsoluteTableIdentifier absoluteTableIden
     }

     // calculate the average expected size for each node
-    long sizePerNode = 0;
+    long numberOfBlocksPerNode = 0;
+    if (blockInfos.size() > 0) {
+      numberOfBlocksPerNode = blockInfos.size() / numOfNodes;
+    }
+    numberOfBlocksPerNode = numberOfBlocksPerNode <= 0 ? 1 : numberOfBlocksPerNode;
+    long dataSizePerNode = 0;
     long totalFileSize = 0;
+    for (Distributable blockInfo : uniqueBlocks) {
+      totalFileSize += ((TableBlockInfo) blockInfo).getBlockLength();
+    }
+    dataSizePerNode = totalFileSize / numOfNodes;
+    long sizePerNode = 0;
     if (BlockAssignmentStrategy.BLOCK_NUM_FIRST == blockAssignmentStrategy) {
-      if (blockInfos.size() > 0) {
-        sizePerNode = blockInfos.size() / numOfNodes;
-      }
-      sizePerNode = sizePerNode <= 0 ? 1 : sizePerNode;
+      sizePerNode = numberOfBlocksPerNode;
--- End diff --

This if/else can be completely avoided by using the correct variable in the method call for block allocation.

---
[GitHub] carbondata pull request #3059: [HOTFIX][DataLoad]fix task assignment issue u...
Github user KanakaKumar commented on a diff in the pull request: https://github.com/apache/carbondata/pull/3059#discussion_r246311168

--- Diff: processing/src/main/java/org/apache/carbondata/processing/util/CarbonLoaderUtil.java ---
@@ -609,6 +613,9 @@ public static Dictionary getDictionary(AbsoluteTableIdentifier absoluteTableIden
         blockAssignmentStrategy = BlockAssignmentStrategy.BLOCK_SIZE_FIRST;
       } else {
         blockAssignmentStrategy = BlockAssignmentStrategy.BLOCK_NUM_FIRST;
+        // fall back to BLOCK_NUM_FIRST strategy need to reset
+        // the average expected size for each node
+        sizePerNode = numberOfBlocksPerNode;
--- End diff --

Instead of reassigning the same variable, can assignBlocksByDataLocality() use numberOfBlocksPerNode directly?

---
[GitHub] carbondata pull request #3059: [HOTFIX][DataLoad]fix task assignment issue u...
Github user KanakaKumar commented on a diff in the pull request: https://github.com/apache/carbondata/pull/3059#discussion_r246309331

--- Diff: processing/src/main/java/org/apache/carbondata/processing/util/CarbonLoaderUtil.java ---
@@ -575,19 +575,23 @@ public static Dictionary getDictionary(AbsoluteTableIdentifier absoluteTableIden
     }

     // calculate the average expected size for each node
-    long sizePerNode = 0;
+    long numberOfBlocksPerNode = 0;
+    if (blockInfos.size() > 0) {
+      numberOfBlocksPerNode = blockInfos.size() / numOfNodes;
+    }
+    numberOfBlocksPerNode = numberOfBlocksPerNode <= 0 ? 1 : numberOfBlocksPerNode;
+    long dataSizePerNode = 0;
     long totalFileSize = 0;
+    for (Distributable blockInfo : uniqueBlocks) {
+      totalFileSize += ((TableBlockInfo) blockInfo).getBlockLength();
+    }
+    dataSizePerNode = totalFileSize / numOfNodes;
+    long sizePerNode = 0;
     if (BlockAssignmentStrategy.BLOCK_NUM_FIRST == blockAssignmentStrategy) {
-      if (blockInfos.size() > 0) {
-        sizePerNode = blockInfos.size() / numOfNodes;
-      }
-      sizePerNode = sizePerNode <= 0 ? 1 : sizePerNode;
+      sizePerNode = numberOfBlocksPerNode;
--- End diff --

Please don't change sizePerNode variable

---
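For reference, the two averages being debated in the preceding review comments reduce to simple integer arithmetic: a floored blocks-per-node count clamped to a minimum of 1 (BLOCK_NUM_FIRST), and total block bytes divided by node count (BLOCK_SIZE_FIRST). A standalone sketch of that logic, with hypothetical class and method names rather than the real CarbonLoaderUtil signatures:

```java
public class BlockAssignmentMath {

  // Average block count per node, floored, with a minimum of 1 block
  // (the BLOCK_NUM_FIRST case in the diff above).
  public static long avgBlocksPerNode(int blockCount, int numOfNodes) {
    long perNode = blockCount > 0 ? blockCount / numOfNodes : 0;
    return perNode <= 0 ? 1 : perNode;
  }

  // Average data size per node (the BLOCK_SIZE_FIRST case): the sum of
  // the unique block lengths divided by the node count.
  public static long avgDataSizePerNode(long[] blockLengths, int numOfNodes) {
    long totalFileSize = 0;
    for (long length : blockLengths) {
      totalFileSize += length;
    }
    return totalFileSize / numOfNodes;
  }

  public static void main(String[] args) {
    // 10 blocks over 3 nodes -> 3 blocks per node (integer division).
    System.out.println(avgBlocksPerNode(10, 3)); // prints 3
    // 2 blocks over 3 nodes -> floored to 0, clamped to the 1-block minimum.
    System.out.println(avgBlocksPerNode(2, 3)); // prints 1
    // 300 bytes over 3 nodes -> 100 bytes per node.
    System.out.println(avgDataSizePerNode(new long[] {100, 120, 80}, 3)); // prints 100
  }
}
```

Computing both values up front, as the patch does, lets the fallback branch switch strategies without recomputing; the reviewers' point is only about which variable name should carry each result.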
[GitHub] carbondata issue #3053: [CARBONDATA-3233]Fix JVM crash issue in snappy compr...
Github user akashrn5 commented on the issue: https://github.com/apache/carbondata/pull/3053

> Does this PR fix two problems?
> If it is yes, better to separate it into two.

The one-line change of rowId to rowId + 1 is coupled with this: when I removed the compress method in unSafeFixLengthColumnPage, I got this issue and fixed it here, so it is required in this PR only.

---
[GitHub] carbondata issue #3053: [CARBONDATA-3233]Fix JVM crash issue in snappy compr...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/3053

@kumarvishal09 ...I agree with you that it is a functional issue and we need to merge it. My point was that before merging we can do one load performance test to see if there is any performance degradation, and if there is, we can update the benchmark results.

---
[GitHub] carbondata issue #3059: [HOTFIX][DataLoad]fix task assignment issue using NO...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3059 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2235/ ---
[GitHub] carbondata issue #3037: [CARBONDATA-3190] Open example module code style che...
Github user xubo245 commented on the issue: https://github.com/apache/carbondata/pull/3037 retest this please ---
[GitHub] carbondata issue #3058: [WIP][CARBONDATA-3238] Solve StackOverflowError usin...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3058 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2233/ ---
[GitHub] carbondata issue #3001: [CARBONDATA-3220] Support presto to read stream segm...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3001 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2234/ ---
[GitHub] carbondata issue #3053: [CARBONDATA-3233]Fix JVM crash issue in snappy compr...
Github user akashrn5 commented on the issue: https://github.com/apache/carbondata/pull/3053

@kumarvishal09 I have tested the fallback scenario by changing the code; it is failing with that as well. I have also raised a discussion in the snappy community: https://groups.google.com/forum/#!topic/snappy-compression/4noNVKCMBqM

---
[GitHub] carbondata issue #3054: [CARBONDATA-3232] Add example and doc for alluxio in...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3054 Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10489/ ---
[GitHub] carbondata issue #3053: [CARBONDATA-3233]Fix JVM crash issue in snappy compr...
Github user kumarvishal09 commented on the issue: https://github.com/apache/carbondata/pull/3053

@manishgupta88 @xuchuanyin I think if it's really a problem with snappy, then whether there is any performance impact or not, we have to merge it as it is a functional issue. :) @akashrn5 Maybe this issue is coming because of the off-heap to on-heap fallback in UnsafeMemoryManager; can you please verify once? Please also try discussing with the snappy community.

---
[GitHub] carbondata pull request #3059: [HOTFIX][DataLoad]fix task assignment issue u...
Github user ndwangsen commented on a diff in the pull request: https://github.com/apache/carbondata/pull/3059#discussion_r246299802

--- Diff: processing/src/main/java/org/apache/carbondata/processing/util/CarbonLoaderUtil.java ---
@@ -1164,4 +1156,35 @@ private static void deleteFiles(List filesToBeDeleted) throws IOExceptio
       FileFactory.deleteFile(filePath, FileFactory.getFileType(filePath));
     }
   }
+
+  /**
+   * This method will calculate the average expected size for each node
+   *
+   * @param blockInfos blocks
+   * @param uniqueBlocks unique blocks
+   * @param numOfNodes if number of nodes has to be decided
+   *                   based on block location information
+   * @param blockAssignmentStrategy strategy used to assign blocks
+   * @return the average expected size for each node
+   */
+  private static long calcAvgLoadSizePerNode(List blockInfos,
--- End diff --

OK, I will modify it.

---
[GitHub] carbondata pull request #3059: [HOTFIX][DataLoad]fix task assignment issue u...
Github user ndwangsen commented on a diff in the pull request: https://github.com/apache/carbondata/pull/3059#discussion_r246299700

--- Diff: processing/src/main/java/org/apache/carbondata/processing/util/CarbonLoaderUtil.java ---
@@ -609,6 +597,10 @@ public static Dictionary getDictionary(AbsoluteTableIdentifier absoluteTableIden
         blockAssignmentStrategy = BlockAssignmentStrategy.BLOCK_SIZE_FIRST;
       } else {
         blockAssignmentStrategy = BlockAssignmentStrategy.BLOCK_NUM_FIRST;
+        // fall back to BLOCK_NUM_FIRST strategy need to recalculate
+        // the average expected size for each node
+        sizePerNode = calcAvgLoadSizePerNode(blockInfos,uniqueBlocks,
--- End diff --

OK, I will modify it.

---
[GitHub] carbondata issue #3001: [CARBONDATA-3220] Support presto to read stream segm...
Github user QiangCai commented on the issue: https://github.com/apache/carbondata/pull/3001 @ravipesala added test case for reading stream table ---
[jira] [Resolved] (CARBONDATA-3235) AlterTableRename and PreAgg Datamap Fail Issue
[ https://issues.apache.org/jira/browse/CARBONDATA-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

kumar vishal resolved CARBONDATA-3235.
--------------------------------------
    Resolution: Fixed

> AlterTableRename and PreAgg Datamap Fail Issue
> ----------------------------------------------
>
>                 Key: CARBONDATA-3235
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-3235
>             Project: CarbonData
>          Issue Type: Bug
>            Reporter: Naman Rastogi
>            Assignee: Naman Rastogi
>            Priority: Minor
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> h3. Alter Table Rename Table Fail
> * When the table rename succeeds in hive but fails in the carbon data store, it would throw an exception but would not go back and undo the rename in hive.
> h3. Create-Preaggregate-Datamap Fail
> * When the (preaggregate) datamap schema is written but the table update fails
> -> call CarbonDropDataMapCommand.processMetadata()
> -> call dropDataMapFromSystemFolder() -> this is supposed to delete the folder on disk, but doesn't, as the datamap is not yet updated in the table, and throws NoSuchDataMapException

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] carbondata pull request #2996: [CARBONDATA-3235] Fix Rename-Fail & Datamap-c...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/2996 ---
[GitHub] carbondata issue #2996: [CARBONDATA-3235] Fix Rename-Fail & Datamap-creation...
Github user kumarvishal09 commented on the issue: https://github.com/apache/carbondata/pull/2996 LGTM ---
[GitHub] carbondata pull request #3032: [CARBONDATA-3210] Merge common method into Ca...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/3032 ---