[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3974: [Carbondata-3999] Fix permission issue of indexServerTmp directory
CarbonDataQA1 commented on pull request #3974: URL: https://github.com/apache/carbondata/pull/3974#issuecomment-706011556 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2593/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3974: [Carbondata-3999] Fix permission issue of indexServerTmp directory
CarbonDataQA1 commented on pull request #3974: URL: https://github.com/apache/carbondata/pull/3974#issuecomment-706011857 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4343/
[jira] [Created] (CARBONDATA-4025) strage size of MV is double to that of a table.
suyash yadav created CARBONDATA-4025:

Summary: strage size of MV is double to that of a table.
Key: CARBONDATA-4025
URL: https://issues.apache.org/jira/browse/CARBONDATA-4025
Project: CarbonData
Issue Type: Improvement
Components: core
Affects Versions: 2.0.1
Environment: Apache carbondata 2.0.1, Apache spark 2.4.5, Hadoop 2.7.2
Reporter: suyash yadav

We are doing a POC based on CarbonData, and we have observed that when we create an MV on a table with a timeseries function of the same granularity, the MV takes double the space of the table.

In my scenario, the table has 1.3 million records and the MV has the same number of records, but the size of the table is 3.6 MB while the size of the MV is around 6.5 MB.

This is really important for us, as critical business decisions are being affected by this behaviour.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CARBONDATA-4025) storage space for MV is double to that of a table on which MV has been created.
[ https://issues.apache.org/jira/browse/CARBONDATA-4025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

suyash yadav updated CARBONDATA-4025:
Summary: storage space for MV is double to that of a table on which MV has been created. (was: strage size of MV is double to that of a table.)

> storage space for MV is double to that of a table on which MV has been created.
>
> Key: CARBONDATA-4025
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4025
> Project: CarbonData
> Issue Type: Improvement
> Components: core
> Affects Versions: 2.0.1
> Environment: Apache carbondata 2.0.1, Apache spark 2.4.5, Hadoop 2.7.2
> Reporter: suyash yadav
> Priority: Major
>
> We are doing a POC based on CarbonData, and we have observed that when we create an MV on a table with a timeseries function of the same granularity, the MV takes double the space of the table.
>
> In my scenario, the table has 1.3 million records and the MV has the same number of records, but the size of the table is 3.6 MB while the size of the MV is around 6.5 MB.
>
> This is really important for us, as critical business decisions are being affected by this behaviour.
[GitHub] [carbondata] marchpure commented on pull request #3934: [WIP] Support Global Unique Id for SegmentNo
marchpure commented on pull request #3934: URL: https://github.com/apache/carbondata/pull/3934#issuecomment-706046465 retest this please
[GitHub] [carbondata] Karan980 commented on pull request #3974: [Carbondata-3999] Fix permission issue of indexServerTmp directory
Karan980 commented on pull request #3974: URL: https://github.com/apache/carbondata/pull/3974#issuecomment-706061668 retest this please
[GitHub] [carbondata] kunal642 commented on pull request #3914: [CARBONDATA-3979] Added Hive local dictionary support example
kunal642 commented on pull request #3914: URL: https://github.com/apache/carbondata/pull/3914#issuecomment-706070572 retest this please
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3695: [WIP] partition optimization
CarbonDataQA1 commented on pull request #3695: URL: https://github.com/apache/carbondata/pull/3695#issuecomment-706079398
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3934: [WIP] Support Global Unique Id for SegmentNo
CarbonDataQA1 commented on pull request #3934: URL: https://github.com/apache/carbondata/pull/3934#issuecomment-706093225 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4344/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3934: [WIP] Support Global Unique Id for SegmentNo
CarbonDataQA1 commented on pull request #3934: URL: https://github.com/apache/carbondata/pull/3934#issuecomment-706101096 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2594/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3974: [Carbondata-3999] Fix permission issue of indexServerTmp directory
CarbonDataQA1 commented on pull request #3974: URL: https://github.com/apache/carbondata/pull/3974#issuecomment-706110498 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4346/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3974: [Carbondata-3999] Fix permission issue of indexServerTmp directory
CarbonDataQA1 commented on pull request #3974: URL: https://github.com/apache/carbondata/pull/3974#issuecomment-706111654 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2596/
[GitHub] [carbondata] nihal0107 opened a new pull request #3975: [CARBONDATA-3964] Added test case for select query without filter
nihal0107 opened a new pull request #3975:
URL: https://github.com/apache/carbondata/pull/3975

### Why is this PR needed?
Added a test case to verify that a select or select count query without a filter does not prune with multiple threads.

### What changes were proposed in this PR?
Added a test case to verify that a select or select count query without a filter does not prune with multiple threads.

### Does this PR introduce any user interface change?
- No
- Yes. (please explain the change and update document)

### Is any new testcase added?
- Yes
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3914: [CARBONDATA-3979] Added Hive local dictionary support example
CarbonDataQA1 commented on pull request #3914: URL: https://github.com/apache/carbondata/pull/3914#issuecomment-706120209
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3975: [CARBONDATA-3964] Added test case for select query without filter
CarbonDataQA1 commented on pull request #3975: URL: https://github.com/apache/carbondata/pull/3975#issuecomment-706161324 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4348/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3975: [CARBONDATA-3964] Added test case for select query without filter
CarbonDataQA1 commented on pull request #3975: URL: https://github.com/apache/carbondata/pull/3975#issuecomment-706162827 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2598/
[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3969: [CARBONDATA-3932] [CARBONDATA-3903] change discovery.uri in presto guide and dml document update
Indhumathi27 commented on a change in pull request #3969:
URL: https://github.com/apache/carbondata/pull/3969#discussion_r502405706

## File path: docs/dml-of-carbondata.md ##
@@ -447,13 +452,17 @@ CarbonData DML statements are documented here,which includes:
   ### DELETE
-  This command allows us to delete records from CarbonData table.
+  This command allows us to delete records from CarbonData table. Without providing expression, it will delete all the records from table.

Review comment: this can be added after syntax, as a note
[GitHub] [carbondata] QiangCai commented on pull request #3924: [CARBONDATA-3988] Allow SI creation on first dimension column
QiangCai commented on pull request #3924: URL: https://github.com/apache/carbondata/pull/3924#issuecomment-706004725 I don't understand the advantage of this change. Maybe we need to improve the logic for using both the SI and the main index.
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3695: [WIP] partition optimization
CarbonDataQA1 commented on pull request #3695: URL: https://github.com/apache/carbondata/pull/3695#issuecomment-705392359
[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3914: [CARBONDATA-3979] Added Hive local dictionary support example
ShreelekhyaG commented on a change in pull request #3914:
URL: https://github.com/apache/carbondata/pull/3914#discussion_r501455143

## File path: integration/hive/src/test/java/org/apache/carbondata/hive/HiveCarbonTest.java ##
@@ -211,6 +248,85 @@ public void testStructType() throws Exception {
     checkAnswer(carbonResult, hiveResult);
   }
+  private ArrayList getDimRawChunk(Integer blockindex)
+      throws IOException {
+    File rootPath = new File(HiveTestUtils.class.getResource("/").getPath() + "../../../..");
+    String storePath = rootPath.getAbsolutePath() + "/integration/hive/target/warehouse/warehouse/hive_carbon_table/";
+    CarbonFile[] dataFiles = FileFactory.getCarbonFile(storePath)
+        .listFiles(new CarbonFileFilter() {
+          @Override
+          public boolean accept(CarbonFile file) {
+            if (file.getName().endsWith(CarbonCommonConstants.FACT_FILE_EXT)) {
+              return true;
+            } else {
+              return false;
+            }
+          }
+        });
+    ArrayList dimensionRawColumnChunks = read(dataFiles[0].getAbsolutePath(),
+        blockindex);
+    return dimensionRawColumnChunks;
+  }
+
+  private ArrayList read(String filePath, Integer blockIndex) throws IOException {

Review comment: Done.
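For readers following the diff above, the file-selection step can be sketched in isolation. This is only an illustration, not the code from the PR: it uses plain `java.io.File` instead of CarbonData's `CarbonFile` / `CarbonFileFilter`, the class and method names (`FactFileLister`, `isFactFile`, `listFactFiles`) are hypothetical, and the value `".carbondata"` for `CarbonCommonConstants.FACT_FILE_EXT` is an assumption because the constant's value is not shown in the thread. It also returns the result of `endsWith` directly rather than branching to `true` / `false`.

```java
import java.io.File;
import java.util.Arrays;

class FactFileLister {
  // Assumed value of CarbonCommonConstants.FACT_FILE_EXT; not shown in the diff.
  static final String FACT_FILE_EXT = ".carbondata";

  // Hypothetical helper: true when the file name carries the fact-file extension.
  static boolean isFactFile(String name) {
    // The boolean expression is returned directly, no if/else needed.
    return name.endsWith(FACT_FILE_EXT);
  }

  // Returns the sorted names of fact files directly under dir.
  static String[] listFactFiles(File dir) {
    File[] files = dir.listFiles((parent, name) -> isFactFile(name));
    if (files == null) {
      // listFiles returns null when dir does not exist or is not a directory.
      return new String[0];
    }
    String[] names = Arrays.stream(files).map(File::getName).toArray(String[]::new);
    Arrays.sort(names);
    return names;
  }

  public static void main(String[] args) {
    // Lists fact files in the directory given as the first argument, or in "." by default.
    File dir = new File(args.length > 0 ? args[0] : ".");
    System.out.println(Arrays.toString(listFactFiles(dir)));
  }
}
```

Guarding against a `null` result matters in a test like the one quoted, where `dataFiles[0]` is read without a check.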
[GitHub] [carbondata] akashrn5 commented on pull request #3964: [CARBONDATA-4015] Remove hardcode of Lock configuration in Update and Delete
akashrn5 commented on pull request #3964: URL: https://github.com/apache/carbondata/pull/3964#issuecomment-705383989 LGTM
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3912: [CARBONDATA-3977] Global sort partitions should be determined dynamically
CarbonDataQA1 commented on pull request #3912: URL: https://github.com/apache/carbondata/pull/3912#issuecomment-705381449
[GitHub] [carbondata] brijoobopanna commented on pull request #3789: [CARBONDATA-3864] Store Size Optimization
brijoobopanna commented on pull request #3789: URL: https://github.com/apache/carbondata/pull/3789#issuecomment-705567589 @Indhumathi27 can you please post a test result with a data set that has more string types
[GitHub] [carbondata] Karan980 closed pull request #3973: [Carbondata-3999] Fix permission issue in /tmp/indexservertmp directory.
Karan980 closed pull request #3973: URL: https://github.com/apache/carbondata/pull/3973
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3972: [WIP]Launch same number of task as select query for insert into select and ctas cases when target table is of no_sort
CarbonDataQA1 commented on pull request #3972: URL: https://github.com/apache/carbondata/pull/3972#issuecomment-705547810
[GitHub] [carbondata] asfgit closed pull request #3961: [CARBONDATA-4019]Fix CDC merge failure join expression made of AND/OR expressions.
asfgit closed pull request #3961: URL: https://github.com/apache/carbondata/pull/3961
[GitHub] [carbondata] kunal642 commented on pull request #3915: [CARBONDATA-3975]Fix wrong data from carbondata for binary column when read via hive
kunal642 commented on pull request #3915: URL: https://github.com/apache/carbondata/pull/3915#issuecomment-705375211
[GitHub] [carbondata] asfgit closed pull request #3964: [CARBONDATA-4015] Remove hardcode of Lock configuration in Update and Delete
asfgit closed pull request #3964: URL: https://github.com/apache/carbondata/pull/3964
[GitHub] [carbondata] kunal642 commented on pull request #3967: [CARBONDATA-4004] Issue with select after update command
kunal642 commented on pull request #3967: URL: https://github.com/apache/carbondata/pull/3967#issuecomment-705514956 @akkio-97 Can we add a test case for this?
[GitHub] [carbondata] maheshrajus commented on pull request #3912: [CARBONDATA-3977] Global sort partitions should be determined dynamically
maheshrajus commented on pull request #3912: URL: https://github.com/apache/carbondata/pull/3912#issuecomment-705339174 retest this please
[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3966: [CARBONDATA-4023] Create MV failed on table with geospatial index using carbonsession.
ShreelekhyaG commented on a change in pull request #3966:
URL: https://github.com/apache/carbondata/pull/3966#discussion_r501611619

## File path: integration/spark/src/main/scala/org/apache/spark/sql/parser/CarbonSparkSqlParserUtil.scala ##
@@ -562,8 +562,11 @@ object CarbonSparkSqlParserUtil {
    * @return returns if lower case conversion is needed else
    */
   def needToConvertToLowerCase(key: String): Boolean = {
-    val noConvertList = Array(CarbonCommonConstants.COMPRESSOR, "PATH", "bad_record_path",
+    var noConvertList = Array(CarbonCommonConstants.COMPRESSOR, "PATH", "bad_record_path",
       "timestampformat", "dateformat")
+    if (key.startsWith(CarbonCommonConstants.SPATIAL_INDEX) && key.endsWith("class")) {

Review comment: Done
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3966: [CARBONDATA-4023] Create MV failed on table with geospatial index using carbonsession.
CarbonDataQA1 commented on pull request #3966: URL: https://github.com/apache/carbondata/pull/3966#issuecomment-705521165
[GitHub] [carbondata] asfgit closed pull request #3915: [CARBONDATA-3975]Fix wrong data from carbondata for binary column when read via hive
asfgit closed pull request #3915: URL: https://github.com/apache/carbondata/pull/3915
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3971: [WIP] Do not clean stale data
CarbonDataQA1 commented on pull request #3971: URL: https://github.com/apache/carbondata/pull/3971#issuecomment-705424705
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3915: [CARBONDATA-3975]Fix wrong data from carbondata for binary column when read via hive
CarbonDataQA1 commented on pull request #3915: URL: https://github.com/apache/carbondata/pull/3915#issuecomment-705431123
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3973: [Carbondata-3999] Fix permission issue in /tmp/indexservertmp directory.
CarbonDataQA1 commented on pull request #3973: URL: https://github.com/apache/carbondata/pull/3973#issuecomment-705785924
[GitHub] [carbondata] Indhumathi27 commented on pull request #3950: [CARBONDATA-3889] Enable scalastyle check for all scala test code
Indhumathi27 commented on pull request #3950: URL: https://github.com/apache/carbondata/pull/3950#issuecomment-705337439 @QiangCai Please rebase
[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3969: [CARBONDATA-3932] [CARBONDATA-3903] change discovery.uri in presto guide and dml document update
Indhumathi27 commented on a change in pull request #3969:
URL: https://github.com/apache/carbondata/pull/3969#discussion_r501492611

## File path: docs/dml-of-carbondata.md ##
@@ -43,6 +43,7 @@ CarbonData DML statements are documented here,which includes:
 **NOTE**:
 * Use 'file://' prefix to indicate local input files path, but it just supports local mode.
 * If run on cluster mode, please upload all input files to distributed file system, for example 'hdfs://' for hdfs.
+* Each load creates new segment folder and manages the folder through tablestatus file.

Review comment: since this content is already present in Segment Management. no need to add it

## File path: docs/dml-of-carbondata.md ##
@@ -303,6 +304,7 @@ CarbonData DML statements are documented here,which includes:
   * The data type of source and destination table columns should be same
   * INSERT INTO command does not support partial success if bad records are found, it will fail.
   * Data cannot be loaded or updated in source table while insert from source table to target table is in progress.
+  * Each insert creates new segment folder and manages the folder through tablestatus file.

Review comment: same comment as above

## File path: docs/dml-of-carbondata.md ##
@@ -43,6 +43,7 @@ CarbonData DML statements are documented here,which includes:
 **NOTE**:
 * Use 'file://' prefix to indicate local input files path, but it just supports local mode.
 * If run on cluster mode, please upload all input files to distributed file system, for example 'hdfs://' for hdfs.
+* Each load creates new segment folder and manages the folder through tablestatus file.

Review comment: since this content is already present in Segment Management. no need to add it.
If needed can update the first line in Segment Management

## File path: docs/dml-of-carbondata.md ##
@@ -402,6 +404,11 @@ CarbonData DML statements are documented here,which includes:
 ## UPDATE AND DELETE
+  Since the data in CarbonData files is immutable, the updates and delete are done via maintaining two files namely:

Review comment: Can update the line based on filesystem. Like, ` Since the data stored in file system like HDFS is immutable,..`

## File path: docs/dml-of-carbondata.md ##
@@ -454,6 +461,10 @@ CarbonData DML statements are documented here,which includes:

Review comment: Can specify without providing expression, it will delete all records from table

## File path: docs/dml-of-carbondata.md ##
@@ -447,13 +452,17 @@ CarbonData DML statements are documented here,which includes:
   ### DELETE
-  This command allows us to delete records from CarbonData table.
+  This command allows us to delete records from CarbonData table. Without providing expression, it will delete all the records from table.

Review comment: this can be added after syntax, as a note

## File path: docs/dml-of-carbondata.md ##
@@ -402,6 +402,11 @@ CarbonData DML statements are documented here,which includes:
 ## UPDATE AND DELETE
+  Since the data stored in file system like HDFS is immutable, the update and delete in carbondata are done via maintaining two files namely:

Review comment:
```suggestion
  Since the data stored in a file system like HDFS is immutable, the update and delete in carbondata are done via maintaining two files namely:
```
[GitHub] [carbondata] maheshrajus closed pull request #3968: [WIP] Partition optimization
maheshrajus closed pull request #3968: URL: https://github.com/apache/carbondata/pull/3968
[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3966: [CARBONDATA-4023] Create MV failed on table with geospatial index using carbonsession.
Indhumathi27 commented on a change in pull request #3966:
URL: https://github.com/apache/carbondata/pull/3966#discussion_r501479227

## File path: integration/spark/src/main/scala/org/apache/spark/sql/parser/CarbonSparkSqlParserUtil.scala ##
@@ -562,8 +562,11 @@ object CarbonSparkSqlParserUtil {
    * @return returns if lower case conversion is needed else
    */
   def needToConvertToLowerCase(key: String): Boolean = {
-    val noConvertList = Array(CarbonCommonConstants.COMPRESSOR, "PATH", "bad_record_path",
+    var noConvertList = Array(CarbonCommonConstants.COMPRESSOR, "PATH", "bad_record_path",
       "timestampformat", "dateformat")
+    if (key.startsWith(CarbonCommonConstants.SPATIAL_INDEX) && key.endsWith("class")) {

Review comment:
```suggestion
if (key.startsWith(CarbonCommonConstants.SPATIAL_INDEX) && key.endsWith(".class")) {
```
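The difference between the two suffix checks in the suggestion above can be illustrated with a small Java sketch. It is not the project's code: the class name `SpatialKeyCheck`, the method names, and the sample property keys are hypothetical, and `"spatial_index"` is an assumed value for `CarbonCommonConstants.SPATIAL_INDEX`, which the thread does not show.

```java
class SpatialKeyCheck {
  // Assumed value of CarbonCommonConstants.SPATIAL_INDEX; not shown in the thread.
  static final String SPATIAL_INDEX = "spatial_index";

  // Check as first written in the diff: any key that ends in "class" matches.
  static boolean looseMatch(String key) {
    return key.startsWith(SPATIAL_INDEX) && key.endsWith("class");
  }

  // Check after the suggested change: the key must end in the full ".class" segment.
  static boolean strictMatch(String key) {
    return key.startsWith(SPATIAL_INDEX) && key.endsWith(".class");
  }

  public static void main(String[] args) {
    // Hypothetical property keys, used only to show where the two checks differ.
    String[] keys = {
      "spatial_index.mygeohash.class",
      "spatial_index.mygeohash.subclass",
      "sort_columns"
    };
    for (String key : keys) {
      System.out.println(key + " loose=" + looseMatch(key) + " strict=" + strictMatch(key));
    }
  }
}
```

With the stricter suffix, a key such as the hypothetical `spatial_index.mygeohash.subclass` is no longer exempted from lower-case conversion by accident.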
[GitHub] [carbondata] kunal642 commented on pull request #3961: [CARBONDATA-4019]Fix CDC merge failure join expression made of AND/OR expressions.
kunal642 commented on pull request #3961: URL: https://github.com/apache/carbondata/pull/3961#issuecomment-705371096 LGTM
[GitHub] [carbondata] kunal642 commented on pull request #3914: [CARBONDATA-3979] Added Hive local dictionary support example
kunal642 commented on pull request #3914: URL: https://github.com/apache/carbondata/pull/3914#issuecomment-706070572 retest this please
[GitHub] [carbondata] Indhumathi27 commented on pull request #3959: [CARBONDATA-4010] Doc changes for long strings.
Indhumathi27 commented on pull request #3959: URL: https://github.com/apache/carbondata/pull/3959#issuecomment-705339900 LGTM
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3969: [CARBONDATA-3932] [CARBONDATA-3903] change discovery.uri in presto guide and dml document update
CarbonDataQA1 commented on pull request #3969: URL: https://github.com/apache/carbondata/pull/3969#issuecomment-705548840
[GitHub] [carbondata] VenuReddy2103 commented on pull request #3972: [WIP]Launch same number of task as select query for insert into select and ctas cases when target table is of no_sort
VenuReddy2103 commented on pull request #3972: URL: https://github.com/apache/carbondata/pull/3972#issuecomment-705575486 retest this please
[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3969: [CARBONDATA-3932] [CARBONDATA-3903] change discovery.uri in presto guide and dml document update
ShreelekhyaG commented on a change in pull request #3969: URL: https://github.com/apache/carbondata/pull/3969#discussion_r501635531

## File path: docs/dml-of-carbondata.md ##

@@ -454,6 +461,10 @@ CarbonData DML statements are documented here,which includes:

Review comment: Done

## File path: docs/dml-of-carbondata.md ##

@@ -43,6 +43,7 @@ CarbonData DML statements are documented here,which includes:
 **NOTE**:
 * Use 'file://' prefix to indicate local input files path, but it just supports local mode.
 * If run on cluster mode, please upload all input files to distributed file system, for example 'hdfs://' for hdfs.
+* Each load creates new segment folder and manages the folder through tablestatus file.

Review comment: ok. removed.
[GitHub] [carbondata] Indhumathi27 commented on pull request #3971: [WIP] Do not clean stale data
Indhumathi27 commented on pull request #3971: URL: https://github.com/apache/carbondata/pull/3971#issuecomment-705367973 retest this please
[GitHub] [carbondata] maheshrajus commented on pull request #3968: [WIP] Partition optimization
maheshrajus commented on pull request #3968: URL: https://github.com/apache/carbondata/pull/3968#issuecomment-705337955 Kunal has already raised a PR for this.
[GitHub] [carbondata] Indhumathi27 commented on pull request #3966: [CARBONDATA-4023] Create MV failed on table with geospatial index using carbonsession.
Indhumathi27 commented on pull request #3966: URL: https://github.com/apache/carbondata/pull/3966#issuecomment-706178383 LGTM
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3917: [CARBONDATA-3978] Clean files refactor and added support for a trash folder where all the carbondata files will be copied to after
CarbonDataQA1 commented on pull request #3917: URL: https://github.com/apache/carbondata/pull/3917#issuecomment-706210381 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4350/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3917: [CARBONDATA-3978] Clean files refactor and added support for a trash folder where all the carbondata files will be copied to after
CarbonDataQA1 commented on pull request #3917: URL: https://github.com/apache/carbondata/pull/3917#issuecomment-706213323 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2600/
[jira] [Closed] (CARBONDATA-3795) Create external carbon table fails if the schema is not provided
[ https://issues.apache.org/jira/browse/CARBONDATA-3795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chetan Bhat closed CARBONDATA-3795.
---
Fix Version/s: 2.0.1
Resolution: Fixed

Issue fixed in 2.0.1

> Create external carbon table fails if the schema is not provided
>
> Key: CARBONDATA-3795
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3795
> Project: CarbonData
> Issue Type: Bug
> Components: data-query
> Affects Versions: 2.0.0
> Environment: Spark 2.4.5 compatible carbon jars
> Reporter: Chetan Bhat
> Priority: Major
> Fix For: 2.0.1
>
> Create external carbon table fails if the schema is not provided.
> Example command -
> create external table test1 stored as carbondata location '/user/sparkhive/warehouse/1_6_1.db/brinjal/';
> *Error: org.apache.spark.sql.AnalysisException: Unable to infer the schema. The schema specification is required to create the table `1_6_1`.`test1`.; (state=,code=0)*
>
> *Logs -*
> 2020-05-05 22:57:25,638 | ERROR | [HiveServer2-Background-Pool: Thread-371] | Error executing query, currentState RUNNING, | org.apache.spark.internal.Logging$class.logError(Logging.scala:91)
> org.apache.spark.sql.AnalysisException: Unable to infer the schema. The schema specification is required to create the table `1_6_1`.`test1`.;
> at org.apache.spark.sql.hive.ResolveHiveSerdeTable$$anonfun$apply$1.applyOrElse(HiveStrategies.scala:104)
> at org.apache.spark.sql.hive.ResolveHiveSerdeTable$$anonfun$apply$1.applyOrElse(HiveStrategies.scala:90)
> at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsDown$1$$anonfun$2.apply(AnalysisHelper.scala:108)
> at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsDown$1$$anonfun$2.apply(AnalysisHelper.scala:108)
> at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:69)
> at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsDown$1.apply(AnalysisHelper.scala:107)
> at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsDown$1.apply(AnalysisHelper.scala:106)
> at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:194)
> at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$class.resolveOperatorsDown(AnalysisHelper.scala:106)
> at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsDown(LogicalPlan.scala:29)
> at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$class.resolveOperators(AnalysisHelper.scala:73)
> at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperators(LogicalPlan.scala:29)
> at org.apache.spark.sql.hive.ResolveHiveSerdeTable.apply(HiveStrategies.scala:90)
> at org.apache.spark.sql.hive.ResolveHiveSerdeTable.apply(HiveStrategies.scala:44)
> at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1$$anonfun$apply$1.apply(RuleExecutor.scala:87)
> at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1$$anonfun$apply$1.apply(RuleExecutor.scala:84)
> at scala.collection.LinearSeqOptimized$class.foldLeft(LinearSeqOptimized.scala:124)
> at scala.collection.immutable.List.foldLeft(List.scala:84)
> at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1.apply(RuleExecutor.scala:84)
> at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1.apply(RuleExecutor.scala:76)
> at scala.collection.immutable.List.foreach(List.scala:392)
> at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:76)
> at org.apache.spark.sql.catalyst.analysis.Analyzer.org$apache$spark$sql$catalyst$analysis$Analyzer$$executeSameContext(Analyzer.scala:127)
> at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:121)
> at org.apache.spark.sql.catalyst.analysis.Analyzer$$anonfun$executeAndCheck$1.apply(Analyzer.scala:106)
> at org.apache.spark.sql.catalyst.analysis.Analyzer$$anonfun$executeAndCheck$1.apply(Analyzer.scala:105)
> at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.markInAnalyzer(AnalysisHelper.scala:201)
> at org.apache.spark.sql.catalyst.analysis.Analyzer.executeAndCheck(Analyzer.scala:105)
> at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:58)
> at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:56)
> at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:48)
> at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:78)
> at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642)
> at o
[jira] [Closed] (CARBONDATA-3825) Refresh table in carbonsession using carbonextension fails for a table created in sparkfile format
[ https://issues.apache.org/jira/browse/CARBONDATA-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chetan Bhat closed CARBONDATA-3825.
---
Resolution: Invalid

Issue is analyzed as invalid and closed.

> Refresh table in carbonsession using carbonextension fails for a table created in sparkfile format
>
> Key: CARBONDATA-3825
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3825
> Project: CarbonData
> Issue Type: Bug
> Components: data-load
> Affects Versions: 2.0.0
> Environment: Spark 2.3.2, 2.4.5
> Reporter: Chetan Bhat
> Priority: Major
>
> In 1.6.1 or 2.0 version create a table in a db in spark file format and insert records in the table.
> Take a backup of the table store, drop database
> In carbonsession using carbonextension create a database with same name as the db in sparkfileformat and copy table store of sparkfileformat to db path in hdfs.
> Execute the refresh table command.
> Refresh table fails with error "Table or view not found in database"
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3695: [WIP] partition optimization
CarbonDataQA1 commented on pull request #3695: URL: https://github.com/apache/carbondata/pull/3695#issuecomment-706389331 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4351/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3695: [WIP] partition optimization
CarbonDataQA1 commented on pull request #3695: URL: https://github.com/apache/carbondata/pull/3695#issuecomment-706389702 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2601/
[GitHub] [carbondata] Pickupolddriver commented on a change in pull request #3935: [CARBONDATA-3993] Remove auto data deletion in IUD processs
Pickupolddriver commented on a change in pull request #3935: URL: https://github.com/apache/carbondata/pull/3935#discussion_r502735735

## File path: integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala ##

@@ -267,9 +266,8 @@ object CarbonDataRDDFactory {
           throw new Exception("Exception in compaction " + exception.getMessage)
         }
       } finally {
-        executor.shutdownNow()
         try {
-          compactor.deletePartialLoadsInCompaction()

Review comment:
(a) My PR depends on #3934. If that is merged first, we don't need to worry about the same segment ID in compaction situations, so deletePartialLoadsInCompaction can be deleted.
(b) In this PR's design, stale files are not deleted automatically. Users need to call the cleanFiles function to clean them; besides, PR #3917 can handle the clean files function.
(c) More call sites of cleanStaleFiles have been removed.
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3935: [CARBONDATA-3993] Remove auto data deletion in IUD processs
CarbonDataQA1 commented on pull request #3935: URL: https://github.com/apache/carbondata/pull/3935#issuecomment-706487703 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4352/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3935: [CARBONDATA-3993] Remove auto data deletion in IUD processs
CarbonDataQA1 commented on pull request #3935: URL: https://github.com/apache/carbondata/pull/3935#issuecomment-706488603 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2602/