[GitHub] [carbondata] ajantha-bhat commented on pull request #3985: [CARBONDATA-3965]Fixed float variable target datatype in case of adaptive encoding
ajantha-bhat commented on pull request #3985: URL: https://github.com/apache/carbondata/pull/3985#issuecomment-711712512 LGTM This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [carbondata] QiangCai commented on pull request #3948: [HOTFIX] Fix random 11 testcase failure in CI
QiangCai commented on pull request #3948: URL: https://github.com/apache/carbondata/pull/3948#issuecomment-711717006 LGTM, we will raise more PRs to fix other random failures in CI
[GitHub] [carbondata] asfgit closed pull request #3948: [HOTFIX] Fix random 11 testcase failure in CI
asfgit closed pull request #3948: URL: https://github.com/apache/carbondata/pull/3948
[GitHub] [carbondata] akashrn5 commented on a change in pull request #3980: [CARBONDATA-3901] [CARBONDATA-3903] [CARBONDATA-3824] SI creation on unknown table and doc changes.
akashrn5 commented on a change in pull request #3980: URL: https://github.com/apache/carbondata/pull/3980#discussion_r507519490 ## File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/strategy/DDLStrategy.scala ## @@ -204,6 +204,10 @@ class DDLStrategy(sparkSession: SparkSession) extends SparkStrategy { ExecutedCommandExec(CarbonCreateSecondaryIndexCommand( indexModel, tableProperties, ifNotExists, isDeferredRefresh, isCreateSIndex)) :: Nil } else { + if (!sparkSession.sessionState.catalog. Review comment: We are already calling the `tableExists` function to check whether it is a carbon table; if the table is not found, it throws `NoSuchTableException`. But we catch that exception in `org.apache.spark.sql.hive.CarbonFileMetastore#tableExists` and return false. So you can add an error log in CarbonFileMetastore saying the table does not exist (mentioning the table name) and avoid the call to the hive metastore here in your changes, which is a costly operation. Also, you can generalize the error message to cover both the non-carbon-table scenario and the table-not-exists scenario.
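The suggestion above (catch `NoSuchTableException` inside the metastore lookup, log the table name, and return false rather than making a further metastore call) can be sketched as follows. This is a minimal, self-contained illustration, not CarbonData's actual API: the `NoSuchTableException` class, the `lookupRelation` function parameter, and the `log` callback are all placeholders standing in for the real Spark/CarbonData types.

```scala
// Hypothetical stand-in for org.apache.spark.sql.catalyst.analysis.NoSuchTableException.
class NoSuchTableException(table: String) extends Exception(s"Table $table not found")

// Sketch of the reviewer's suggestion: swallow the lookup failure, log the
// table name for diagnosability, and report non-existence via the return
// value instead of falling through to a costly external metastore call.
def tableExists(table: String,
                lookupRelation: String => Any,
                log: String => Unit): Boolean =
  try {
    lookupRelation(table) // succeeds only if the relation can be resolved
    true
  } catch {
    case _: NoSuchTableException =>
      log(s"Table $table does not exist.")
      false
  }
```

A caller can then branch on the boolean without triggering a second lookup against the hive metastore.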
[GitHub] [carbondata] asfgit closed pull request #3985: [CARBONDATA-3965]Fixed float variable target datatype in case of adaptive encoding
asfgit closed pull request #3985: URL: https://github.com/apache/carbondata/pull/3985
[jira] [Resolved] (CARBONDATA-3965) Adaptive encoding of Complex primitive float is using long value to store float (4 bytes) data
[ https://issues.apache.org/jira/browse/CARBONDATA-3965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajantha Bhat resolved CARBONDATA-3965.
Fix Version/s: 2.1.0
Resolution: Fixed

> Adaptive encoding of Complex primitive float is using long value to store float (4 bytes) data
>
> Key: CARBONDATA-3965
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3965
> Project: CarbonData
> Issue Type: Bug
> Reporter: Ajantha Bhat
> Priority: Major
> Fix For: 2.1.0
> Time Spent: 4h
> Remaining Estimate: 0h
>
> I have tested; with the current UT itself it is hitting. For [null, 5.512] it is using long as the storage for the complex primitive adaptive codec. The base behavior needs to be checked; I guess it can be analyzed separately.
>
> For this, I have checked: if there is no complex type (if it is just a primitive type), the same values go to DirectCompress, not adaptive. But a complex primitive goes to adaptive because of the code below, and since min/max is stored with double precision, long is chosen for this.
>
> {{DefaultEncodingFactory#selectCodecByAlgorithmForFloating()}}:
>
> } else if (decimalCount < 0 && !isComplexPrimitive) {
>   return new DirectCompressCodec(DataTypes.DOUBLE);
> } else {
>   return getColumnPageCodec(stats, isComplexPrimitive, columnSpec, srcDataType, maxValue, minValue, decimalCount, absMaxValue);
> }
>
> I don't know (remember) why a complex primitive should not enter direct compress, or why that check was explicitly added.
--
This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3914: [CARBONDATA-3979] Added Hive local dictionary support example
CarbonDataQA1 commented on pull request #3914: URL: https://github.com/apache/carbondata/pull/3914#issuecomment-711781738 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4506/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3950: [CARBONDATA-3889] Enable scalastyle check for all scala test code
CarbonDataQA1 commented on pull request #3950: URL: https://github.com/apache/carbondata/pull/3950#issuecomment-711793379 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2750/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3986: [CARBONDATA-4034] Improve the time-consuming of Horizontal Compaction for update
CarbonDataQA1 commented on pull request #3986: URL: https://github.com/apache/carbondata/pull/3986#issuecomment-711817892 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2753/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3914: [CARBONDATA-3979] Added Hive local dictionary support example
CarbonDataQA1 commented on pull request #3914: URL: https://github.com/apache/carbondata/pull/3914#issuecomment-711832466 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2752/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3986: [CARBONDATA-4034] Improve the time-consuming of Horizontal Compaction for update
CarbonDataQA1 commented on pull request #3986: URL: https://github.com/apache/carbondata/pull/3986#issuecomment-711838443 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4507/
[GitHub] [carbondata] nihal0107 commented on a change in pull request #3980: [CARBONDATA-3901] [CARBONDATA-3903] [CARBONDATA-3824] SI creation on unknown table and doc changes.
nihal0107 commented on a change in pull request #3980: URL: https://github.com/apache/carbondata/pull/3980#discussion_r507589219 ## File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/strategy/DDLStrategy.scala ## @@ -204,6 +204,10 @@ class DDLStrategy(sparkSession: SparkSession) extends SparkStrategy { ExecutedCommandExec(CarbonCreateSecondaryIndexCommand( indexModel, tableProperties, ifNotExists, isDeferredRefresh, isCreateSIndex)) :: Nil } else { + if (!sparkSession.sessionState.catalog. Review comment: done ## File path: docs/spatial-index-guide.md ## @@ -62,13 +62,16 @@ create table source_index(id BIGINT, latitude long, longitude long) stored by 'c 'SPATIAL_INDEX.mygeohash.maxLatitude'='20.225281', 'SPATIAL_INDEX.mygeohash.conversionRatio'='100'); ``` -Note: `mygeohash` in the above example represent the index name. +Note: + * `mygeohash` in the above example represent the index name. + * Columns present in spatial_index table properties cannot be altered +i.e., sourcecolumns: `longitude, latitude` and index column: `mygeohash` in the above example. List of spatial index table properties |Name|Description| |---|-| -| SPATIAL_INDEX | Used to configure Spatial Index name. This name is appended to `SPATIAL_INDEX` in the subsequent sub-property configurations. `xxx` in the below sub-properties refer to index name.| +| SPATIAL_INDEX | Used to configure Spatial Index name. This name is appended to `SPATIAL_INDEX` in the subsequent sub-property configurations. `xxx` in the below sub-properties refer to index name. Newly created column name is same as that of spatial index name. This column is not allowed in any properties except in SORT_COLUMNS table property.| Review comment: done
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3950: [CARBONDATA-3889] Enable scalastyle check for all scala test code
CarbonDataQA1 commented on pull request #3950: URL: https://github.com/apache/carbondata/pull/3950#issuecomment-711928114 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4508/
[jira] [Commented] (CARBONDATA-4025) storage space for MV is double to that of a table on which MV has been created.
[ https://issues.apache.org/jira/browse/CARBONDATA-4025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17216618#comment-17216618 ] suyash yadav commented on CARBONDATA-4025:
Hi Team, can somebody look into this request? Regards, Suyash Yadav

> storage space for MV is double to that of a table on which MV has been created.
>
> Key: CARBONDATA-4025
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4025
> Project: CarbonData
> Issue Type: Improvement
> Components: core
> Affects Versions: 2.0.1
> Environment: Apache carbondata 2.0.1
> Apache spark 2.4.5
> Hadoop 2.7.2
> Reporter: suyash yadav
> Priority: Major
>
> We are doing a POC based on carbondata, but we have observed that when we create an MV on a table with a timeseries function of the same granularity, the MV takes double the space of the table.
>
> In my scenario, my table has 1.3 million records and the MV also has the same number of records, but the size of the table is 3.6 MB while the size of the MV is around 6.5 MB.
>
> This is really important for us, as critical business decisions are getting affected by this behaviour.
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3980: [CARBONDATA-3901] [CARBONDATA-3903] [CARBONDATA-3824] SI creation on unkbown table and doc changes.
CarbonDataQA1 commented on pull request #3980: URL: https://github.com/apache/carbondata/pull/3980#issuecomment-712049673
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3950: [CARBONDATA-3889] Enable scalastyle check for all scala test code
CarbonDataQA1 commented on pull request #3950: URL: https://github.com/apache/carbondata/pull/3950#issuecomment-712066559 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2754/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3950: [CARBONDATA-3889] Enable scalastyle check for all scala test code
CarbonDataQA1 commented on pull request #3950: URL: https://github.com/apache/carbondata/pull/3950#issuecomment-712097634 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4512/
[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3984: [CARBONDATA-4035]Fix MV query issue with aggregation on decimal column
Indhumathi27 commented on a change in pull request #3984: URL: https://github.com/apache/carbondata/pull/3984#discussion_r507702557 ## File path: integration/spark/src/test/scala/org/apache/carbondata/view/rewrite/MVCreateTestCase.scala ## @@ -1471,6 +1471,16 @@ class MVCreateTestCase extends QueryTest with BeforeAndAfterAll { sql("drop table if exists t2") } + test("test sum aggregations on decimal columns") { +sql("drop table if exists sum_agg_decimal") +sql("create table sum_agg_decimal(salary1 decimal(7,2),salary2 decimal(7,2),salary3 decimal(7,2),salary4 decimal(7,2),empname string) stored as carbondata") +sql("drop materialized view if exists decimal_mv") +sql("create materialized view decimal_mv as select empname, sum(salary1 - salary2) from sum_agg_decimal group by empname") +sql("explain select empname, sum( salary1 - salary2) from sum_agg_decimal group by empname").show(false) Review comment: can revert this change
[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3984: [CARBONDATA-4035]Fix MV query issue with aggregation on decimal column
Indhumathi27 commented on a change in pull request #3984: URL: https://github.com/apache/carbondata/pull/3984#discussion_r507703419 ## File path: integration/spark/src/test/scala/org/apache/carbondata/view/rewrite/MVCreateTestCase.scala ## @@ -1471,6 +1471,16 @@ class MVCreateTestCase extends QueryTest with BeforeAndAfterAll { sql("drop table if exists t2") } + test("test sum aggregations on decimal columns") { +sql("drop table if exists sum_agg_decimal") Review comment: can add drop to afterAll also
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3950: [CARBONDATA-3889] Enable scalastyle check for all scala test code
CarbonDataQA1 commented on pull request #3950: URL: https://github.com/apache/carbondata/pull/3950#issuecomment-712125005 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2758/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK
CarbonDataQA1 commented on pull request #3970: URL: https://github.com/apache/carbondata/pull/3970#issuecomment-712129726 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4513/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK
CarbonDataQA1 commented on pull request #3970: URL: https://github.com/apache/carbondata/pull/3970#issuecomment-712132333 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2759/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK
CarbonDataQA1 commented on pull request #3970: URL: https://github.com/apache/carbondata/pull/3970#issuecomment-712134800 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2760/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK
CarbonDataQA1 commented on pull request #3970: URL: https://github.com/apache/carbondata/pull/3970#issuecomment-712135326 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4514/
[GitHub] [carbondata] akkio-97 commented on pull request #3967: [CARBONDATA-4004] [CARBONDATA-4012] Issue with select after update command
akkio-97 commented on pull request #3967: URL: https://github.com/apache/carbondata/pull/3967#issuecomment-712186261 retest this please
[GitHub] [carbondata] akashrn5 commented on a change in pull request #3986: [CARBONDATA-4034] Improve the time-consuming of Horizontal Compaction for update
akashrn5 commented on a change in pull request #3986: URL: https://github.com/apache/carbondata/pull/3986#discussion_r507834166 ## File path: core/src/main/java/org/apache/carbondata/core/statusmanager/SegmentUpdateStatusManager.java ## @@ -415,44 +415,66 @@ public boolean accept(CarbonFile pathName) { } /** - * Return all delta file for a block. - * @param segmentId - * @param blockName - * @return + * Get all delete delta files mapped to each block of the specified segment. + * First list all deletedelta files in the segment dir, then loop the files and find + * a map of blocks and .deletedelta files related to each block. + * + * @param seg the segment which is to find blocks + * @return a map of block and its file list */ - public CarbonFile[] getDeleteDeltaFilesList(final Segment segmentId, final String blockName) { -String segmentPath = CarbonTablePath.getSegmentPath( -identifier.getTablePath(), segmentId.getSegmentNo()); -CarbonFile segDir = -FileFactory.getCarbonFile(segmentPath); + public Map> getDeleteDeltaFilesList(final Segment seg) { + +Map blockDeltaStartAndEndTimestampMap = new HashMap<>(); Review comment: @shenjiayu17 why exactly do we need this change? Even with the map introduced, we are still doing the list-files call, which is a costly operation. I have some points, please check:
1. There is no need to create these maps and then list files again to fill them. We already get the block name, and every block will have only one corresponding deletedelta file, so there is always one delta file per block.
2. The update details contain the block name, the actual block name, the timestamps and the delete delta timestamps. So if the timestamp is not empty, you can form the delta file name yourself based on this information and return it from this method.
With the above approach, you avoid the list-files operation, the filtering based on timestamp, and creating all these maps, so you can avoid all these changes. We always keep the horizontal compaction threshold at 1, and we don't change it or recommend that users change it, in order to get better performance. ## File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/mutation/HorizontalCompaction.scala ## @@ -173,6 +176,9 @@ object HorizontalCompaction { val db = carbonTable.getDatabaseName val table = carbonTable.getTableName + +LOG.info(s"Horizontal Delete Compaction operation is getting valid segments for [$db.$table].") Review comment: same as above ## File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/mutation/HorizontalCompaction.scala ## @@ -125,6 +125,9 @@ object HorizontalCompaction { val db = carbonTable.getDatabaseName val table = carbonTable.getTableName + +LOG.info(s"Horizontal Update Compaction operation is getting valid segments for [$db.$table].") Review comment: I think this log does not give any useful info here; if you put some log after line 133 and print `validSegList`, it looks a little more useful.
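The reviewer's second point (derive the delete delta file names from the already-loaded update details instead of listing the segment directory) can be sketched as below. This is a hedged illustration: the `UpdateDetails` case class is a stand-in for CarbonData's `SegmentUpdateDetails`, and the `<blockName>-<timestamp>.deletedelta` name format is an assumption for demonstration, not the exact CarbonData naming convention.

```scala
// Hypothetical stand-in for SegmentUpdateDetails: block name plus the
// timestamp of its delete delta file (empty when no delta was written).
case class UpdateDetails(blockName: String, deleteDeltaTimestamp: String)

// Build the block -> delta-file-names map purely from the in-memory update
// details, avoiding the costly directory listing and timestamp filtering.
def deleteDeltaFiles(details: Seq[UpdateDetails]): Map[String, Seq[String]] =
  details
    .filter(_.deleteDeltaTimestamp.nonEmpty) // skip blocks with no delta file
    .groupBy(_.blockName)
    .map { case (block, ds) =>
      block -> ds.map(d => s"$block-${d.deleteDeltaTimestamp}.deletedelta")
    }
```

Because each block carries at most one delete delta timestamp (with the threshold kept at 1), each value sequence will normally contain a single file name.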
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK
CarbonDataQA1 commented on pull request #3970: URL: https://github.com/apache/carbondata/pull/3970#issuecomment-712260963 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4515/
[jira] [Created] (CARBONDATA-4037) Improve the table status and segment file writing
SHREELEKHYA GAMPA created CARBONDATA-4037: - Summary: Improve the table status and segment file writing Key: CARBONDATA-4037 URL: https://issues.apache.org/jira/browse/CARBONDATA-4037 Project: CarbonData Issue Type: Improvement Reporter: SHREELEKHYA GAMPA Currently, we update the table status and segment files multiple times for a single iud/merge/compact operation and delete the index files immediately after merge. When concurrent queries are run, there may be situations where a user query tries to access the segment index files and they are not present, which is an availability issue. * Instead of deleting carbon index files immediately after merge, delete index files only when the clean files command is executed, and delete only those that have existed for more than 1 hour. * Generate the segment file after merge index, and update the table status at the beginning and after merge index. Order: create table status file => index files => merge index => generate segment file => update table status
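The retention rule described above (during clean files, delete merged index files only once they have existed for more than one hour) can be sketched as a simple eligibility check. This is an assumption-laden illustration: the `.carbonindex` suffix check and the use of file modification time as the "existed since" marker are stand-ins for whatever bookkeeping CarbonData actually uses.

```scala
import java.io.File

// One-hour retention window, per the issue description.
val retentionMillis: Long = 60L * 60L * 1000L

// Hypothetical check run by the clean-files pass: an index file may be
// deleted only if it is older than the retention window, so concurrent
// queries that resolved the segment recently can still read it.
def isEligibleForCleanup(f: File, now: Long = System.currentTimeMillis()): Boolean =
  f.getName.endsWith(".carbonindex") && (now - f.lastModified()) > retentionMillis
```

Files newer than the window are left in place and picked up by a later clean-files run.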
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3967: [CARBONDATA-4004] [CARBONDATA-4012] Issue with select after update command
CarbonDataQA1 commented on pull request #3967: URL: https://github.com/apache/carbondata/pull/3967#issuecomment-712270661 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4516/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3967: [CARBONDATA-4004] [CARBONDATA-4012] Issue with select after update command
CarbonDataQA1 commented on pull request #3967: URL: https://github.com/apache/carbondata/pull/3967#issuecomment-712271195 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2762/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK
CarbonDataQA1 commented on pull request #3970: URL: https://github.com/apache/carbondata/pull/3970#issuecomment-712277239 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2761/
[GitHub] [carbondata] akashrn5 commented on a change in pull request #3980: [CARBONDATA-3901] [CARBONDATA-3903] [CARBONDATA-3824] SI creation on unknown table and doc changes.
akashrn5 commented on a change in pull request #3980: URL: https://github.com/apache/carbondata/pull/3980#discussion_r507902315 ## File path: integration/spark/src/main/scala/org/apache/spark/sql/hive/CarbonFileMetastore.scala ## @@ -267,6 +267,7 @@ class CarbonFileMetastore extends CarbonMetaStore { lookupRelation(tableIdentifier)(sparkSession) } catch { case _: NoSuchTableException => +LOGGER.error(s"Table ${tableIdentifier.table} does not exist.") Review comment: I think this can be a debug log; otherwise the user can get confused by the error
[GitHub] [carbondata] Karan980 commented on pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK
Karan980 commented on pull request #3970: URL: https://github.com/apache/carbondata/pull/3970#issuecomment-712318046 retest this please
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK
CarbonDataQA1 commented on pull request #3970: URL: https://github.com/apache/carbondata/pull/3970#issuecomment-712386082 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4517/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort
CarbonDataQA1 commented on pull request #3979: URL: https://github.com/apache/carbondata/pull/3979#issuecomment-712395039
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK
CarbonDataQA1 commented on pull request #3970: URL: https://github.com/apache/carbondata/pull/3970#issuecomment-712405367 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2763/
[GitHub] [carbondata] nihal0107 commented on a change in pull request #3980: [CARBONDATA-3901] [CARBONDATA-3903] [CARBONDATA-3824] SI creation on unknown table and doc changes.
nihal0107 commented on a change in pull request #3980: URL: https://github.com/apache/carbondata/pull/3980#discussion_r508168162 ## File path: integration/spark/src/main/scala/org/apache/spark/sql/hive/CarbonFileMetastore.scala ## @@ -267,6 +267,7 @@ class CarbonFileMetastore extends CarbonMetaStore { lookupRelation(tableIdentifier)(sparkSession) } catch { case _: NoSuchTableException => +LOGGER.error(s"Table ${tableIdentifier.table} does not exist.") Review comment: done
[GitHub] [carbondata] marchpure commented on pull request #3982: [CARBONDATA-4032] Fix drop partition command clean data issue
marchpure commented on pull request #3982: URL: https://github.com/apache/carbondata/pull/3982#issuecomment-712560057 retest this please
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3950: [CARBONDATA-3889] Enable scalastyle check for all scala test code
CarbonDataQA1 commented on pull request #3950: URL: https://github.com/apache/carbondata/pull/3950#issuecomment-712559981 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2766/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3950: [CARBONDATA-3889] Enable scalastyle check for all scala test code
CarbonDataQA1 commented on pull request #3950: URL: https://github.com/apache/carbondata/pull/3950#issuecomment-712560439 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4520/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3980: [CARBONDATA-3901] [CARBONDATA-3903] [CARBONDATA-3824] SI creation on unknown table and doc changes.
CarbonDataQA1 commented on pull request #3980: URL: https://github.com/apache/carbondata/pull/3980#issuecomment-712578745 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4522/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3980: [CARBONDATA-3901] [CARBONDATA-3903] [CARBONDATA-3824] SI creation on unknown table and doc changes.
CarbonDataQA1 commented on pull request #3980: URL: https://github.com/apache/carbondata/pull/3980#issuecomment-712579135 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2768/
[GitHub] [carbondata] akashrn5 commented on pull request #3980: [CARBONDATA-3901] [CARBONDATA-3903] [CARBONDATA-3824] SI creation on unknown table and doc changes.
akashrn5 commented on pull request #3980: URL: https://github.com/apache/carbondata/pull/3980#issuecomment-712582284 LGTM
[GitHub] [carbondata] asfgit closed pull request #3980: [CARBONDATA-3901] [CARBONDATA-3903] [CARBONDATA-3824] SI creation on unknown table and doc changes.
asfgit closed pull request #3980: URL: https://github.com/apache/carbondata/pull/3980
[jira] [Resolved] (CARBONDATA-3901) Documentation issues in https://github.com/apache/carbondata/tree/master/docs
[ https://issues.apache.org/jira/browse/CARBONDATA-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal resolved CARBONDATA-3901. - Fix Version/s: 2.1.0 Resolution: Fixed > Documentation issues in https://github.com/apache/carbondata/tree/master/docs > - > > Key: CARBONDATA-3901 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3901 > Project: CarbonData > Issue Type: Bug > Components: docs >Affects Versions: 2.0.1 > Environment: https://github.com/apache/carbondata/tree/master/docs >Reporter: Chetan Bhat >Priority: Minor > Fix For: 2.1.0 > > Time Spent: 2h 40m > Remaining Estimate: 0h > > *Issue 1 :* > [https://github.com/apache/carbondata/blob/master/docs/alluxio-guide.md] > getOrCreateCarbonSession not used in Carbon 2.0 version and should be removed. > Testing use alluxio by CarbonSessionimport org.apache.spark.sql.CarbonSession._import org.apache.spark.sql.SparkSession > val carbon = > SparkSession.builder().master("local").appName("test").getOrCreateCarbonSession("alluxio://localhost:19998/carbondata");carbon.sql("CREATE > TABLE carbon_alluxio(id String,name String, city String,age Int) STORED as > carbondata");carbon.sql(s"LOAD DATA LOCAL INPATH > '${CARBONDATA_PATH}/integration/spark/src/test/resources/sample.csv' into > table carbon_alluxio");carbon.sql("select * from carbon_alluxio").show > *Issue 2 -* > [https://github.com/apache/carbondata/blob/master/docs/ddl-of-carbondata.mdSORT_SCOPE] > Sort scope of the load.Options include no sort, local sort ,batch sort and > global sort --> Batch sort to be removed as its not supported. > *Issue 3 -* > [https://github.com/apache/carbondata/blob/master/docs/streaming-guide.md#close-stream] > CLOSE STREAM link is not working. 
> *Issue 4 -* > [https://github.com/apache/carbondata/blob/master/docs/index/bloomfilter-index-guide.md] > Explain query does not hit the bloom. Hence the line "User can verify > whether a query can leverage BloomFilter Index by executing {{EXPLAIN}} > command, which will show the transformed logical plan, and thus user can > check whether the BloomFilter Index can skip blocklets during the scan." > needs to be removed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CARBONDATA-3901) Documentation issues in https://github.com/apache/carbondata/tree/master/docs
[ https://issues.apache.org/jira/browse/CARBONDATA-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-3901: Description: *Issue 1 -* [https://github.com/apache/carbondata/blob/master/docs/ddl-of-carbondata.mdSORT_SCOPE] Sort scope of the load.Options include no sort, local sort ,batch sort and global sort --> Batch sort to be removed as its not supported. *Issue 2 -* [https://github.com/apache/carbondata/blob/master/docs/streaming-guide.md#close-stream] CLOSE STREAM link is not working. was: *Issue 1 :* [https://github.com/apache/carbondata/blob/master/docs/alluxio-guide.md] getOrCreateCarbonSession not used in Carbon 2.0 version and should be removed.Issue 1 : [https://github.com/apache/carbondata/blob/master/docs/alluxio-guide.md] getOrCreateCarbonSession not used in Carbon 2.0 version and should be removed. Testing use alluxio by CarbonSessionimport org.apache.spark.sql.CarbonSession._import org.apache.spark.sql.SparkSession val carbon = SparkSession.builder().master("local").appName("test").getOrCreateCarbonSession("alluxio://localhost:19998/carbondata");carbon.sql("CREATE TABLE carbon_alluxio(id String,name String, city String,age Int) STORED as carbondata");carbon.sql(s"LOAD DATA LOCAL INPATH '${CARBONDATA_PATH}/integration/spark/src/test/resources/sample.csv' into table carbon_alluxio");carbon.sql("select * from carbon_alluxio").show *Issue 2 -* [https://github.com/apache/carbondata/blob/master/docs/ddl-of-carbondata.mdSORT_SCOPE] Sort scope of the load.Options include no sort, local sort ,batch sort and global sort --> Batch sort to be removed as its not supported. *Issue 3 -* [https://github.com/apache/carbondata/blob/master/docs/streaming-guide.md#close-stream] CLOSE STREAM link is not working. *Issue 4 -* [https://github.com/apache/carbondata/blob/master/docs/index/bloomfilter-index-guide.md] Explain query does not hit the bloom. 
Hence the line "User can verify whether a query can leverage BloomFilter Index by executing {{EXPLAIN}} command, which will show the transformed logical plan, and thus user can check whether the BloomFilter Index can skip blocklets during the scan." needs to be removed. > Documentation issues in https://github.com/apache/carbondata/tree/master/docs > - > > Key: CARBONDATA-3901 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3901 > Project: CarbonData > Issue Type: Bug > Components: docs >Affects Versions: 2.0.1 > Environment: https://github.com/apache/carbondata/tree/master/docs >Reporter: Chetan Bhat >Priority: Minor > Fix For: 2.1.0 > > Time Spent: 2h 40m > Remaining Estimate: 0h > > *Issue 1 -* > [https://github.com/apache/carbondata/blob/master/docs/ddl-of-carbondata.mdSORT_SCOPE] > Sort scope of the load.Options include no sort, local sort ,batch sort and > global sort --> Batch sort to be removed as its not supported. > *Issue 2 -* > [https://github.com/apache/carbondata/blob/master/docs/streaming-guide.md#close-stream] > CLOSE STREAM link is not working.
[jira] [Resolved] (CARBONDATA-3903) Documentation Issue in Github Docs Link https://github.com/apache/carbondata/tree/master/docs
[ https://issues.apache.org/jira/browse/CARBONDATA-3903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal resolved CARBONDATA-3903. - Fix Version/s: 2.1.0 Resolution: Fixed > Documentation Issue in Github Docs Link > https://github.com/apache/carbondata/tree/master/docs > -- > > Key: CARBONDATA-3903 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3903 > Project: CarbonData > Issue Type: Bug > Components: docs >Affects Versions: 2.0.1 > Environment: https://github.com/apache/carbondata/tree/master/docs >Reporter: PURUJIT CHAUGULE >Priority: Minor > Fix For: 2.1.0 > > > dml-of-carbondata.md > LOAD DATA: > * Mention Each Load is considered as a Segment. > * Give all possible options for SORT_SCOPE like > GLOBAL_SORT/LOCAL_SORT/NO_SORT (with explanation of difference between each > type). > * Add Example Of complete Load query with/without use of OPTIONS. > INSERT DATA: > * Mention each insert is a Segment. > LOAD Using Static/Dynamic Partitioning: > * Can give a hyperlink to Static/Dynamic partitioning. > UPDATE/DELETE: > * Mention about delta files concept in update and delete. > DELETE: > * Add example for deletion of all records from a table (delete from > tablename). > COMPACTION: > * Can mention Minor compaction of two types Auto and Manual( > carbon.auto.load.merge =true/false), and that if > carbon.auto.load.merge=false, trigger should be done manually. > * Hyperlink to Configurable properties of Compaction. > * Mention that compacted segments do not get cleaned automatically and > should be triggered manually using clean files. > > flink-integration-guide.md > * Mention what are stages, how is it used. > * Process of insertion, deletion of stages in carbontable. (How is it stored > in carbontable). > > language-manual.md > * Mention Compaction Hyperlink in DML section. > > spatial-index-guide.md > * Mention the TBLPROPERTIES supported / not supported for Geo table. > * Mention Spatial Index does not make a new column. 
> * CTAS from one geo table to another does not create another Geo table can > be mentioned. > * Mention that a certain combination of Spatial Index table properties need > to be added in create table, without which a geo table does not get created. > * Mention that we cannot alter columns (change datatype, change name, drop) > mentioned in spatial_index.
[jira] [Resolved] (CARBONDATA-3824) Error when Secondary index tried to be created on table that does not exist is not correct.
[ https://issues.apache.org/jira/browse/CARBONDATA-3824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal resolved CARBONDATA-3824. - Fix Version/s: 2.1.0 Resolution: Fixed > Error when Secondary index tried to be created on table that does not exist > is not correct. > --- > > Key: CARBONDATA-3824 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3824 > Project: CarbonData > Issue Type: Bug > Components: data-query >Affects Versions: 2.0.0 > Environment: Spark 2.3.2, Spark 2.4.5 >Reporter: Chetan Bhat >Priority: Minor > Fix For: 2.1.0 > > > *Issue :-* > Table uniqdata_double does not exist. > Secondary index tried to be created on table. Error message is incorrect. > CREATE INDEX indextable2 ON TABLE uniqdata_double (DOB) AS 'carbondata' > PROPERTIES('carbon.column.compressor'='zstd'); > *Error: java.lang.RuntimeException: Operation not allowed on non-carbon table > (state=,code=0)* > > *Expected :-* > *Error: java.lang.RuntimeException: Table does not exist* *(state=,code=0)***
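The expected behaviour in this issue amounts to ordering the validations: check that the target table exists before checking whether it is a carbon table, so a missing table reports "Table does not exist" instead of falling through to the "non-carbon table" error. A minimal sketch of that ordering (all names here are illustrative, not CarbonData APIs; the catalog is modeled as a set of table names plus a map flagging carbon tables):

```java
import java.util.Map;
import java.util.Set;

// Hypothetical validation for a CREATE INDEX target: existence is checked
// first, so each failure mode gets its own accurate error message.
public class SiTargetValidator {
    public static void validate(Set<String> existingTables,
                                Map<String, Boolean> isCarbonTable,
                                String table) {
        if (!existingTables.contains(table)) {
            throw new RuntimeException("Table does not exist");
        }
        if (!isCarbonTable.getOrDefault(table, false)) {
            throw new RuntimeException("Operation not allowed on non-carbon table");
        }
    }
}
```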
[GitHub] [carbondata] Karan980 commented on pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK
Karan980 commented on pull request #3970: URL: https://github.com/apache/carbondata/pull/3970#issuecomment-712595181 retest this please
[GitHub] [carbondata] Indhumathi27 commented on pull request #3789: [CARBONDATA-3864] Store Size Optimization
Indhumathi27 commented on pull request #3789: URL: https://github.com/apache/carbondata/pull/3789#issuecomment-712598719 retest this please
[GitHub] [carbondata] kunal642 commented on pull request #3983: [CARBONDATA-4036]Fix special char(`) issue in create table, when column name contains ` character
kunal642 commented on pull request #3983: URL: https://github.com/apache/carbondata/pull/3983#issuecomment-712600158 retest this please
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3982: [CARBONDATA-4032] Fix drop partition command clean data issue
CarbonDataQA1 commented on pull request #3982: URL: https://github.com/apache/carbondata/pull/3982#issuecomment-712602226 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2769/
[GitHub] [carbondata] ShreelekhyaG commented on pull request #3914: [CARBONDATA-3979] Added Hive local dictionary support example
ShreelekhyaG commented on pull request #3914: URL: https://github.com/apache/carbondata/pull/3914#issuecomment-712602617 retest this please
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3982: [CARBONDATA-4032] Fix drop partition command clean data issue
CarbonDataQA1 commented on pull request #3982: URL: https://github.com/apache/carbondata/pull/3982#issuecomment-712604515 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4523/
[GitHub] [carbondata] akashrn5 commented on a change in pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.
akashrn5 commented on a change in pull request #3875: URL: https://github.com/apache/carbondata/pull/3875#discussion_r508236769 ## File path: integration/hive/src/main/java/org/apache/carbondata/hive/MapredCarbonOutputFormat.java ## @@ -92,6 +95,14 @@ public void checkOutputSpecs(FileSystem fileSystem, JobConf jobConf) throws IOEx } String tablePath = FileFactory.getCarbonFile(carbonLoadModel.getTablePath()).getAbsolutePath(); TaskAttemptID taskAttemptID = TaskAttemptID.forName(jc.get("mapred.task.id")); +// taskAttemptID will be null when the insert job is fired from presto. Presto send the JobConf +// and since presto does not use the MR framework for execution, the mapred.task.id will be +// null, so prepare a new ID. +if (taskAttemptID == null) { + SimpleDateFormat formatter = new SimpleDateFormat("MMddHHmm"); + String jobTrackerId = formatter.format(new Date()); + taskAttemptID = new TaskAttemptID(jobTrackerId, 0, TaskType.MAP, 0, 0); Review comment: Here `taskAttemptID` is a `TaskAttemptID` object. Since every writer creates a new task, there should be no problem. We get the JobConf from Presto, and we prepare the TaskAttemptID just to initialize the writer and close it, so it should be fine, I guess. What do you think? With respect to the ORC writer, ORC uses a different `FileOutputFormat` from the `mapred` package, while we use the `mapreduce` package. In `mapred`, the task context is not used, so they do not need this.
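The fallback in the patch under discussion, fabricating an attempt ID from a timestamp when `mapred.task.id` is absent, can be sketched as follows. To keep the example self-contained, Hadoop's `TaskAttemptID` type is replaced by a plain `String`, and the ID layout shown is only illustrative:

```java
import java.text.SimpleDateFormat;
import java.util.Date;

// When the insert is fired from Presto, the MR framework never sets
// "mapred.task.id", so parsing it yields null and a synthetic ID must be
// built. This mirrors that control flow with strings instead of Hadoop types.
public class AttemptIdFallback {
    public static String resolveAttemptId(String mapredTaskId) {
        if (mapredTaskId != null) {
            return mapredTaskId; // normal MR path: use the framework-supplied ID
        }
        // Presto path: derive a job-tracker-style prefix from the current time,
        // as the patch does with SimpleDateFormat
        String jobTrackerId = new SimpleDateFormat("yyyyMMddHHmm").format(new Date());
        return "attempt_" + jobTrackerId + "_0000_m_000000_0";
    }
}
```

Because each writer gets its own freshly fabricated ID, the fixed task/attempt numbers do not collide across writers, which is the point the reviewer makes above.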
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3917: [CARBONDATA-3978] Clean Files Refactor and support for trash folder in carbondata
CarbonDataQA1 commented on pull request #3917: URL: https://github.com/apache/carbondata/pull/3917#issuecomment-712638299 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2774/