[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4020: [CARBONDATA-4054] Support data size control for minor compaction
CarbonDataQA2 commented on pull request #4020: URL: https://github.com/apache/carbondata/pull/4020#issuecomment-732722971 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/3121/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4020: [CARBONDATA-4054] Support data size control for minor compaction
CarbonDataQA2 commented on pull request #4020: URL: https://github.com/apache/carbondata/pull/4020#issuecomment-732721963 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4874/
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4004: [WIP] update benchmark
CarbonDataQA2 commented on pull request #4004: URL: https://github.com/apache/carbondata/pull/4004#issuecomment-732691284 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/3120/
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4004: [WIP] update benchmark
CarbonDataQA2 commented on pull request #4004: URL: https://github.com/apache/carbondata/pull/4004#issuecomment-732686545 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4873/
[GitHub] [carbondata] Karan980 commented on pull request #4009: [CARBONDATA-4029] Fix Old Timestamp issue in alter add segement
Karan980 commented on pull request #4009: URL: https://github.com/apache/carbondata/pull/4009#issuecomment-732682464 retest this please
[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #4000: [CARBONDATA-4020] Fixed drop index when multiple index exists
Indhumathi27 commented on a change in pull request #4000: URL: https://github.com/apache/carbondata/pull/4000#discussion_r529229059

## File path: integration/spark/src/test/scala/org/apache/carbondata/index/bloom/BloomCoarseGrainIndexFunctionSuite.scala

@@ -660,6 +660,22 @@ class BloomCoarseGrainIndexFunctionSuite
sql(s"SELECT * FROM $normalTable WHERE salary='1040'")) }
+ test("test drop index when more than one bloom index exists") {
+sql(s"CREATE TABLE $bloomSampleTable " + "(id int,name string,salary int)STORED as carbondata TBLPROPERTIES('SORT_COLUMNS'='id')")
+sql(s"CREATE index index1 ON TABLE $bloomSampleTable(id) as 'bloomfilter' " + "PROPERTIES ( 'BLOOM_SIZE'='64', 'BLOOM_FPP'='0.1', 'BLOOM_COMPRESS'='true')")
+sql(s"CREATE index index2 ON TABLE $bloomSampleTable (name) as 'bloomfilter' " + "PROPERTIES ('BLOOM_SIZE'='64', 'BLOOM_FPP'='0.1', 'BLOOM_COMPRESS'='true')")
+sql(s"insert into $bloomSampleTable values(1,'nihal',20)")
+sql(s"SHOW INDEXES ON TABLE $bloomSampleTable").collect()
+checkExistence(sql(s"SHOW INDEXES ON TABLE $bloomSampleTable"), true, "index1")

Review comment:
```suggestion
checkExistence(sql(s"SHOW INDEXES ON TABLE $bloomSampleTable"), true, "index1", "index2")
```
Remove next line

## File path: integration/spark/src/test/scala/org/apache/carbondata/index/bloom/BloomCoarseGrainIndexFunctionSuite.scala

@@ -660,6 +660,22 @@ class BloomCoarseGrainIndexFunctionSuite
sql(s"SELECT * FROM $normalTable WHERE salary='1040'")) }
+ test("test drop index when more than one bloom index exists") {
+sql(s"CREATE TABLE $bloomSampleTable " + "(id int,name string,salary int)STORED as carbondata TBLPROPERTIES('SORT_COLUMNS'='id')")
+sql(s"CREATE index index1 ON TABLE $bloomSampleTable(id) as 'bloomfilter' " + "PROPERTIES ( 'BLOOM_SIZE'='64', 'BLOOM_FPP'='0.1', 'BLOOM_COMPRESS'='true')")
+sql(s"CREATE index index2 ON TABLE $bloomSampleTable (name) as 'bloomfilter' " + "PROPERTIES ('BLOOM_SIZE'='64', 'BLOOM_FPP'='0.1', 'BLOOM_COMPRESS'='true')")
+sql(s"insert into $bloomSampleTable values(1,'nihal',20)")
+sql(s"SHOW INDEXES ON TABLE $bloomSampleTable").collect()
+checkExistence(sql(s"SHOW INDEXES ON TABLE $bloomSampleTable"), true, "index1")
+checkExistence(sql(s"SHOW INDEXES ON TABLE $bloomSampleTable"), true, "index2")
+sql(s"drop index index1 on $bloomSampleTable")
+sql(s"show indexes on table $bloomSampleTable").show()

Review comment: remove this line

## File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/index/DropIndexCommand.scala

@@ -184,10 +184,10 @@ private[sql] case class DropIndexCommand(
parentCarbonTable = getRefreshedParentTable(sparkSession, dbName)
val indexMetadata = parentCarbonTable.getIndexMetadata
if (null != indexMetadata && null != indexMetadata.getIndexesMap) {
- val hasCgFgIndexes =
-!(indexMetadata.getIndexesMap.size() == 1 &&
- indexMetadata.getIndexesMap.containsKey(IndexType.SI.getIndexProviderName))
- if (hasCgFgIndexes) {
+ val hasCgFgIndexes = indexMetadata.getIndexesMap.size() != 0 &&

Review comment: Please add a comment explaining in which case we need to set 'indexExists' to false
[jira] [Closed] (CARBONDATA-4021) With Index server running, Upon executing count* we are getting the below error, after adding the parquet and ORC segment.
[ https://issues.apache.org/jira/browse/CARBONDATA-4021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanna Ravichandran closed CARBONDATA-4021.
Resolution: Not A Problem

> With Index server running, Upon executing count* we are getting the below error, after adding the parquet and ORC segment.
>
> Key: CARBONDATA-4021
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4021
> Project: CarbonData
> Issue Type: Bug
> Affects Versions: 2.0.0
> Reporter: Prasanna Ravichandran
> Priority: Major
>
> We are getting the below issues when index server enable and index server fallback disable are configured as true. With count* we are getting the below error, after adding the parquet and ORC segment.
>
> Queries and output (each statement below returned "No rows selected" in the time shown):
>
> use rps; (0.054 seconds)
> drop table if exists uniqdata; (0.229 seconds)
> CREATE TABLE uniqdata (cust_id int,cust_name String,active_emui_version string, dob timestamp, doj timestamp, bigint_column1 bigint,bigint_column2 bigint,decimal_column1 decimal(30,10), decimal_column2 decimal(36,36),double_column1 double, double_column2 double,integer_column1 int) stored as carbondata; (0.756 seconds)
> load data inpath 'hdfs://hacluster/user/prasanna/2000_UniqData.csv' into table uniqdata options('fileheader'='cust_id,cust_name,active_emui_version,dob,doj,bigint_column1,bigint_column2,decimal_column1,decimal_column2,double_column1,double_column2,integer_column1','bad_records_action'='force'); INFO : Execution ID: 95 (2.789 seconds)
> use default; (0.052 seconds)
> drop table if exists uniqdata; (1.122 seconds)
> CREATE TABLE uniqdata (cust_id int,cust_name String,active_emui_version string, dob timestamp, doj timestamp, bigint_column1 bigint,bigint_column2 bigint,decimal_column1 decimal(30,10), decimal_column2 decimal(36,36),double_column1 double, double_column2 double,integer_column1 int) stored as carbondata; (0.508 seconds)
> load data inpath 'hdfs://hacluster/user/prasanna/2000_UniqData.csv' into table uniqdata options('fileheader'='cust_id,cust_name,active_emui_version,dob,doj,bigint_column1,bigint_column2,decimal_column1,decimal_column2,double_column1,double_column2,integer_column1','bad_records_action'='force'); INFO : Execution ID: 108 (1.316 seconds)
> drop table if exists uniqdata_parquet; (0.668 seconds)
> CREATE TABLE uniqdata_parquet (cust_id int,cust_name String,active_emui_version string, dob timestamp, doj timestamp, bigint_column1 bigint,bigint_column2 bigint,decimal_column1 decimal(30,10), decimal_column2 decimal(36,36),double_column1 double, double_column2 double,integer_column1 int) stored as parquet; (0.397 seconds)
> insert into uniqdata_parquet select * from uniqdata; INFO : Execution ID: 116 (4.805 seconds)
> drop table if exists uniqdata_orc; (0.553 seconds)
> CREATE TABLE uniqdata_orc (cust_id int,cust_name String,active_emui_version string, dob timestamp, doj timestamp, bigint_column1 bigint,bigint_column2 bigint,decimal_column1 decimal(30,10), decimal_column2 decimal(36,36),double_column1 double, double_column2 double,integer_column1 int) using orc; (0.396 seconds)
> insert into uniqdata_orc select * from uniqdata; INFO : Execution ID: 122 (3.403 seconds)
> use rps; (0.06 seconds)
> Alter table uniqdata add segment options ('path'='hdfs://hacluster/user/hive/warehouse/uniqdata_parquet','format'='parquet'); INFO : Execution ID: 126 (1.511 seconds)
> Alter table uniqdata add segment options
[GitHub] [carbondata] VenuReddy2103 commented on a change in pull request #4012: [CARBONDATA-4051] Geo spatial index algorithm improvement and UDFs enhancement
VenuReddy2103 commented on a change in pull request #4012: URL: https://github.com/apache/carbondata/pull/4012#discussion_r529216859

## File path: geo/src/main/java/org/apache/carbondata/geo/GeoConstants.java

@@ -26,4 +26,31 @@ private GeoConstants() {
// GeoHash type Spatial Index
public static final String GEOHASH = "geohash";
+
+ // Regular expression to parse input polygons for IN_POLYGON_LIST
+ // public static final String POLYGON_REG_EXPRESSION = "POLYGON \\(\\(.*?\\)\\)";
+ public static final String POLYGON_REG_EXPRESSION = "(?<=POLYGON \\(\\()(.*?)(?=(\\)\\)))";
+
+ // Regular expression to parse input polylines for IN_POLYLINE_LIST
+ public static final String POLYLINE_REG_EXPRESSION = "LINESTRING \\(.*?\\)";
+
+ // Regular expression to parse input rangelists for IN_POLYGON_RANGE_LIST
+ public static final String RANGELIST_REG_EXPRESSION = "(?<=RANGELIST \\()(.*?)(?=\\))";
+
+ // delimiter of input points or ranges
+ public static final String DEFAULT_DELIMITER = ",";
+
+ // conversion factor of angle to radian
+ public static final double CONVERT_FACTOR = 180.0;
+ // Earth radius
+ public static final double EARTH_RADIUS = 6371004.0;

Review comment: Can remove the EARTH_RADIUS const def from GeoHashIndex.java now.
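The two numeric constants in this hunk, CONVERT_FACTOR (degrees per π radians) and EARTH_RADIUS (meters), are the standard ingredients of degree-to-radian conversion and great-circle distance. As an illustration only — this is not CarbonData's code — a minimal haversine sketch built from the same values:

```python
import math

CONVERT_FACTOR = 180.0    # degrees per pi radians, as in GeoConstants
EARTH_RADIUS = 6371004.0  # mean Earth radius in meters, as in GeoConstants

def to_radians(degrees: float) -> float:
    # conversion factor of angle to radian
    return degrees * math.pi / CONVERT_FACTOR

def haversine_m(lat1, lon1, lat2, lon2):
    # Great-circle distance in meters between two (lat, lon) points in degrees,
    # using the standard haversine formula.
    phi1, phi2 = to_radians(lat1), to_radians(lat2)
    dphi = to_radians(lat2 - lat1)
    dlmb = to_radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + \
        math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS * math.asin(math.sqrt(a))
```

For instance, two equatorial points 90° of longitude apart come out to EARTH_RADIUS * π / 2, roughly 10,007 km.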
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4020: [CARBONDATA-4054] Support data size control for minor compaction
CarbonDataQA2 commented on pull request #4020: URL: https://github.com/apache/carbondata/pull/4020#issuecomment-732662834 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4871/
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4020: [CARBONDATA-4054] Support data size control for minor compaction
CarbonDataQA2 commented on pull request #4020: URL: https://github.com/apache/carbondata/pull/4020#issuecomment-732662506 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/3118/
[GitHub] [carbondata] marchpure commented on pull request #4004: [WIP] update benchmark
marchpure commented on pull request #4004: URL: https://github.com/apache/carbondata/pull/4004#issuecomment-732653163 retest this please
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4019: [CARBONDATA-4053] Fix alter table rename column failed when column name is "a"
CarbonDataQA2 commented on pull request #4019: URL: https://github.com/apache/carbondata/pull/4019#issuecomment-732651208 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/3117/
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4004: [WIP] update benchmark
CarbonDataQA2 commented on pull request #4004: URL: https://github.com/apache/carbondata/pull/4004#issuecomment-732649478 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/3116/
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4019: [CARBONDATA-4053] Fix alter table rename column failed when column name is "a"
CarbonDataQA2 commented on pull request #4019: URL: https://github.com/apache/carbondata/pull/4019#issuecomment-732649284 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4870/
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4004: [WIP] update benchmark
CarbonDataQA2 commented on pull request #4004: URL: https://github.com/apache/carbondata/pull/4004#issuecomment-732648042 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4869/
[GitHub] [carbondata] VenuReddy2103 commented on a change in pull request #4010: [CARBONDATA-4050]Avoid redundant RPC calls to get file status when CarbonFile is instantiated with fileStatus construct
VenuReddy2103 commented on a change in pull request #4010: URL: https://github.com/apache/carbondata/pull/4010#discussion_r529192088

## File path: core/src/main/java/org/apache/carbondata/core/datastore/filesystem/AbstractDFSCarbonFile.java

@@ -541,7 +547,10 @@ public boolean createNewLockFile() throws IOException {
@Override public String[] getLocations() throws IOException {
BlockLocation[] blkLocations;
-FileStatus fileStatus = fileSystem.getFileStatus(path);
+FileStatus fileStatus = this.fileStatus;

Review comment: Please refer to the reply in the comment below.
[GitHub] [carbondata] Zhangshunyu opened a new pull request #4020: [CARBONDATA-4054] Support data size control for minor compaction
Zhangshunyu opened a new pull request #4020: URL: https://github.com/apache/carbondata/pull/4020

### Why is this PR needed?
Currently, minor compaction considers only the number of segments, while major compaction considers only the total size of segments. Consider a scenario where the user wants to trigger minor compaction by segment count but does not want to merge any segment whose data size exceeds a threshold (for example 2 GB), since merging such a large segment is unnecessary and time-consuming.

### What changes were proposed in this PR?
Add a parameter to control the size threshold of segments included in minor compaction, so that a segment is excluded from minor compaction once its data size exceeds the threshold. It can be set at system level and at table level; if not set, a default value is used.

### Does this PR introduce any user interface change?
- No

### Is any new testcase added?
- Yes
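The selection rule the PR describes — trigger minor compaction by segment count, but skip oversized segments — can be sketched as follows. This is an illustrative model only; the function name and threshold constant are hypothetical, not the PR's actual parameter.

```python
# Illustrative sketch of size-aware minor compaction selection.
# SIZE_THRESHOLD_BYTES stands in for the configurable system/table-level
# parameter the PR proposes (name hypothetical).
SIZE_THRESHOLD_BYTES = 2 * 1024 ** 3  # e.g. 2 GB

def select_for_minor_compaction(segments, level_threshold=4):
    """segments: list of (segment_id, size_in_bytes).
    Returns the segment ids to merge, or [] if too few qualify."""
    # Segments above the size threshold are excluded up front:
    # re-merging an already-large segment is costly and gains little.
    eligible = [(sid, size) for sid, size in segments
                if size <= SIZE_THRESHOLD_BYTES]
    # Minor compaction still triggers on the *count* of eligible segments.
    if len(eligible) >= level_threshold:
        return [sid for sid, _ in eligible]
    return []
```

With four small segments and one 3 GB segment, only the four small ones are merged; with a single small segment, nothing is.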
[jira] [Updated] (CARBONDATA-4051) Geo spatial index algorithm improvement and UDFs enhancement
[ https://issues.apache.org/jira/browse/CARBONDATA-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiayu Shen updated CARBONDATA-4051:
Attachment: Genex Cloud Carbon Spatial Index Specification.docx

> Geo spatial index algorithm improvement and UDFs enhancement
>
> Key: CARBONDATA-4051
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4051
> Project: CarbonData
> Issue Type: New Feature
> Reporter: Jiayu Shen
> Priority: Minor
> Attachments: Genex Cloud Carbon Spatial Index Specification.docx
> Time Spent: 1h 50m
> Remaining Estimate: 0h
>
> The requirement is from SEQ; related algorithms are provided by group Discovery.
> 1. Replace the geohash encoding algorithm, and reduce the required properties of CREATE TABLE. For example,
> {code:java}
> CREATE TABLE geoTable(
> timevalue BIGINT,
> longitude LONG,
> latitude LONG) COMMENT "This is a GeoTable"
> STORED AS carbondata
> TBLPROPERTIES ($customProperties 'SPATIAL_INDEX'='mygeohash',
> 'SPATIAL_INDEX.mygeohash.type'='geohash',
> 'SPATIAL_INDEX.mygeohash.sourcecolumns'='longitude, latitude',
> 'SPATIAL_INDEX.mygeohash.originLatitude'='39.832277',
> 'SPATIAL_INDEX.mygeohash.gridSize'='50',
> 'SPATIAL_INDEX.mygeohash.conversionRatio'='100'){code}
> 2. Add geo query UDFs
> query filter UDFs:
> * InPolygonList (List polygonList, OperationType opType)
> * InPolylineList (List polylineList, Float bufferInMeter)
> * InPolygonRangeList (List RangeList, OperationType opType)
> operations only support:
> * "OR", meaning the union of two polygons
> * "AND", meaning the intersection of two polygons
> geo util UDFs:
> * GeoIdToGridXy(Long geoId) : Pair
> * LatLngToGeoId(Long latitude, Long longitude) : Long
> * GeoIdToLatLng(Long geoId) : Pair
> * ToUpperLayerGeoId(Long geoId) : Long
> * ToRangeList (String polygon) : List
> 3. Currently GeoID is a column created internally for spatial tables; this PR will support the GeoID column being customized during LOAD/INSERT INTO. For example,
> {code:java}
> INSERT INTO geoTable SELECT 0,157542840,116285807,40084087;
> It used to be as below, where '855280799612' is generated internally:
> |mygeohash   |timevalue|longitude|latitude|
> |855280799612|157542840|116285807|40084087|
> but now it is:
> |mygeohash|timevalue|longitude|latitude|
> |0        |157542840|116285807|40084087|
> {code}
-- This message was sent by Atlassian Jira (v8.3.4#803005)
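The polygon filter UDFs listed in this issue (InPolygonList and friends) ultimately decide whether points fall inside polygons. As a language-neutral illustration of the underlying geometry test — not CarbonData's implementation, which works on GeoID ranges — here is the standard even-odd ray-casting check:

```python
def point_in_polygon(x, y, polygon):
    """polygon: list of (x, y) vertices in order; returns True if (x, y)
    lies inside. Standard even-odd ray-casting test."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray from (x, y) toward +x cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside
```

An odd number of edge crossings means the point is inside; an even number means it is outside.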
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #4019: [CARBONDATA-4053] Fix alter table rename column failed when column name is "a"
ajantha-bhat commented on a change in pull request #4019: URL: https://github.com/apache/carbondata/pull/4019#discussion_r529175578

## File path: integration/spark/src/main/scala/org/apache/spark/util/AlterTableUtil.scala

@@ -332,12 +332,26 @@ object AlterTableUtil {
tableProperties: mutable.Map[String, String],
oldColumnName: String,
newColumnName: String): Unit = {
+val columnProperties = Seq("NO_INVERTED_INDEX",
+ "INVERTED_INDEX",
+ "INDEX_COLUMNS",
+ "COLUMN_META_CACHE",
+ "DICTIONARY_INCLUDE",

Review comment: Also, INDEX_COLUMNS has been changed to SPATIAL_INDEX, I think; refer to docs/spatial-index-guide.md
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #4019: [CARBONDATA-4053] Fix alter table rename column failed when column name is "a"
ajantha-bhat commented on a change in pull request #4019: URL: https://github.com/apache/carbondata/pull/4019#discussion_r529174796

## File path: integration/spark/src/test/scala/org/apache/spark/carbondata/restructure/vectorreader/AlterTableColumnRenameTestCase.scala

@@ -38,6 +38,15 @@ class AlterTableColumnRenameTestCase extends QueryTest with BeforeAndAfterAll {
assert(null == carbonTable.getColumnByName("empname")) }
+ test("CARBONDATA-4053 test rename column when column name is a") {
+sql("create table simple_table(a int) stored as carbondata")
+sql("alter table simple_table change a a1 int")
+val carbonTable = CarbonMetadata.getInstance().getCarbonTable("default", "simple_table")

Review comment: This issue happens when the table has properties, say SORT_COLUMNS = "a,b". When the column name is changed from a to a1, it was changing SORT_COLUMNS to "a1" instead of "a1,b". So your current testcase cannot reproduce the issue; please add a testcase that fails without the fix by keeping some table properties.
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #4019: [CARBONDATA-4053] Fix alter table rename column failed when column name is "a"
ajantha-bhat commented on a change in pull request #4019: URL: https://github.com/apache/carbondata/pull/4019#discussion_r529174200

## File path: integration/spark/src/main/scala/org/apache/spark/util/AlterTableUtil.scala

@@ -332,12 +332,26 @@ object AlterTableUtil {
tableProperties: mutable.Map[String, String],
oldColumnName: String,
newColumnName: String): Unit = {
+val columnProperties = Seq("NO_INVERTED_INDEX",
+ "INVERTED_INDEX",
+ "INDEX_COLUMNS",
+ "COLUMN_META_CACHE",
+ "DICTIONARY_INCLUDE",

Review comment: We have removed DICTIONARY_INCLUDE and DICTIONARY_EXCLUDE from 2.0; please check the applicable table properties and keep only those. Also add anything newly introduced in 2.0.
[GitHub] [carbondata] jack86596 opened a new pull request #4019: [CARBONDATA-4053] Fix alter table rename column failed when column name is "a"
jack86596 opened a new pull request #4019: URL: https://github.com/apache/carbondata/pull/4019

### Why is this PR needed?
Alter table rename column failed because content in tblproperties was incorrectly replaced with the new column name, even when that content is not related to the column name.

### What changes were proposed in this PR?
Instead of calling the replace method on the property value directly, first filter out the properties that are related to column names, then find the matching old column name and replace it with the new name.

### Does this PR introduce any user interface change?
- No

### Is any new testcase added?
- Yes
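The hazard this PR fixes can be shown in a few lines. The sketch below is illustrative only (the property names are examples, not CarbonData's code): a blind substring replace over every tblproperties value corrupts values that merely contain the old column name, while the fixed approach restricts itself to column-list properties and replaces whole tokens.

```python
# Two ways to apply a column rename (old -> new) to table properties.
props = {"SORT_COLUMNS": "a,abc", "COMMENT": "a demo table"}

def naive_rename(props, old, new):
    # Buggy: substring replace also hits unrelated occurrences of "a",
    # e.g. inside the column name "abc" or inside the table comment.
    return {k: v.replace(old, new) for k, v in props.items()}

COLUMN_LIST_PROPS = {"SORT_COLUMNS"}  # example subset of column-related keys

def fixed_rename(props, old, new):
    # Fixed: only touch column-list properties, matching whole column names.
    out = dict(props)
    for key in COLUMN_LIST_PROPS & props.keys():
        cols = [c.strip() for c in props[key].split(",")]
        out[key] = ",".join(new if c == old else c for c in cols)
    return out
```

Renaming column a to a1, the naive version turns "a,abc" into "a1,a1bc" and mangles the comment, while the fixed version yields "a1,abc" and leaves the comment alone.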
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4004: [WIP] update benchmark
CarbonDataQA2 commented on pull request #4004: URL: https://github.com/apache/carbondata/pull/4004#issuecomment-732553078 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/3115/
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4004: [WIP] update benchmark
CarbonDataQA2 commented on pull request #4004: URL: https://github.com/apache/carbondata/pull/4004#issuecomment-732552402 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4868/
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4009: [CARBONDATA-4029] Fix Old Timestamp issue in alter add segement
CarbonDataQA2 commented on pull request #4009: URL: https://github.com/apache/carbondata/pull/4009#issuecomment-732364295 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/3114/
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4009: [CARBONDATA-4029] Fix Old Timestamp issue in alter add segement
CarbonDataQA2 commented on pull request #4009: URL: https://github.com/apache/carbondata/pull/4009#issuecomment-732363870 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4867/
[jira] [Commented] (CARBONDATA-3896) Throw an exception using an index server query
[ https://issues.apache.org/jira/browse/CARBONDATA-3896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17237544#comment-17237544 ] Karan commented on CARBONDATA-3896: Please share the details of the query for which you are getting the above error.

> Throw an exception using an index server query
>
> Key: CARBONDATA-3896
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3896
> Project: CarbonData
> Issue Type: Bug
> Components: core
> Affects Versions: 1.6.1
> Reporter: li
> Priority: Major
> Fix For: 1.6.1
>
> 2020-07-10 10:49:02 WARN Server:1853 - Unable to read call parameters for client 10.10.151.15 on connection protocol Server for rpcKind RPC_WRITABLE
> java.io.EOFException
> at java.io.DataInputStream.readFully(DataInputStream.java:197)
> at java.io.DataInputStream.readUTF(DataInputStream.java:609)
> at java.io.DataInputStream.readUTF(DataInputStream.java:564)
> at org.apache.carbondata.core.datamap.DistributableDataMapFormat.readFields(DistributableDataMapFormat.java:286)
> at org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:285)
> at org.apache.hadoop.ipc.WritableRpcEngine$Invocation.readFields(WritableRpcEngine.java:161)
> at org.apache.hadoop.ipc.Server$Connection.processRpcRequest(Server.java:1851)
> at org.apache.hadoop.ipc.Server$Connection.processOneRpc(Server.java:1783)
> at org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1541)
> at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:771)
> at org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:637)
> at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:608)
> 2020-07-10 10:49:02 INFO Server:780 - Socket Reader #1 for port 9596: readAndProcess from client 10.10.151.15 threw exception [org.apache.hadoop.ipc.RpcServerException: IPC server unable to read call parameters: null]
> 2020-07-10 10:50:00 WARN Server:1853 - Unable to read call parameters for client 10.10.151.15 on connection protocol Server for rpcKind RPC_WRITABLE
> java.io.EOFException
> at java.io.DataInputStream.readFully(DataInputStream.java:197)
> at java.io.DataInputStream.readUTF(DataInputStream.java:609)
> at java.io.DataInputStream.readUTF(DataInputStream.java:564)
> at org.apache.carbondata.core.datamap.DistributableDataMapFormat.readFields(DistributableDataMapFormat.java:286)
> at org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:285)
> at org.apache.hadoop.ipc.WritableRpcEngine$Invocation.readFields(WritableRpcEngine.java:161)
> at org.apache.hadoop.ipc.Server$Connection.processRpcRequest(Server.java:1851)
> at org.apache.hadoop.ipc.Server$Connection.processOneRpc(Server.java:1783)
> at org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1541)
> at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:771)
> at org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:637)
> at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:608)
> 2020-07-10 10:50:00 INFO Server:780 - Socket Reader #1 for port 9596: readAndProcess from client 10.10.151.15 threw exception [org.apache.hadoop.ipc.RpcServerException: IPC server unable to read call parameters: null]
[jira] [Commented] (CARBONDATA-4021) With Index server running, Upon executing count* we are getting the below error, after adding the parquet and ORC segment.
[ https://issues.apache.org/jira/browse/CARBONDATA-4021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17237543#comment-17237543 ] Karan commented on CARBONDATA-4021: --- Caching parquet or ORC segments in the Index Server is not supported. Please do not enable the index server when querying a carbon table that has parquet or ORC segments. Even if the Index Server is ON, please make sure that fallback is not disabled.

> With Index server running, upon executing count(*) we are getting the below error, after adding the parquet and ORC segment.
>
> Key: CARBONDATA-4021
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4021
> Project: CarbonData
> Issue Type: Bug
> Affects Versions: 2.0.0
> Reporter: Prasanna Ravichandran
> Priority: Major
>
> We see the issues below when the index server is enabled and index server fallback disable is configured as true. With count(*) we get the error below after adding the parquet and ORC segments.
>
> Queries and output:
>
> use rps;
> No rows selected (0.054 seconds)
>
> drop table if exists uniqdata;
> No rows selected (0.229 seconds)
>
> CREATE TABLE uniqdata (cust_id int, cust_name String, active_emui_version string, dob timestamp, doj timestamp, bigint_column1 bigint, bigint_column2 bigint, decimal_column1 decimal(30,10), decimal_column2 decimal(36,36), double_column1 double, double_column2 double, integer_column1 int) stored as carbondata;
> No rows selected (0.756 seconds)
>
> load data inpath 'hdfs://hacluster/user/prasanna/2000_UniqData.csv' into table uniqdata options('fileheader'='cust_id,cust_name,active_emui_version,dob,doj,bigint_column1,bigint_column2,decimal_column1,decimal_column2,double_column1,double_column2,integer_column1','bad_records_action'='force');
> INFO : Execution ID: 95
> No rows selected (2.789 seconds)
>
> use default;
> No rows selected (0.052 seconds)
>
> drop table if exists uniqdata;
> No rows selected (1.122 seconds)
>
> CREATE TABLE uniqdata (cust_id int, cust_name String, active_emui_version string, dob timestamp, doj timestamp, bigint_column1 bigint, bigint_column2 bigint, decimal_column1 decimal(30,10), decimal_column2 decimal(36,36), double_column1 double, double_column2 double, integer_column1 int) stored as carbondata;
> No rows selected (0.508 seconds)
>
> load data inpath 'hdfs://hacluster/user/prasanna/2000_UniqData.csv' into table uniqdata options('fileheader'='cust_id,cust_name,active_emui_version,dob,doj,bigint_column1,bigint_column2,decimal_column1,decimal_column2,double_column1,double_column2,integer_column1','bad_records_action'='force');
> INFO : Execution ID: 108
> No rows selected (1.316 seconds)
>
> drop table if exists uniqdata_parquet;
> No rows selected (0.668 seconds)
>
> CREATE TABLE uniqdata_parquet (cust_id int, cust_name String, active_emui_version string, dob timestamp, doj timestamp, bigint_column1 bigint, bigint_column2 bigint, decimal_column1 decimal(30,10), decimal_column2 decimal(36,36), double_column1 double, double_column2 double, integer_column1 int) stored as parquet;
> No rows selected (0.397 seconds)
>
> insert into uniqdata_parquet select * from uniqdata;
> INFO : Execution ID: 116
> No rows selected (4.805 seconds)
>
> drop table if exists uniqdata_orc;
> No rows selected (0.553 seconds)
>
> CREATE TABLE uniqdata_orc (cust_id int, cust_name String, active_emui_version string, dob timestamp, doj timestamp, bigint_column1 bigint, bigint_column2 bigint, decimal_column1 decimal(30,10), decimal_column2 decimal(36,36), double_column1 double, double_column2 double, integer_column1 int) using orc;
> No rows selected (0.396 seconds)
>
> insert into uniqdata_orc select * from uniqdata;
> INFO : Execution ID: 122
> No rows selected (3.403 seconds)
>
> use rps;
> No rows selected (0.06 seconds)
>
> Alter table uniqdata add segment options
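Karan's advice above comes down to two settings. A hedged carbon.properties sketch follows; the property names are taken from the CarbonData index server documentation, so verify them against the version you run before relying on them:

```properties
# carbon.properties -- sketch of the configuration discussed above.

# Enable the distributed index server for pruning.
carbon.enable.index.server=true

# Keep fallback ENABLED (i.e. do NOT disable it): when the index server
# cannot serve a request -- e.g. a table with added parquet/ORC segments --
# the query then falls back to driver-side pruning instead of failing.
carbon.disable.index.server.fallback=false
```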
[GitHub] [carbondata] ajantha-bhat commented on pull request #4009: [CARBONDATA-4029] Fix Old Timestamp issue in alter add segment
ajantha-bhat commented on pull request #4009: URL: https://github.com/apache/carbondata/pull/4009#issuecomment-732304718 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #4009: [CARBONDATA-4029] Fix Old Timestamp issue in alter add segment
ajantha-bhat commented on a change in pull request #4009: URL: https://github.com/apache/carbondata/pull/4009#discussion_r528870553

File path: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java

@@ -572,4 +574,18 @@ public ReadCommittedScope getReadCommitted(JobContext job, AbsoluteTableIdentifi
   public void setReadCommittedScope(ReadCommittedScope readCommittedScope) {
     this.readCommittedScope = readCommittedScope;
   }
+
+  public String getSegmentIdFromFilePath(String filePath) {

Review comment: ok
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4018: [wip]test
CarbonDataQA2 commented on pull request #4018: URL: https://github.com/apache/carbondata/pull/4018#issuecomment-732267359 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/3113/
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4004: [WIP] update benchmark
CarbonDataQA2 commented on pull request #4004: URL: https://github.com/apache/carbondata/pull/4004#issuecomment-732264929 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/3112/
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4018: [wip]test
CarbonDataQA2 commented on pull request #4018: URL: https://github.com/apache/carbondata/pull/4018#issuecomment-732263808 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4866/
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4004: [WIP] update benchmark
CarbonDataQA2 commented on pull request #4004: URL: https://github.com/apache/carbondata/pull/4004#issuecomment-732260135 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4865/
[GitHub] [carbondata] akashrn5 commented on pull request #4013: [WIP] Remove automatic data cleaning function from all features
akashrn5 commented on pull request #4013: URL: https://github.com/apache/carbondata/pull/4013#issuecomment-732219124 @QiangCai I think we need to wait before taking these changes in, because multiple people are already working on the same area.
[GitHub] [carbondata] marchpure commented on pull request #4012: [CARBONDATA-4051] Geo spatial index algorithm improvement and UDFs enhancement
marchpure commented on pull request #4012: URL: https://github.com/apache/carbondata/pull/4012#issuecomment-732209550 @MarvinLitt Please review
[GitHub] [carbondata] akashrn5 opened a new pull request #4018: [wip]test
akashrn5 opened a new pull request #4018: URL: https://github.com/apache/carbondata/pull/4018

### Why is this PR needed?

### What changes were proposed in this PR?

### Does this PR introduce any user interface change?
- No
- Yes. (please explain the change and update document)

### Is any new testcase added?
- No
- Yes
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4004: [WIP] update benchmark
CarbonDataQA2 commented on pull request #4004: URL: https://github.com/apache/carbondata/pull/4004#issuecomment-732162102 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/3111/
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4004: [WIP] update benchmark
CarbonDataQA2 commented on pull request #4004: URL: https://github.com/apache/carbondata/pull/4004#issuecomment-732158583 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4864/
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4004: [WIP] update benchmark
CarbonDataQA2 commented on pull request #4004: URL: https://github.com/apache/carbondata/pull/4004#issuecomment-732105514 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4863/
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4004: [WIP] update benchmark
CarbonDataQA2 commented on pull request #4004: URL: https://github.com/apache/carbondata/pull/4004#issuecomment-732104984 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/3110/
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4017: [CARBONDATA-4022] Fix invalid path issue for segment added through alter table add segment query.
CarbonDataQA2 commented on pull request #4017: URL: https://github.com/apache/carbondata/pull/4017#issuecomment-732098794 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/3107/
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4017: [CARBONDATA-4022] Fix invalid path issue for segment added through alter table add segment query.
CarbonDataQA2 commented on pull request #4017: URL: https://github.com/apache/carbondata/pull/4017#issuecomment-732095318 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4860/
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4009: [CARBONDATA-4029] Fix Old Timestamp issue in alter add segment
CarbonDataQA2 commented on pull request #4009: URL: https://github.com/apache/carbondata/pull/4009#issuecomment-732088276 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/3106/
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4009: [CARBONDATA-4029] Fix Old Timestamp issue in alter add segment
CarbonDataQA2 commented on pull request #4009: URL: https://github.com/apache/carbondata/pull/4009#issuecomment-732084303 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4859/
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4004: [WIP] update benchmark
CarbonDataQA2 commented on pull request #4004: URL: https://github.com/apache/carbondata/pull/4004#issuecomment-732048532 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/3108/
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4004: [WIP] update benchmark
CarbonDataQA2 commented on pull request #4004: URL: https://github.com/apache/carbondata/pull/4004#issuecomment-732046920 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4861/
[GitHub] [carbondata] Karan980 commented on a change in pull request #4017: [CARBONDATA-4022] Fix invalid path issue for segment added through alter table add segment query.
Karan980 commented on a change in pull request #4017: URL: https://github.com/apache/carbondata/pull/4017#discussion_r528569260

File path: core/src/main/java/org/apache/carbondata/core/indexstore/ExtendedBlocklet.java

@@ -219,7 +220,13 @@ public void deserializeFields(DataInput in, String[] locations, String tablePath
   if (in.readBoolean()) {
     indexUniqueId = in.readUTF();
   }
-  setFilePath(tablePath + getPath());
+  String filePath = getPath();
+  if (filePath.startsWith(CarbonCommonConstants.FILE_SEPARATOR) ||

Review comment: Done
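The hunk above replaces an unconditional `tablePath + getPath()` concatenation with a check on the stored path before prepending the table path. A minimal, self-contained sketch of that idea follows; the class and method names are illustrative, not the actual CarbonData implementation, and the exact condition in the PR is truncated in the quote above:

```java
// Sketch: only prepend the table path when the stored blocklet path is
// relative. Segments attached via "alter table add segment" may live outside
// the table path and carry an absolute location or a full URI.
public class BlockletPathResolver {

    private static final String FILE_SEPARATOR = "/";

    /** Returns an absolute file path for a blocklet, given the table path. */
    public static String resolve(String tablePath, String storedPath) {
        // Absolute paths ("/...") and URIs ("hdfs://...") are kept as-is;
        // prepending tablePath to them would produce an invalid path.
        if (storedPath.startsWith(FILE_SEPARATOR) || storedPath.contains("://")) {
            return storedPath;
        }
        // Otherwise the stored path is relative to the table path.
        return tablePath + FILE_SEPARATOR + storedPath;
    }

    public static void main(String[] args) {
        System.out.println(resolve("/warehouse/t1", "Fact/Part0/Segment_0/x.carbondata"));
        System.out.println(resolve("/warehouse/t1", "hdfs://ns/external/seg/x.carbondata"));
    }
}
```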
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4004: [WIP] update benchmark
CarbonDataQA2 commented on pull request #4004: URL: https://github.com/apache/carbondata/pull/4004#issuecomment-732037653 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/3103/
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4004: [WIP] update benchmark
CarbonDataQA2 commented on pull request #4004: URL: https://github.com/apache/carbondata/pull/4004#issuecomment-732032392 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4856/
[GitHub] [carbondata] Karan980 commented on a change in pull request #4009: [CARBONDATA-4029] Fix Old Timestamp issue in alter add segment
Karan980 commented on a change in pull request #4009: URL: https://github.com/apache/carbondata/pull/4009#discussion_r528558754

File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/addsegment/AddSegmentTestCase.scala

@@ -781,6 +781,26 @@ class AddSegmentTestCase extends QueryTest with BeforeAndAfterAll {
     sql(s"drop table $tableName")
   }
+
+  test("Test add segment by carbon written by sdk having old timestamp") {
+    sql(s"drop table if exists external_primitive")
+    sql(
+      s"""
+         |create table external_primitive (id int, name string, rank smallint, salary double,
+         | active boolean, dob date, doj timestamp, city string, dept string) stored as carbondata
+         |""".stripMargin)
+    val externalSegmentPathWithOldTimestamp = storeLocation + "/" +
+      "external_segment_with_old_timestamp"
+    val externalSegmentPath = storeLocation + "/" + "external_segment"
+    FileFactory.deleteAllFilesOfDir(new File(externalSegmentPath))
+    copy(externalSegmentPathWithOldTimestamp, externalSegmentPath)

Review comment: Done
[GitHub] [carbondata] Karan980 commented on a change in pull request #4009: [CARBONDATA-4029] Fix Old Timestamp issue in alter add segment
Karan980 commented on a change in pull request #4009: URL: https://github.com/apache/carbondata/pull/4009#discussion_r528558123

File path: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java

@@ -572,4 +574,18 @@ public ReadCommittedScope getReadCommitted(JobContext job, AbsoluteTableIdentifi
   public void setReadCommittedScope(ReadCommittedScope readCommittedScope) {
     this.readCommittedScope = readCommittedScope;
   }
+
+  public String getSegmentIdFromFilePath(String filePath) {

Review comment: getSegmentId() returns the segmentId from the segment's LoadMetadataDetails, which is never null. The segmentId is "null" only in the file name of a carbondata file written by the SDK, and there is no existing method to extract it from there.
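The comment above explains why a helper is needed: SDK-written files carry the literal token "null" in the segment-id slot of the file name, and nothing existing parses it out. A sketch of such a parser follows; it assumes the common CarbonData file-name layout `part-<x>-<taskNo>_batchno<n>-<bucket>-<segmentId>-<timestamp>.carbondata`, and is not the method added by the PR:

```java
// Sketch: extract the segment-id token from a CarbonData file path.
// Assumption: file names look like
//   part-0-0_batchno0-0-<segmentId>-<timestamp>.carbondata
// so the segment id is the second-to-last dash-separated token.
public class SegmentIdParser {

    public static String segmentIdFromFilePath(String filePath) {
        String name = filePath.substring(filePath.lastIndexOf('/') + 1);
        int dot = name.lastIndexOf('.');
        if (dot > 0) {
            name = name.substring(0, dot); // strip ".carbondata"
        }
        String[] parts = name.split("-");
        return parts.length >= 2 ? parts[parts.length - 2] : null;
    }

    public static void main(String[] args) {
        // A managed-table file: the segment id slot holds "2".
        System.out.println(segmentIdFromFilePath(
            "/t1/Fact/Part0/Segment_2/part-0-0_batchno0-0-2-1605000000000.carbondata"));
        // An SDK-written file: the segment id slot holds the literal "null".
        System.out.println(segmentIdFromFilePath(
            "/ext/part-0-123_batchno0-0-null-1605000000000.carbondata"));
    }
}
```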
[GitHub] [carbondata] Karan980 commented on a change in pull request #4009: [CARBONDATA-4029] Fix Old Timestamp issue in alter add segment
Karan980 commented on a change in pull request #4009: URL: https://github.com/apache/carbondata/pull/4009#discussion_r528556523

File path: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java

@@ -367,8 +368,9 @@ protected FileSplit makeSplit(String segmentId, String filePath, long start, lon
   String[] deleteDeltaFilePath = null;
   if (isIUDTable) {
     // In case IUD is not performed in this table avoid searching for
-    // invalidated blocks.
+    // invalidated blocks. No need to check validation for splits written by SDK.
+    String segmentId = getSegmentIdFromFilePath(inputSplit.getFilePath());

Review comment: Done
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4012: [CARBONDATA-4051] Geo spatial index algorithm improvement and UDFs enhancement
CarbonDataQA2 commented on pull request #4012: URL: https://github.com/apache/carbondata/pull/4012#issuecomment-732021580 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4854/
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4012: [CARBONDATA-4051] Geo spatial index algorithm improvement and UDFs enhancement
CarbonDataQA2 commented on pull request #4012: URL: https://github.com/apache/carbondata/pull/4012#issuecomment-732021180 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/3101/
[jira] [Updated] (CARBONDATA-4054) Size control of minor compaction
[ https://issues.apache.org/jira/browse/CARBONDATA-4054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ZHANGSHUNYU updated CARBONDATA-4054: Description: Currently, minor compaction only considers the number of segments, while major compaction only considers the total size of segments. But consider a scenario where a user wants minor compaction driven by the number of segments, yet does not want to merge any segment whose data size exceeds a threshold (for example 2 GB), since merging such large segments is unnecessary and time-consuming. So we need a parameter controlling the size threshold of segments included in minor compaction, so that a user can exclude a segment from minor compaction once its data size exceeds the threshold; of course a default value must be provided. was: h1. Currently, minor compaction only considers the number of segments, while major compaction only considers the total size of segments. But consider a scenario where a user wants minor compaction driven by the number of segments, yet does not want to merge any segment whose data size exceeds a threshold (for example 2 GB), since merging such large segments is unnecessary and time-consuming. So we need a parameter controlling the size threshold of segments included in minor compaction, so that a user can exclude a segment from minor compaction once its data size exceeds the threshold; of course a default value must be provided.

> Size control of minor compaction
>
> Key: CARBONDATA-4054
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4054
> Project: CarbonData
> Issue Type: Improvement
> Reporter: ZHANGSHUNYU
> Priority: Major
>
> Currently, minor compaction only considers the number of segments, while major compaction only considers the total size of segments. But consider a scenario where a user wants minor compaction driven by the number of segments, yet does not want to merge any segment whose data size exceeds a threshold (for example 2 GB), since merging such large segments is unnecessary and time-consuming.
> So we need a parameter controlling the size threshold of segments included in minor compaction, so that a user can exclude a segment from minor compaction once its data size exceeds the threshold; of course a default value must be provided.
[jira] [Created] (CARBONDATA-4054) Size control of minor compaction
ZHANGSHUNYU created CARBONDATA-4054: --- Summary: Size control of minor compaction Key: CARBONDATA-4054 URL: https://issues.apache.org/jira/browse/CARBONDATA-4054 Project: CarbonData Issue Type: Improvement Reporter: ZHANGSHUNYU

Currently, minor compaction only considers the number of segments, while major compaction only considers the total size of segments. But consider a scenario where a user wants minor compaction driven by the number of segments, yet does not want to merge any segment whose data size exceeds a threshold (for example 2 GB), since merging such large segments is unnecessary and time-consuming. So we need a parameter controlling the size threshold of segments included in minor compaction, so that a user can exclude a segment from minor compaction once its data size exceeds the threshold; of course a default value must be provided.
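The proposal above can be sketched as a simple selection step: pick minor-compaction candidates by count as before, but skip any segment above a size threshold. This is an illustrative sketch only; the names and the 2 GB default are assumptions, not the actual CarbonData implementation of CARBONDATA-4054:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of size control for minor compaction: segments larger than the
// threshold are excluded from the candidate list, since re-merging an
// already-large segment is costly and brings little benefit.
public class MinorCompactionSizeControl {

    /** Returns indexes of segments eligible for minor compaction. */
    public static List<Integer> eligibleSegments(long[] segmentSizesInMb, long thresholdInMb) {
        List<Integer> eligible = new ArrayList<>();
        for (int i = 0; i < segmentSizesInMb.length; i++) {
            if (segmentSizesInMb[i] <= thresholdInMb) {
                eligible.add(i);
            }
        }
        return eligible;
    }

    public static void main(String[] args) {
        long[] sizes = {120, 3000, 80, 200}; // segment sizes in MB
        // With a 2 GB (2048 MB) threshold, segment 1 (3000 MB) is skipped.
        System.out.println(eligibleSegments(sizes, 2048));
    }
}
```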
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4015: [CARBONDATA-4052] Handled insert overwrite scenario for SI
CarbonDataQA2 commented on pull request #4015: URL: https://github.com/apache/carbondata/pull/4015#issuecomment-732002235 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/3100/
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4015: [CARBONDATA-4052] Handled insert overwrite scenario for SI
CarbonDataQA2 commented on pull request #4015: URL: https://github.com/apache/carbondata/pull/4015#issuecomment-731997987 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4853/