[GitHub] carbondata issue #2362: [CARBONDATA-2578] fixed memory leak inside CarbonRea...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2362 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5262/ ---
[GitHub] carbondata issue #2362: [CARBONDATA-2578] fixed memory leak inside CarbonRea...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2362 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6303/ ---
[GitHub] carbondata issue #2362: [CARBONDATA-2578] fixed memory leak inside CarbonRea...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2362 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5140/ ---
[GitHub] carbondata pull request #2328: [CARBONDATA-2504][STREAM] Support StreamSQL f...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2328#discussion_r194650822 --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/stream/CarbonShowStreamsCommand.scala --- @@ -0,0 +1,74 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.spark.sql.execution.command.stream + +import java.util.Date +import java.util.concurrent.TimeUnit + +import org.apache.spark.sql.{CarbonEnv, Row, SparkSession} +import org.apache.spark.sql.catalyst.TableIdentifier +import org.apache.spark.sql.catalyst.expressions.{Attribute, AttributeReference} +import org.apache.spark.sql.execution.command.MetadataCommand +import org.apache.spark.sql.types.StringType + +import org.apache.carbondata.stream.StreamJobManager + +/** + * Show all streams created or on a specified table + */ +case class CarbonShowStreamsCommand( +tableOp: Option[TableIdentifier] +) extends MetadataCommand { + override def output: Seq[Attribute] = { +Seq(AttributeReference("JobId", StringType, nullable = false)(), + AttributeReference("Status", StringType, nullable = false)(), + AttributeReference("Source", StringType, nullable = false)(), + AttributeReference("Sink", StringType, nullable = false)(), + AttributeReference("Start Time", StringType, nullable = false)(), + AttributeReference("Time Elapse", StringType, nullable = false)()) + } + + override def processMetadata(sparkSession: SparkSession): Seq[Row] = { +val jobs = tableOp match { + case None => StreamJobManager.getAllJobs.toSeq + case Some(table) => +val carbonTable = CarbonEnv.getCarbonTable(table.database, table.table)(sparkSession) +StreamJobManager.getAllJobs.filter { job => + job.sinkTable.equalsIgnoreCase(carbonTable.getTableName) && + job.sinkDb.equalsIgnoreCase(carbonTable.getDatabaseName) +}.toSeq +} + +jobs.map { job => + val elapsedTime = System.currentTimeMillis() - job.startTime + Row( +job.streamingQuery.id.toString, +if (job.streamingQuery.isActive) "RUNNING" else "FAILED", +s"${ job.sourceDb }.${ job.sourceTable }", +s"${ job.sinkDb }.${ job.sinkTable }", +new Date(job.startTime).toString, +String.format( + "%s days, %s hours, %s min, %s sec", + TimeUnit.MILLISECONDS.toDays(elapsedTime).toString, + TimeUnit.MILLISECONDS.toHours(elapsedTime).toString, 
--- End diff -- ok ---
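A note on the `Time Elapse` column in the diff above: the quoted hunk is cut off after the `toHours` call, so the following is not the PR's code. It is a minimal Java sketch, assuming the intent is to print whole days followed by the remaining hours, minutes, and seconds. The class and method names are hypothetical; only `java.util.concurrent.TimeUnit` and `String.format` are real APIs.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of CarbonData.
class ElapsedTimeSketch {

  // Splits a duration in milliseconds into days plus remainder hours/min/sec.
  static String format(long elapsedMillis) {
    long days = TimeUnit.MILLISECONDS.toDays(elapsedMillis);
    // toHours/toMinutes/toSeconds return totals, so take the remainder
    // within the next larger unit.
    long hours = TimeUnit.MILLISECONDS.toHours(elapsedMillis) % 24;
    long minutes = TimeUnit.MILLISECONDS.toMinutes(elapsedMillis) % 60;
    long seconds = TimeUnit.MILLISECONDS.toSeconds(elapsedMillis) % 60;
    return String.format("%d days, %d hours, %d min, %d sec",
        days, hours, minutes, seconds);
  }

  public static void main(String[] args) {
    // 93,784,000 ms = 1 day + 2 hours + 3 minutes + 4 seconds
    System.out.println(format(93_784_000L)); // prints "1 days, 2 hours, 3 min, 4 sec"
  }
}
```

Without the remainder step, a job that has run for 26 hours would show "1 days, 26 hours", because `TimeUnit.MILLISECONDS.toHours` returns the total number of hours.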
[GitHub] carbondata pull request #2328: [CARBONDATA-2504][STREAM] Support StreamSQL f...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2328#discussion_r194651326 --- Diff: integration/spark2/src/main/scala/org/apache/carbondata/stream/StreamJobManager.scala --- @@ -0,0 +1,131 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.carbondata.stream + +import java.util.concurrent.{CountDownLatch, TimeUnit} + +import scala.collection.mutable + +import org.apache.spark.sql.{DataFrame, SparkSession} +import org.apache.spark.sql.streaming.StreamingQuery + +import org.apache.carbondata.common.logging.LogServiceFactory +import org.apache.carbondata.core.metadata.schema.table.CarbonTable +import org.apache.carbondata.spark.StreamingOption +import org.apache.carbondata.streaming.CarbonStreamException + +object StreamJobManager { + private val LOGGER = LogServiceFactory.getLogService(this.getClass.getCanonicalName) + private val jobs = mutable.Map[String, StreamJobDesc]() + + /** + * Start a spark streaming query + * @param sparkSession session instance + * @param sourceTable stream source table + * @param sinkTable sink table to insert to + * @param query query string + * @param streamDf dataframe that containing the query from stream source table + * @param options options provided by user + * @return Job ID + */ + def startJob( + sparkSession: SparkSession, + sourceTable: CarbonTable, + sinkTable: CarbonTable, + query: String, + streamDf: DataFrame, + options: StreamingOption): String = { +val latch = new CountDownLatch(1) +var exception: Throwable = null +var job: StreamingQuery = null + +// start a new thread to run the streaming ingest job, the job will be running +// until user stops it by STOP STREAM JOB --- End diff -- fixed ---
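The hunk above declares a `CountDownLatch`, an exception holder, and a job handle before starting a thread, but it is truncated before the thread body. The pattern it appears to set up can be sketched in Java as follows. This is an illustration under assumptions, not the PR's Scala code: `LatchStartSketch` and `startAndAwait` are hypothetical names, and the worker here signals the latch as soon as the start call returns or throws, whereas a real streaming job would keep running on that thread afterwards.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical helper, not part of CarbonData or Spark.
class LatchStartSketch {

  // Runs `starter` on a new thread and blocks the caller until the start
  // attempt has either produced a handle or failed.
  static <T> T startAndAwait(Supplier<T> starter, long timeoutSeconds) {
    CountDownLatch latch = new CountDownLatch(1);
    AtomicReference<T> handle = new AtomicReference<>();
    AtomicReference<Throwable> failure = new AtomicReference<>();

    Thread worker = new Thread(() -> {
      try {
        handle.set(starter.get());
      } catch (Throwable t) {
        failure.set(t);          // remember the error for the caller
      } finally {
        latch.countDown();       // always release the waiting caller
      }
    });
    worker.setDaemon(true);
    worker.start();

    try {
      if (!latch.await(timeoutSeconds, TimeUnit.SECONDS)) {
        throw new IllegalStateException("job did not start in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting", e);
    }
    if (failure.get() != null) {
      // Surface the worker's failure on the calling thread.
      throw new RuntimeException("job failed to start", failure.get());
    }
    return handle.get();
  }

  public static void main(String[] args) {
    System.out.println(startAndAwait(() -> "job-1", 5));
  }
}
```

Counting down in `finally` matters: if the latch were released only on success, a failed start would leave the caller blocked until the timeout.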
[GitHub] carbondata pull request #2328: [CARBONDATA-2504][STREAM] Support StreamSQL f...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2328#discussion_r194651389 --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/stream/CarbonCreateStreamSourceCommand.scala --- @@ -0,0 +1,50 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.spark.sql.execution.command.stream + +import scala.collection.mutable + +import org.apache.spark.sql.{Row, SparkSession} +import org.apache.spark.sql.execution.command.{Field, MetadataCommand, TableNewProcessor} +import org.apache.spark.sql.execution.command.table.CarbonCreateTableCommand +import org.apache.spark.sql.parser.CarbonSpark2SqlParser + +/** + * This command is used to create Stream Source, which is implemented as a Carbon Table + */ +case class CarbonCreateStreamSourceCommand( +dbName: Option[String], +tableName: String, +fields: Seq[Field], +tblProperties: Map[String, String] +) extends MetadataCommand { + override def processMetadata(sparkSession: SparkSession): Seq[Row] = { +val tableModel = new CarbonSpark2SqlParser().prepareTableModel( + ifNotExistPresent = false, + dbName, + tableName, + fields, + Seq.empty, + mutable.Map[String, String](tblProperties.toSeq: _*), + None +) +val tableInfo = TableNewProcessor.apply(tableModel) --- End diff -- Create Stream Source is removed ---
[GitHub] carbondata pull request #2328: [CARBONDATA-2504][STREAM] Support StreamSQL f...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2328#discussion_r194665653 --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/parser/CarbonSpark2SqlParser.scala --- @@ -145,6 +149,55 @@ class CarbonSpark2SqlParser extends CarbonDDLSqlParser { CarbonAlterTableFinishStreaming(dbName, table) } + /** + * The syntax of CREATE STREAM SOURCE + * CREATE STREAM SOURCE [dbName.]tableName (schema list) + * [TBLPROPERTIES('KEY'='VALUE')] + */ + protected lazy val createStreamSource: Parser[LogicalPlan] = +CREATE ~> STREAM ~> SOURCE ~> (ident <~ ".").? ~ ident ~ +("(" ~> repsep(anyFieldDef, ",") <~ ")") ~ +(TBLPROPERTIES ~> "(" ~> repsep(loadOptions, ",") <~ ")").? <~ opt(";") ^^ { + case dbName ~ tableName ~ fields ~ map => +val tblProperties = map.getOrElse(List[(String, String)]()).toMap[String, String] +CarbonCreateStreamSourceCommand(dbName, tableName, fields, tblProperties) +} + + /** + * The syntax of CREATE STREAM + * CREATE STREAM ON TABLE [dbName.]tableName + * [STMPROPERTIES('KEY'='VALUE')] + * AS SELECT COUNT(COL1) FROM tableName + */ + protected lazy val createStream: Parser[LogicalPlan] = +CREATE ~> STREAM ~> ON ~> TABLE ~> (ident <~ ".").? ~ ident ~ +(STMPROPERTIES ~> "(" ~> repsep(loadOptions, ",") <~ ")").? ~ +(AS ~> restInput) <~ opt(";") ^^ { + case dbName ~ tableName ~ options ~ query => +val optionMap = options.getOrElse(List[(String, String)]()).toMap[String, String] +CarbonCreateStreamCommand(dbName, tableName, optionMap, query) +} + + /** + * The syntax of KILL STREAM + * KILL STREAM ON TABLE [dbName].tableName + */ + protected lazy val killStream: Parser[LogicalPlan] = --- End diff -- If the stream is dropped, the user needs to trigger CREATE STREAM again ---
[GitHub] carbondata pull request #2328: [CARBONDATA-2504][STREAM] Support StreamSQL f...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2328#discussion_r194666440 --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/stream/CarbonCreateStreamSourceCommand.scala --- @@ -0,0 +1,50 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.execution.command.stream + +import scala.collection.mutable + +import org.apache.spark.sql.{Row, SparkSession} +import org.apache.spark.sql.execution.command.{Field, MetadataCommand, TableNewProcessor} +import org.apache.spark.sql.execution.command.table.CarbonCreateTableCommand +import org.apache.spark.sql.parser.CarbonSpark2SqlParser + +/** + * This command is used to create Stream Source, which is implemented as a Carbon Table + */ +case class CarbonCreateStreamSourceCommand( +dbName: Option[String], +tableName: String, +fields: Seq[Field], +tblProperties: Map[String, String] +) extends MetadataCommand { + override def processMetadata(sparkSession: SparkSession): Seq[Row] = { +val tableModel = new CarbonSpark2SqlParser().prepareTableModel( --- End diff -- 1. ok ---
[GitHub] carbondata issue #2328: [CARBONDATA-2504][STREAM] Support StreamSQL for stre...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2328 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6304/ ---
[GitHub] carbondata issue #2328: [CARBONDATA-2504][STREAM] Support StreamSQL for stre...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2328 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5141/ ---
[GitHub] carbondata issue #2328: [CARBONDATA-2504][STREAM] Support StreamSQL for stre...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2328 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5263/ ---
[GitHub] carbondata issue #2362: [CARBONDATA-2578] fixed memory leak inside CarbonRea...
Github user kumarvishal09 commented on the issue: https://github.com/apache/carbondata/pull/2362 LGTM ---
[GitHub] carbondata pull request #2368: [CARBONDATA-2603] Fix: error handling during ...
GitHub user ajantha-bhat opened a pull request: https://github.com/apache/carbondata/pull/2368 [CARBONDATA-2603] Fix: error handling during reader build failure Problem: When CarbonReaderBuilder.build() fails, for example because an invalid projection causes query model creation to fail, the blocklet datamap for that table is not cleared. The next reader instance then uses the stale blocklet datamap and fails with an error. Solution: Clear the blocklet datamap if the reader build fails. Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily: - [ ] Any interfaces changed? NA - [ ] Any backward compatibility impacted? NA - [ ] Document update required? NA - [ ] Testing done: updated the UT test cases - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. NA You can merge this pull request into a Git repository by running: $ git pull https://github.com/ajantha-bhat/carbondata issue_fix Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/2368.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2368 commit 1fbdcdde84fc8a7650d63945a950e5fb59f665cf Author: ajantha-bhat Date: 2018-06-11T13:47:33Z [CARBONDATA-2603] Fix: error handling during reader build failure ---
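The fix described in this pull request, clearing cached state when a build fails so the next build starts clean, can be illustrated with a small Java sketch. Everything in it is a stand-in and an assumption: `ReaderBuildCleanupSketch`, `DATAMAP_CACHE`, and `buildReader` are hypothetical names that only model the idea, and none of this is the CarbonData API or the code merged in the PR.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model of "clear the cache entry if the build fails".
class ReaderBuildCleanupSketch {

  // Stand-in for per-table cached metadata, keyed by table path.
  static final Map<String, String> DATAMAP_CACHE = new ConcurrentHashMap<>();

  // Simulates a build: populate the cache, then validate the projection.
  static String buildReader(String tablePath, String[] projection,
      String[] schemaColumns) {
    DATAMAP_CACHE.put(tablePath, "datamap-for-" + tablePath);
    try {
      List<String> known = Arrays.asList(schemaColumns);
      for (String column : projection) {
        if (!known.contains(column)) {
          throw new IllegalArgumentException(
              "Unknown projection column: " + column);
        }
      }
      return "reader(" + tablePath + ")";
    } catch (RuntimeException e) {
      // Drop the entry created for this failed build, then rethrow so the
      // caller still sees the original error.
      DATAMAP_CACHE.remove(tablePath);
      throw e;
    }
  }

  public static void main(String[] args) {
    try {
      buildReader("/tmp/t1", new String[] {"c1", "c3"}, new String[] {"c1", "c2"});
    } catch (IllegalArgumentException e) {
      System.out.println("first build failed: " + e.getMessage());
    }
    System.out.println("cache cleared: " + !DATAMAP_CACHE.containsKey("/tmp/t1"));
  }
}
```

The cleanup sits in the `catch` block and rethrows, so a failed build leaves nothing behind while the caller still receives the original exception.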
[GitHub] carbondata pull request #2362: [CARBONDATA-2578] fixed memory leak inside Ca...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/2362 ---
[jira] [Resolved] (CARBONDATA-2578) RowBatch Object is present even after CarbonReader is closed.
[ https://issues.apache.org/jira/browse/CARBONDATA-2578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kumar vishal resolved CARBONDATA-2578. -- Resolution: Fixed Fix Version/s: 1.4.0 NONE > RowBatch Object is present even after CarbonReader is closed. > - > > Key: CARBONDATA-2578 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2578 > Project: CarbonData > Issue Type: Improvement >Reporter: Rahul Kumar >Assignee: Rahul Kumar >Priority: Major > Fix For: NONE, 1.4.0 > > Time Spent: 5h 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] carbondata issue #2368: [CARBONDATA-2603] Fix: error handling during reader ...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2368 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5264/ ---
[GitHub] carbondata issue #2368: [CARBONDATA-2603] Fix: error handling during reader ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2368 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6305/ ---
[GitHub] carbondata pull request #2207: [CARBONDATA-2428] Support flat folder for man...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2207#discussion_r194722392 --- Diff: core/src/main/java/org/apache/carbondata/core/metadata/SegmentFileStore.java --- @@ -140,15 +141,19 @@ public static String genSegmentFileName(String segmentId, String UUID) { /** * Write segment file to the metadata folder of the table + * * @param tablePath table path * @param segmentId segment id - * @param UUID a UUID string used to construct the segment file name + * @param UUID a UUID string used to construct the segment file name * @return segment file name */ public static String writeSegmentFile(String tablePath, String segmentId, String UUID) throws IOException { String segmentPath = CarbonTablePath.getSegmentPath(tablePath, segmentId); CarbonFile segmentFolder = FileFactory.getCarbonFile(segmentPath); +boolean supportFlatFolder = Boolean.parseBoolean(CarbonProperties.getInstance() --- End diff -- Yes, added the table property `flat_folder`; its default value is false. ---
[GitHub] carbondata issue #2368: [CARBONDATA-2603] Fix: error handling during reader ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2368 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5143/ ---
[GitHub] carbondata issue #2207: [CARBONDATA-2428] Support flat folder for managed ca...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2207 @xuchuanyin Yes, all features that work with the segment folder structure also work with the flat folder structure ---
[GitHub] carbondata issue #2207: [CARBONDATA-2428] Support flat folder for managed ca...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2207 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6307/ ---
[GitHub] carbondata issue #2207: [CARBONDATA-2428] Support flat folder for managed ca...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2207 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5145/ ---
[jira] [Updated] (CARBONDATA-2428) Support Flat folder structure in carbon.
[ https://issues.apache.org/jira/browse/CARBONDATA-2428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravindra Pesala updated CARBONDATA-2428: Description: 1. Flat folder stores all carbondata files flat under the table path. 2. It is controlled through the table property `flat_folder`, which is false by default. 3. It cannot be hybrid, so the user cannot change the property once the table is created. 4. A segment file is created for each load, under the MetaData folder under the table path. 5. The segment number is added as part of the carbondata and index files. 6. All datamap files are now created directly under the table path with dm IUD : Supported, but listing files during IUD may hurt performance. Compaction : Supported Delete Segment : No impact Clean files : No impact Alter table : No impact Pre Agg : The property needs to be inherited by the child table, so it also supports the flat folder structure. Partition : No impact on this feature, as it already has a flat folder structure. Streaming : Supports the flat folder structure only during handoff. The streaming segment location is unchanged. was:Currently carbondata writing happens in fixed path tablepath/Fact/Part0/Segment_NUM folder and it is not same as hive/parquet folder structure. This PR makes all files written will be inside tablepath, it does not maintain any segment folder structure. Only for partition it adds the folder. > Support Flat folder structure in carbon. > > > Key: CARBONDATA-2428 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2428 > Project: CarbonData > Issue Type: Improvement >Reporter: Ravindra Pesala >Priority: Major > Time Spent: 6h 50m > Remaining Estimate: 0h > > 1. Flat folder stores all carbondata files flat under the table path. > 2. It is controlled through the table property `flat_folder`, which is false > by default. > 3. It cannot be hybrid, so the user cannot change the property once the table is created. > 4. 
A segment file is created for each load, under the > MetaData folder under the table path. > 5. The segment number is added as part of the carbondata and index files. > 6. All datamap files are now created directly under the table path with > dm > > IUD : Supported, but listing files during IUD may hurt performance. > Compaction : Supported > Delete Segment : No impact > Clean files : No impact > Alter table : No impact > Pre Agg : The property needs to be inherited by the child table, so it also supports the flat > folder structure. > Partition : No impact on this feature, as it already has a flat folder structure. > Streaming : Supports the flat folder structure only during handoff. The streaming > segment location is unchanged. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
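To make the layout rule in this description concrete, here is a minimal Java sketch of how a writer could choose the data file directory from the `flat_folder` table property. It rests on assumptions: the class and method names are hypothetical, the properties are modelled as a plain `Map`, and the segment path simply follows the `tablepath/Fact/Part0/Segment_NUM` pattern quoted in the old description. It is not CarbonData's implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of the flat_folder switch.
class FlatFolderSketch {

  // The property defaults to false when it is absent.
  static boolean isFlatFolder(Map<String, String> tableProperties) {
    return Boolean.parseBoolean(
        tableProperties.getOrDefault("flat_folder", "false"));
  }

  // Flat layout: files go directly under the table path.
  // Default layout: files go under tablePath/Fact/Part0/Segment_<id>.
  static String dataFileDir(String tablePath, String segmentId,
      Map<String, String> tableProperties) {
    if (isFlatFolder(tableProperties)) {
      return tablePath;
    }
    return tablePath + "/Fact/Part0/Segment_" + segmentId;
  }

  public static void main(String[] args) {
    Map<String, String> props = new HashMap<>();
    System.out.println(dataFileDir("/store/db/t1", "0", props));
    props.put("flat_folder", "true");
    System.out.println(dataFileDir("/store/db/t1", "0", props));
  }
}
```

Because the property cannot be changed after table creation (item 3), a lookup like this gives the same answer for every load of a given table.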
[GitHub] carbondata issue #2368: [CARBONDATA-2603] Fix: error handling during reader ...
Github user kunal642 commented on the issue: https://github.com/apache/carbondata/pull/2368 retest this please ---
[GitHub] carbondata issue #2252: [CARBONDATA-2420][32K] Support string longer than 32...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2252 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5265/ ---
[GitHub] carbondata issue #2252: [CARBONDATA-2420][32K] Support string longer than 32...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2252 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6306/ ---
[jira] [Created] (CARBONDATA-2604) ArrayIndexOutOfBoundException during compaction after IUD in cluster
Rahul Kumar created CARBONDATA-2604: --- Summary: ArrayIndexOutOfBoundException during compaction after IUD in cluster Key: CARBONDATA-2604 URL: https://issues.apache.org/jira/browse/CARBONDATA-2604 Project: CarbonData Issue Type: Improvement Reporter: Rahul Kumar Assignee: Rahul Kumar *Exception :* !image-2018-06-12-19-19-05-257.png! *To reproduce the issue follow the following steps :* {quote} * *create table brinjal (imei string,AMSize string,channelsId string,ActiveCountry string, Activecity string,gamePointId double,deviceInformationId double,productionDate Timestamp,deliveryDate timestamp,deliverycharge double) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES('table_blocksize'='2000','sort_columns'='imei');* * *LOAD DATA INPATH '/user/loader/xyz.csv' INTO TABLE brinjal OPTIONS('DELIMITER'=',', 'QUOTECHAR'= '','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'= 'imei,deviceInformationId,AMSize,channelsId,ActiveCountry,Activecity,gamePointId,productionDate,deliveryDate,deliverycharge');* * *LOAD DATA INPATH '/user/loader/xyz.csv' INTO TABLE brinjal OPTIONS('DELIMITER'=',', 'QUOTECHAR'= '','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'= 'imei,deviceInformationId,AMSize,channelsId,ActiveCountry,Activecity,gamePointId,productionDate,deliveryDate,deliverycharge');* * *LOAD DATA INPATH '/user/loader/xyz.csv' INTO TABLE brinjal OPTIONS('DELIMITER'=',', 'QUOTECHAR'= '','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'= 'imei,deviceInformationId,AMSize,channelsId,ActiveCountry,Activecity,gamePointId,productionDate,deliveryDate,deliverycharge');* * *insert into brinjal select * from brinjal;* * *update brinjal set (AMSize)= ('8RAM size') where AMSize='4RAM size';* * *delete from brinjal where AMSize='8RAM size';* * *delete from table brinjal where segment.id IN(0);* * *clean files for table brinjal;* * *alter table brinjal compact 'minor';* * *alter table brinjal compact 'major';*{quote} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CARBONDATA-2604) getting ArrayIndexOutOfBoundException during compaction after IUD in cluster
[ https://issues.apache.org/jira/browse/CARBONDATA-2604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rahul Kumar updated CARBONDATA-2604: Summary: getting ArrayIndexOutOfBoundException during compaction after IUD in cluster (was: ArrayIndexOutOfBoundException during compaction after IUD in cluster) > getting ArrayIndexOutOfBoundException during compaction after IUD in cluster > > > Key: CARBONDATA-2604 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2604 > Project: CarbonData > Issue Type: Improvement >Reporter: Rahul Kumar >Assignee: Rahul Kumar >Priority: Major > > *Exception :* > !image-2018-06-12-19-19-05-257.png! > *To reproduce the issue follow the following steps :* > {quote} * *create table brinjal (imei string,AMSize string,channelsId > string,ActiveCountry string, Activecity string,gamePointId > double,deviceInformationId double,productionDate Timestamp,deliveryDate > timestamp,deliverycharge double) STORED BY 'org.apache.carbondata.format' > TBLPROPERTIES('table_blocksize'='2000','sort_columns'='imei');* > * *LOAD DATA INPATH '/user/loader/xyz.csv' INTO TABLE brinjal > OPTIONS('DELIMITER'=',', 'QUOTECHAR'= > '','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'= > 'imei,deviceInformationId,AMSize,channelsId,ActiveCountry,Activecity,gamePointId,productionDate,deliveryDate,deliverycharge');* > * *LOAD DATA INPATH '/user/loader/xyz.csv' INTO TABLE brinjal > OPTIONS('DELIMITER'=',', 'QUOTECHAR'= > '','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'= > 'imei,deviceInformationId,AMSize,channelsId,ActiveCountry,Activecity,gamePointId,productionDate,deliveryDate,deliverycharge');* > * *LOAD DATA INPATH '/user/loader/xyz.csv' INTO TABLE brinjal > OPTIONS('DELIMITER'=',', 'QUOTECHAR'= > '','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'= > 'imei,deviceInformationId,AMSize,channelsId,ActiveCountry,Activecity,gamePointId,productionDate,deliveryDate,deliverycharge');* > * *insert into brinjal select * from brinjal;* > * *update brinjal set (AMSize)= ('8RAM 
size') where AMSize='4RAM size';* > * *delete from brinjal where AMSize='8RAM size';* > * *delete from table brinjal where segment.id IN(0);* > * *clean files for table brinjal;* > * *alter table brinjal compact 'minor';* > * *alter table brinjal compact 'major';*{quote} > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] carbondata pull request #2369: [CARBONDATA-2604] getting ArrayIndexOutOfBoun...
GitHub user rahulforallp opened a pull request: https://github.com/apache/carbondata/pull/2369 [CARBONDATA-2604] getting ArrayIndexOutOfBoundException during compaction after IUD in cluster is fixed - [ ] Any interfaces changed? No - [ ] Any backward compatibility impacted? No - [ ] Document update required? No - [ ] Testing done => Yes, tested on cluster - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. NA You can merge this pull request into a Git repository by running: $ git pull https://github.com/rahulforallp/incubator-carbondata compaction_issue Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/2369.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2369 commit 9bb2aa97995a21b0ac36026b01fc02e885205399 Author: rahul Date: 2018-06-12T13:56:40Z [CARBONDATA-2604] getting ArrayIndexOutOfBoundException during compaction after IUD in cluster is fixed ---
[GitHub] carbondata issue #2252: [CARBONDATA-2420][32K] Support string longer than 32...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2252 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5144/ ---
[GitHub] carbondata issue #2368: [CARBONDATA-2603] Fix: error handling during reader ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2368 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5146/ ---
[GitHub] carbondata issue #2207: [CARBONDATA-2428] Support flat folder for managed ca...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2207 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5266/ ---
[GitHub] carbondata issue #2368: [CARBONDATA-2603] Fix: error handling during reader ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2368 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6308/ ---
[GitHub] carbondata issue #2368: [CARBONDATA-2603] Fix: error handling during reader ...
Github user kunal642 commented on the issue: https://github.com/apache/carbondata/pull/2368 LGTM ---
[GitHub] carbondata pull request #2368: [CARBONDATA-2603] Fix: error handling during ...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/2368 ---
[jira] [Resolved] (CARBONDATA-2603) if creation of one CarbonReader for non-transactional table fails then we are not able to create new object of CarbonReader
[ https://issues.apache.org/jira/browse/CARBONDATA-2603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kumar vishal resolved CARBONDATA-2603. -- Resolution: Fixed > if creation of one CarbonReader for non-transactional table fails then we > are not able to create new object of CarbonReader > > > Key: CARBONDATA-2603 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2603 > Project: CarbonData > Issue Type: Improvement >Reporter: Rahul Kumar >Assignee: Rahul Kumar >Priority: Major > Time Spent: 1.5h > Remaining Estimate: 0h > > *to reproduce follow the following steps :* > # create a carbonReader for non transactional Table and give the wrong > projection column name (so that it will fail). > # create another carbonReader for non-transactional table with correct > values. > Expectation : Second reader should be successfully created > Actual : creation of second carbonReader is failing with following exception. > > TestCase to reproduce the issue ; > {{Field[] fields = new Field[] { new Field("c1", "string"),}} > {{ new Field("c2", "int") };}} > {{Schema schema = new Schema(fields);}} > {{CarbonWriterBuilder builder = CarbonWriter.builder();}} > {{CarbonWriter carbonWriter =}} > {{ > builder.outputPath("D:/mydata").isTransactionalTable(false).uniqueIdentifier(12345)}} > {{ .buildWriterForCSVInput(schema);}} > {{carbonWriter.write(new String[] \{ "MNO", "100" });}} > {{carbonWriter.close();}} > {{Field[] fields1 = new Field[] { new Field("p1", "string"),}} > {{ new Field("p2", "int") };}} > {{Schema schema1 = new Schema(fields1);}} > {{CarbonWriterBuilder builder1 = CarbonWriter.builder();}} > {{CarbonWriter carbonWriter1 =}} > {{ > builder1.outputPath("D:/mydata1").isTransactionalTable(false).uniqueIdentifier(12345)}} > {{ .buildWriterForCSVInput(schema1);}} > {{carbonWriter1.write(new String[] \{ "PQR", "200" });}} > {{carbonWriter1.close();}} > {{try {}} > {{ CarbonReader reader =}} > {{ CarbonReader.builder("D:/mydata", "_temp").}} > {{ 
projection(new String[] \{ "c1", "c3" })}} > {{ .isTransactionalTable(false).build();}} > {{} catch (Exception e){}} > {{ System.out.println("Success");}} > {{}}} > {{CarbonReader reader1 =}} > {{ CarbonReader.builder("D:/mydata1", "_temp1")}} > {{ .projection(new String[] \{ "p1", "p2" })}} > {{ .isTransactionalTable(false).build();}} > {{while (reader1.hasNext()) {}} > {{ Object[] row1 = (Object[]) reader1.readNextRow();}} > {{ System.out.println(row1[0]);}} > {{ System.out.println(row1[1]);}} > {{}}} > {{reader1.close();}}{{}} > > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] carbondata issue #2369: [CARBONDATA-2604] getting ArrayIndexOutOfBoundExcept...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2369 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5267/ ---
[GitHub] carbondata issue #2369: [CARBONDATA-2604] getting ArrayIndexOutOfBoundExcept...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2369 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5268/ ---
[GitHub] carbondata issue #2369: [CARBONDATA-2604] getting ArrayIndexOutOfBoundExcept...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2369 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6310/ ---
[GitHub] carbondata issue #2369: [CARBONDATA-2604] getting ArrayIndexOutOfBoundExcept...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2369 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5148/ ---
[GitHub] carbondata pull request #2370: [CARBONDATA-2599]Use RowStreamParserImp as de...
GitHub user zzcclp opened a pull request:

    https://github.com/apache/carbondata/pull/2370

    [CARBONDATA-2599]Use RowStreamParserImp as default value of config 'carbon.stream.parser'

    The parser 'RowStreamParserImp' is used more often in real scenarios, so use it as the default value of the config 'carbon.stream.parser'.

    Be sure to complete the following checklist to help us incorporate your contribution quickly and easily:

     - [ ] Any interfaces changed?
     - [ ] Any backward compatibility impacted?
     - [ ] Document update required?
     - [ ] Testing done
            Please provide details on
            - Whether new unit test cases have been added or why no new tests are required?
            - How it is tested? Please attach test report.
            - Is it a performance related change? Please attach the performance test report.
            - Any additional information to help reviewers in testing this change.
     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zzcclp/carbondata CARBONDATA-2599

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2370.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #2370

commit 373786787fbad05c058fbb9c65e8604c92246105
Author: Zhang Zhichao <441586683@...>
Date: 2018-06-12T15:55:57Z

    [CARBONDATA-2599]Use RowStreamParserImp as default value of config 'carbon.stream.parser'

    The parser 'RowStreamParserImp' is used more often in real scenarios, so use it as the default value of the config 'carbon.stream.parser'.

---
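For readers of the archive, here is a minimal sketch of what changing a default value means for callers: an explicitly configured parser still wins, and only users who set nothing see the new behaviour. It uses plain `java.util.Properties`, not CarbonData's own configuration classes. The class `StreamParserConfig`, the method `resolveParser`, and the short value `"RowStreamParserImp"` are illustrative assumptions; the real default is presumably a fully qualified class name.

```java
import java.util.Properties;

// Hypothetical helper, not part of CarbonData: shows default-value lookup only.
final class StreamParserConfig {
    // Key name as quoted in the pull request description.
    static final String KEY = "carbon.stream.parser";

    // Placeholder for the new default proposed by the pull request (assumed short name).
    static final String DEFAULT_PARSER = "RowStreamParserImp";

    private StreamParserConfig() {
    }

    // Returns the configured parser, falling back to the default when the key is unset.
    static String resolveParser(Properties props) {
        return props.getProperty(KEY, DEFAULT_PARSER);
    }
}
```

With an empty `Properties` object, `resolveParser` returns the default; after `setProperty("carbon.stream.parser", ...)` it returns the caller's value, so existing explicit configurations would be unaffected by a change of default.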
[GitHub] carbondata issue #2370: [CARBONDATA-2599]Use RowStreamParserImp as default v...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2370 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5269/ ---
[GitHub] carbondata issue #2370: [CARBONDATA-2599]Use RowStreamParserImp as default v...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2370 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6312/ ---
[GitHub] carbondata issue #2370: [CARBONDATA-2599]Use RowStreamParserImp as default v...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2370 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5150/ ---
[GitHub] carbondata issue #2370: [CARBONDATA-2599]Use RowStreamParserImp as default v...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2370 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5270/ ---
[GitHub] carbondata issue #2314: [CARBONDATA-2309][DataLoad] Add strategy to generate...
Github user jackylk commented on the issue: https://github.com/apache/carbondata/pull/2314 LGTM ---
[GitHub] carbondata pull request #2314: [CARBONDATA-2309][DataLoad] Add strategy to g...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/2314 ---
[jira] [Resolved] (CARBONDATA-2309) Add strategy to generate bigger carbondata files in case of small amount of data
[ https://issues.apache.org/jira/browse/CARBONDATA-2309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacky Li resolved CARBONDATA-2309.
--
Resolution: Fixed
Fix Version/s: 1.4.1
               1.5.0

> Add strategy to generate bigger carbondata files in case of small amount of data
>
> Key: CARBONDATA-2309
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2309
> Project: CarbonData
> Issue Type: Improvement
> Components: data-load
> Reporter: xuchuanyin
> Assignee: wangsen
> Priority: Major
> Fix For: 1.5.0, 1.4.1
> Time Spent: 3h 20m
> Remaining Estimate: 0h
>
> In some scenarios the amount of input data to load is small, but carbondata still distributes it to every executor (node) for local sort, so each executor generates small carbondata files.
> In extreme cases, if the cluster is big enough or the amount of data is small enough, a carbondata file contains only one blocklet or page.
> I think a new strategy should be introduced to solve this problem.
> The new strategy should:
> # be able to control the minimum amount of input data for each node
> # ignore data locality; otherwise it may always choose a small portion of particular nodes

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
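The two requirements in the issue can be sketched as follows. This is not the CarbonData implementation: `MinSizeAssigner`, `nodesToUse`, and `assign` are hypothetical names, and the greedy least-loaded placement is just one simple way to ignore locality. The idea is to use only as many nodes as the total input can feed with the configured minimum, then spread blocks across those nodes by size alone.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a "minimum input per node" assignment that ignores data locality.
final class MinSizeAssigner {
    private MinSizeAssigner() {
    }

    // Number of nodes to use so that each can receive roughly minBytesPerNode,
    // capped at nodeCount and never less than one.
    static int nodesToUse(long totalBytes, long minBytesPerNode, int nodeCount) {
        if (nodeCount <= 0 || minBytesPerNode <= 0) {
            throw new IllegalArgumentException("nodeCount and minBytesPerNode must be positive");
        }
        long byMinimum = totalBytes / minBytesPerNode;
        return (int) Math.max(1, Math.min(nodeCount, byMinimum));
    }

    // Returns, for each used node, the indices of the blocks assigned to it.
    // Each block goes to the currently least-loaded node (ties go to the lowest index).
    static List<List<Integer>> assign(long[] blockBytes, long minBytesPerNode, int nodeCount) {
        long total = 0;
        for (long b : blockBytes) {
            total += b;
        }
        int used = nodesToUse(total, minBytesPerNode, nodeCount);
        long[] load = new long[used];
        List<List<Integer>> result = new ArrayList<>();
        for (int n = 0; n < used; n++) {
            result.add(new ArrayList<>());
        }
        for (int i = 0; i < blockBytes.length; i++) {
            int target = 0;
            for (int n = 1; n < used; n++) {
                if (load[n] < load[target]) {
                    target = n;
                }
            }
            result.get(target).add(i);
            load[target] += blockBytes[i];
        }
        return result;
    }
}
```

For example, with three 100-byte blocks, a 256-byte minimum, and ten available nodes, all three blocks land on a single node instead of being spread over three. Because blocks are not split, the per-node minimum is a target rather than a guarantee.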
[GitHub] carbondata issue #2328: [CARBONDATA-2504][STREAM] Support StreamSQL for stre...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2328 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6313/ ---
[GitHub] carbondata issue #2328: [CARBONDATA-2504][STREAM] Support StreamSQL for stre...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2328 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5271/ ---
[GitHub] carbondata issue #2328: [CARBONDATA-2504][STREAM] Support StreamSQL for stre...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2328 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5151/ ---
[GitHub] carbondata issue #2370: [CARBONDATA-2599]Use RowStreamParserImp as default v...
Github user zzcclp commented on the issue: https://github.com/apache/carbondata/pull/2370 retest this please ---
[GitHub] carbondata issue #2370: [CARBONDATA-2599]Use RowStreamParserImp as default v...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2370 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5152/ ---
[GitHub] carbondata issue #2370: [CARBONDATA-2599]Use RowStreamParserImp as default v...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2370 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6314/ ---
[GitHub] carbondata issue #2252: [CARBONDATA-2420][32K] Support string longer than 32...
Github user xuchuanyin commented on the issue: https://github.com/apache/carbondata/pull/2252 retest it please ---
[GitHub] carbondata issue #2370: [CARBONDATA-2599]Use RowStreamParserImp as default v...
Github user zzcclp commented on the issue: https://github.com/apache/carbondata/pull/2370 retest this please ---
[GitHub] carbondata issue #2370: [CARBONDATA-2599]Use RowStreamParserImp as default v...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2370 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6315/ ---
[GitHub] carbondata issue #2370: [CARBONDATA-2599]Use RowStreamParserImp as default v...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2370 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5153/ ---
[GitHub] carbondata issue #2328: [CARBONDATA-2504][STREAM] Support StreamSQL for stre...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2328 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6316/ ---
[GitHub] carbondata issue #2328: [CARBONDATA-2504][STREAM] Support StreamSQL for stre...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2328 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5154/ ---
[GitHub] carbondata issue #2328: [CARBONDATA-2504][STREAM] Support StreamSQL for stre...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2328 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5272/ ---
[GitHub] carbondata pull request #2371: [HOTFIX] fix java style errors
GitHub user zzcclp opened a pull request:

    https://github.com/apache/carbondata/pull/2371

    [HOTFIX] fix java style errors

    Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily:

     - [ ] Any interfaces changed?
     - [ ] Any backward compatibility impacted?
     - [ ] Document update required?
     - [ ] Testing done
            Please provide details on
            - Whether new unit test cases have been added or why no new tests are required?
            - How it is tested? Please attach test report.
            - Is it a performance related change? Please attach the performance test report.
            - Any additional information to help reviewers in testing this change.
     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zzcclp/carbondata HOTFIX_javastyle_error

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2371.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #2371

commit dd5a981df58c563116815e0e98ad677c3f525a4e
Author: Zhang Zhichao <441586683@...>
Date: 2018-06-13T04:13:41Z

    [HOTFIX] fix java style errors

---
[GitHub] carbondata issue #2371: [HOTFIX] fix java style errors
Github user zzcclp commented on the issue: https://github.com/apache/carbondata/pull/2371 @jackylk @ravipesala please review and merge this pr to fix build errors ---
[GitHub] carbondata issue #2371: [HOTFIX] fix java style errors
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2371 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6317/ ---
[GitHub] carbondata issue #2371: [HOTFIX] fix java style errors
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2371 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5155/ ---
[GitHub] carbondata issue #2371: [HOTFIX] fix java style errors
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2371 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5273/ ---
[GitHub] carbondata issue #2369: [CARBONDATA-2604] getting ArrayIndexOutOfBoundExcept...
Github user rahulforallp commented on the issue: https://github.com/apache/carbondata/pull/2369 retest sdv please ---
[GitHub] carbondata issue #2369: [CARBONDATA-2604] getting ArrayIndexOutOfBoundExcept...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2369 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5274/ ---
[GitHub] carbondata issue #2371: [HOTFIX] fix java style errors
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2371 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6318/ ---
[GitHub] carbondata issue #2371: [HOTFIX] fix java style errors
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2371 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5156/ ---