[GitHub] [carbondata] vikramahuja1001 commented on pull request #4005: [CARBONDATA-3978] Trash Folder support in carbondata

2020-11-12 Thread GitBox


vikramahuja1001 commented on pull request #4005:
URL: https://github.com/apache/carbondata/pull/4005#issuecomment-725988902


   retest this please



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (CARBONDATA-4049) Sometimes refresh table fails with error "table not found in database" error

2020-11-12 Thread Chetan Bhat (Jira)
Chetan Bhat created CARBONDATA-4049:
---

             Summary: Sometimes refresh table fails with error "table not found in database" error
                 Key: CARBONDATA-4049
                 URL: https://issues.apache.org/jira/browse/CARBONDATA-4049
             Project: CarbonData
          Issue Type: Bug
          Components: data-query
    Affects Versions: 2.1.0
         Environment: Spark 2.4.5
            Reporter: Chetan Bhat


In Carbon 2.1, the user creates a database.

The user copies an old-version store, such as 1.6.1, to the HDFS folder of that 
database in Carbon 2.1.

In Spark-SQL or Beeline, the user switches to the database with the "use db" 
command.

The refresh table command is executed on an old-version store table, and 
subsequent operations are then performed on that table.

Next, the refresh table command is executed on another old-version store table.

Issue: Sometimes refresh table fails with a "table not found in database" error.

spark-sql> refresh table brinjal_deleteseg;
*Error in query: Table or view 'brinjal_deleteseg' not found in database 
'1_6_1';*

 

Log:

2020-11-12 18:55:46,922 | INFO  | [main] | Created broadcast 171 from broadCastHadoopConf at CarbonRDD.scala:58 | org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)
2020-11-12 18:55:46,922 | INFO  | [main] | Created broadcast 171 from broadCastHadoopConf at CarbonRDD.scala:58 | org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)
2020-11-12 18:55:46,924 | INFO  | [main] | Pushed Filters:  | org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)
2020-11-12 18:55:46,939 | INFO  | [main] | Distributed Index server is enabled for 1_6_1.brinjal_update | org.apache.carbondata.core.util.CarbonProperties.isDistributedPruningEnabled(CarbonProperties.java:1742)
2020-11-12 18:55:46,939 | INFO  | [main] | Started block pruning ... | org.apache.carbondata.hadoop.api.CarbonInputFormat.getPrunedBlocklets(CarbonInputFormat.java:526)
2020-11-12 18:55:46,940 | INFO  | [main] | Distributed Index server is enabled for 1_6_1.brinjal_update | org.apache.carbondata.core.util.CarbonProperties.isDistributedPruningEnabled(CarbonProperties.java:1742)
2020-11-12 18:55:46,945 | INFO  | [main] | Successfully Created directory: hdfs://hacluster/tmp/indexservertmp/4b6353d4-65d7-4856-b3cd-b3bc11d15c55 | org.apache.carbondata.core.util.CarbonUtil.createTempFolderForIndexServer(CarbonUtil.java:3273)
2020-11-12 18:55:46,945 | INFO  | [main] | Temp folder path for Query ID: 4b6353d4-65d7-4856-b3cd-b3bc11d15c55 is org.apache.carbondata.core.datastore.filesystem.HDFSCarbonFile@b8f2e1bf | org.apache.carbondata.indexserver.DistributedIndexJob.execute(IndexJobs.scala:57)
2020-11-12 18:55:46,946 | ERROR | [main] | Configured port for index server is not a valid number | org.apache.carbondata.core.util.CarbonProperties.getIndexServerPort(CarbonProperties.java:1779)
java.lang.NumberFormatException: null
    at java.lang.Integer.parseInt(Integer.java:542)
    at java.lang.Integer.parseInt(Integer.java:615)
    at org.apache.carbondata.core.util.CarbonProperties.getIndexServerPort(CarbonProperties.java:1777)
    at org.apache.carbondata.indexserver.IndexServer$.serverPort$lzycompute(IndexServer.scala:88)
    at org.apache.carbondata.indexserver.IndexServer$.serverPort(IndexServer.scala:88)
    at org.apache.carbondata.indexserver.IndexServer$.getClient(IndexServer.scala:312)
    at org.apache.carbondata.indexserver.IndexServer$.getClient(IndexServer.scala:301)
    at org.apache.carbondata.indexserver.DistributedIndexJob$$anonfun$1.apply(IndexJobs.scala:83)
    at org.apache.carbondata.indexserver.DistributedIndexJob$$anonfun$1.apply(IndexJobs.scala:59)
    at org.apache.carbondata.spark.util.CarbonScalaUtil$.logTime(CarbonScalaUtil.scala:769)
    at org.apache.carbondata.indexserver.DistributedIndexJob.execute(IndexJobs.scala:58)
    at org.apache.carbondata.core.index.IndexUtil.executeIndexJob(IndexUtil.java:304)
    at org.apache.carbondata.hadoop.api.CarbonInputFormat.getDistributedSplit(CarbonInputFormat.java:431)
    at org.apache.carbondata.hadoop.api.CarbonInputFormat.getPrunedBlocklets(CarbonInputFormat.java:532)
    at org.apache.carbondata.hadoop.api.CarbonInputFormat.getDataBlocksOfSegment(CarbonInputFormat.java:477)
    at org.apache.carbondata.hadoop.api.CarbonTableInputFormat.getSplits(CarbonTableInputFormat.java:356)
    at org.apache.carbondata.hadoop.api.CarbonTableInputFormat.getSplits(CarbonTableInputFormat.java:204)
    at org.apache.carbondata.spark.rdd.CarbonScanRDD.internalGetPartitions(CarbonScanRDD.scala:159)
    at org.apache.carbondata.spark.rdd.CarbonRDD.getPartitions(CarbonRDD.scala:68)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:273)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:269)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:269)
    at org.apache.spark.rdd.MapPartitio
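
Note on the ERROR entry in the log: "java.lang.NumberFormatException: null" raised from Integer.parseInt under CarbonProperties.getIndexServerPort is the behaviour Integer.parseInt shows when it is handed a null string, which suggests the index server port property was not set in this session. The report does not establish whether this is related to the refresh table failure. The sketch below only illustrates that failure mode and one defensive way to read such a property. The class name, the property key, and the readPort helper are hypothetical and are not CarbonData APIs.

```java
import java.util.Optional;
import java.util.Properties;

public class IndexServerPortSketch {
    // Hypothetical property key, used only for this illustration.
    static final String PORT_KEY = "example.index.server.port";

    // Returns the configured port, or empty when the property is
    // missing, not numeric, or outside the valid TCP port range.
    static Optional<Integer> readPort(Properties props) {
        String raw = props.getProperty(PORT_KEY);
        if (raw == null) {
            return Optional.empty();
        }
        try {
            int port = Integer.parseInt(raw.trim());
            return (port >= 1 && port <= 65535)
                ? Optional.of(port)
                : Optional.empty();
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }

    // Reproduces the failure mode seen in the log: parsing a null
    // string throws NumberFormatException.
    static boolean parseNullThrows() {
        String missing = null;
        try {
            Integer.parseInt(missing);
            return false;
        } catch (NumberFormatException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        System.out.println("parseInt(null) throws: " + parseNullThrows());
        System.out.println("port when unset: " + readPort(props));
        props.setProperty(PORT_KEY, "9998");
        System.out.println("port when set: " + readPort(props));
    }
}
```

Returning an empty Optional lets the caller decide whether a missing port is an error or a reason to skip the distributed path. What CarbonData actually does in that case is not shown in the report.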

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #4005: [CARBONDATA-3978] Trash Folder support in carbondata

2020-11-12 Thread GitBox


CarbonDataQA1 commented on pull request #4005:
URL: https://github.com/apache/carbondata/pull/4005#issuecomment-726039924


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4797/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #4005: [CARBONDATA-3978] Trash Folder support in carbondata

2020-11-12 Thread GitBox


CarbonDataQA1 commented on pull request #4005:
URL: https://github.com/apache/carbondata/pull/4005#issuecomment-726042641


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/3039/
   







[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3988: [CARBONDATA-4037] Improve the table status and segment file writing

2020-11-12 Thread GitBox


ShreelekhyaG commented on a change in pull request #3988:
URL: https://github.com/apache/carbondata/pull/3988#discussion_r522092928



##
File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/cleanfiles/TestCleanFileCommand.scala
##
@@ -0,0 +1,170 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.spark.testsuite.cleanfiles
+
+import org.apache.spark.sql.Row
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.BeforeAndAfterAll
+
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.datastore.filesystem.{CarbonFile, CarbonFileFilter}
+import org.apache.carbondata.core.datastore.impl.FileFactory
+import org.apache.carbondata.core.index.Segment
+import org.apache.carbondata.core.metadata.{CarbonMetadata, SegmentFileStore}
+import org.apache.carbondata.core.util.CarbonProperties
+import org.apache.carbondata.core.util.path.CarbonTablePath
+
+class TestCleanFileCommand extends QueryTest with BeforeAndAfterAll {
+
+  var count = 0
+
+  override protected def beforeAll(): Unit = {
+    sql("DROP TABLE IF EXISTS cleantest")
+    CarbonProperties.getInstance()
+      .addProperty(CarbonCommonConstants.MAX_QUERY_EXECUTION_TIME, "0")
+  }
+
+  override protected def afterAll(): Unit = {
+    sql("DROP TABLE IF EXISTS cleantest")
+    CarbonProperties.getInstance()
+      .addProperty(CarbonCommonConstants.MAX_QUERY_EXECUTION_TIME,
+        CarbonCommonConstants.DEFAULT_MAX_QUERY_EXECUTION_TIME.toString)
+  }
+
+  test("test clean files command for index files") {
+    sql("drop table if exists cleantest")
+    sql("create table cleantest(id int, issue date) STORED AS carbondata")
+    sql("insert into table cleantest select '1','2000-02-01'")
+    assert(getIndexFileCountFromSegmentPath("default_cleantest", "0") == 1)
+    sql("clean files for table cleantest")
+    assert(getIndexFileCountFromSegmentPath("default_cleantest", "0") == 0)
+    sql("drop table if exists cleantest")
+  }
+
+  test("test clean files command for index files with SI") {
+    sql("drop table if exists cleantest")
+    sql("create table cleantest(id int, issue date, name string) STORED AS carbondata")
+    sql("insert into table cleantest select '1','2000-02-01', 'abc' ")
+    sql("create index indextable1 on table cleantest (name) AS 'carbondata'")
+    assert(getIndexFileCountFromSegmentPath("default_cleantest", "0") == 1)
+    assert(getIndexFileCountFromSegmentPath("default_indextable1", "0") == 1)
+    sql("clean files for table cleantest")
+    assert(getIndexFileCountFromSegmentPath("default_cleantest", "0") == 0)
+    assert(getIndexFileCountFromSegmentPath("default_indextable1", "0") == 0)
+    sql("drop table if exists cleantest")
+  }
+
+  test("test clean files command on partition table") {
+    sql("drop table if exists cleantest")
+    sql("create table cleantest(id int, issue date) STORED AS carbondata " +
+        "partitioned by (name string)")
+    sql("insert into table cleantest select '1','2000-02-01', 'abc' ")
+    assert(getIndexFileCountFromSegmentPath("default_cleantest", "0") == 1)
+    sql("clean files for table cleantest")
+    assert(getIndexFileCountFromSegmentPath("default_cleantest", "0") == 0)
+    sql("drop table if exists cleantest")
+  }
+
+  test("test clean files command for index files after update") {
+    sql("drop table if exists cleantest")
+    sql("create table cleantest(id int, name string) STORED AS carbondata")
+    sql("insert into table cleantest select '1', 'abc' ")
+    sql("insert into table cleantest select '2', 'abc' ")
+    assert(getIndexFileCountFromSegmentPath("default_cleantest", "0") == 1)
+    assert(getIndexFileCountFromSegmentPath("default_cleantest", "1") == 1)
+    sql("update cleantest set (name)=('xyz') where id=2")
+    assert(getIndexFileCountFromSegmentPath("default_cleantest", "1") == 2)
+    sql("clean files for table cleantest")
+    assert(getIndexFileCountFromSegmentPath("default_cleantest", "1") == 1)
+    checkAnswer(sql("select *from cleantest where id=2"), Seq(Row(2, "xyz")))
+  }
+
+  test("test clean files without mergeindex") {
+    CarbonProperties.getInstanc

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #4005: [CARBONDATA-3978] Trash Folder support in carbondata

2020-11-12 Thread GitBox


CarbonDataQA1 commented on pull request #4005:
URL: https://github.com/apache/carbondata/pull/4005#issuecomment-726164564


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4798/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #4005: [CARBONDATA-3978] Trash Folder support in carbondata

2020-11-12 Thread GitBox


CarbonDataQA1 commented on pull request #4005:
URL: https://github.com/apache/carbondata/pull/4005#issuecomment-726165740


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/3040/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #4005: [CARBONDATA-3978] Trash Folder support in carbondata

2020-11-12 Thread GitBox


CarbonDataQA1 commented on pull request #4005:
URL: https://github.com/apache/carbondata/pull/4005#issuecomment-726238728


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/3041/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #4005: [CARBONDATA-3978] Trash Folder support in carbondata

2020-11-12 Thread GitBox


CarbonDataQA1 commented on pull request #4005:
URL: https://github.com/apache/carbondata/pull/4005#issuecomment-726239379


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4799/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #4005: [CARBONDATA-3978] Trash Folder support in carbondata

2020-11-12 Thread GitBox


CarbonDataQA1 commented on pull request #4005:
URL: https://github.com/apache/carbondata/pull/4005#issuecomment-726241500


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/3042/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #4005: [CARBONDATA-3978] Trash Folder support in carbondata

2020-11-12 Thread GitBox


CarbonDataQA1 commented on pull request #4005:
URL: https://github.com/apache/carbondata/pull/4005#issuecomment-726298853


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4800/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #4005: [CARBONDATA-3978] Trash Folder support in carbondata

2020-11-12 Thread GitBox


CarbonDataQA1 commented on pull request #4005:
URL: https://github.com/apache/carbondata/pull/4005#issuecomment-726301744


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/3043/
   







[GitHub] [carbondata] QiangCai commented on pull request #3935: [CARBONDATA-3993] Remove auto data deletion in IUD processs

2020-11-12 Thread GitBox


QiangCai commented on pull request #3935:
URL: https://github.com/apache/carbondata/pull/3935#issuecomment-726450628


   @Pickupolddriver I will rework by basing on this PR and raise another PR.  
Please close this PR, thanks for your contribution.







[GitHub] [carbondata] Pickupolddriver commented on pull request #3935: [CARBONDATA-3993] Remove auto data deletion in IUD processs

2020-11-12 Thread GitBox


Pickupolddriver commented on pull request #3935:
URL: https://github.com/apache/carbondata/pull/3935#issuecomment-726555194


   > @Pickupolddriver I will rework by basing on this PR and raise another PR. 
Please close this PR, thanks for your contribution.
   
   OK, PR closed.







[GitHub] [carbondata] Pickupolddriver closed pull request #3935: [CARBONDATA-3993] Remove auto data deletion in IUD processs

2020-11-12 Thread GitBox


Pickupolddriver closed pull request #3935:
URL: https://github.com/apache/carbondata/pull/3935


   


