jack86596 commented on a change in pull request #4105:
URL: https://github.com/apache/carbondata/pull/4105#discussion_r595406480



##########
File path: index/secondary-index/src/test/scala/org/apache/carbondata/spark/testsuite/secondaryindex/TestIndexRepair.scala
##########
@@ -119,6 +119,19 @@ class TestIndexRepair extends QueryTest with BeforeAndAfterAll {
     sql("drop table if exists maintable")
   }
 
+  test("reindex command with stale files") {
+    sql("drop table if exists maintable")
+    sql("CREATE TABLE maintable(a INT, b STRING, c STRING) stored as carbondata")
+    sql("CREATE INDEX indextable1 on table maintable(c) as 'carbondata'")
+    sql("INSERT INTO maintable SELECT 1,'string1', 'string2'")
+    sql("INSERT INTO maintable SELECT 1,'string1', 'string2'")
+    sql("INSERT INTO maintable SELECT 1,'string1', 'string2'")
+    sql("DELETE FROM TABLE INDEXTABLE1 WHERE SEGMENT.ID IN(0,1,2)")
+    sql("REINDEX INDEX TABLE indextable1 ON MAINTABLE WHERE SEGMENT.ID IN (0,1)")

Review comment:
   1. "we shouldn't allow delete segments on index table itself." Please
refer to the second-to-last comment, right before yours. Anyone who has
resolved production issues would not say this. Thousands of query failures
have been caused solely by broken SI segments. We need to first delete the
broken SI segment and then repair it (over the last two to three years there
have been countless issues caused by SI segments that were broken or out of
sync with the main table). So please get to know how customers actually use
the software, rather than building it without that knowledge. And when
coding, please also take the maintainer's point of view and implement the
feature with maintainability in mind. Thanks.
   2. "And during repair index, if have segment with partial data, we should 
delete the segment completely(segment folder, segment file, probably 
tablestatus entry for the segment as well) before proceeding with segment 
repair." Your suggestion is of course right, but it is much more complicated
than the existing implementation, so please first do a complete analysis and
design, and then we can discuss and plan the next step.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

