jack86596 commented on a change in pull request #4105:
URL: https://github.com/apache/carbondata/pull/4105#discussion_r595406480



##########
File path: 
index/secondary-index/src/test/scala/org/apache/carbondata/spark/testsuite/secondaryindex/TestIndexRepair.scala
##########
@@ -119,6 +119,19 @@ class TestIndexRepair extends QueryTest with 
BeforeAndAfterAll {
     sql("drop table if exists maintable")
   }
 
+  test("reindex command with stale files") {
+    sql("drop table if exists maintable")
+    sql("CREATE TABLE maintable(a INT, b STRING, c STRING) stored as 
carbondata")
+    sql("CREATE INDEX indextable1 on table maintable(c) as 'carbondata'")
+    sql("INSERT INTO maintable SELECT 1,'string1', 'string2'")
+    sql("INSERT INTO maintable SELECT 1,'string1', 'string2'")
+    sql("INSERT INTO maintable SELECT 1,'string1', 'string2'")
+    sql("DELETE FROM TABLE INDEXTABLE1 WHERE SEGMENT.ID IN(0,1,2)")
+    sql("REINDEX INDEX TABLE indextable1 ON MAINTABLE WHERE SEGMENT.ID IN 
(0,1)")

Review comment:
       1. "we shouldn't allow delete segments on index table itself." please 
refer to the second last comment right before your comment.
   2. "And during repair index, if have segment with partial data, we should 
delete the segment completely(segment folder, segment file, probably 
tablestatus entry for the segment as well) before proceeding with segment 
repair." You suggestion of course is right but too more complicate comparing to 
existing implementation, so please first do the complete analysis and design 
and then we can discuss and plan next step. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to