[GitHub] [pinot] Jackie-Jiang commented on a diff in pull request #9333: Add an option to disable the creation of the forward index for a column

GitBox Tue, 20 Sep 2022 22:48:53 -0700


Jackie-Jiang commented on code in PR #9333:
URL: https://github.com/apache/pinot/pull/9333#discussion_r976058172



##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/loader/invertedindex/TextIndexHandler.java:
##########
@@ -174,18 +175,46 @@ private void createTextIndexForColumn(SegmentDirectory.Writer segmentWriter, Col
     // segmentDirectory is indicated to us by SegmentDirectoryPaths, we create lucene index there. There is no
     // further need to move around the lucene index directory since it is created with correct directory structure
     // based on segmentVersion.
-    try (ForwardIndexReader forwardIndexReader = LoaderUtils.getForwardIndexReader(segmentWriter, columnMetadata);
-        ForwardIndexReaderContext readerContext = forwardIndexReader.createContext();
-        TextIndexCreator textIndexCreator = indexCreatorProvider.newTextIndexCreator(IndexCreationContext.builder()
-            .withColumnMetadata(columnMetadata).withIndexDir(segmentDirectory).build().forTextIndex(_fstType, true))) {
-      if (columnMetadata.isSingleValue()) {
-        processSVField(segmentWriter, hasDictionary, forwardIndexReader, readerContext, textIndexCreator, numDocs,
-            columnMetadata);
+    try (TextIndexCreator textIndexCreator = indexCreatorProvider.newTextIndexCreator(
+        IndexCreationContext.builder().withColumnMetadata(columnMetadata).withIndexDir(segmentDirectory).build()
+            .forTextIndex(_fstType, true))) {
+      boolean forwardIndexDisabled = !segmentWriter.hasIndexFor(columnName, ColumnIndexType.FORWARD_INDEX);
+      if (forwardIndexDisabled) {
+        try (Dictionary dictionary = LoaderUtils.getDictionary(segmentWriter, columnMetadata)) {
+          // Create the text index if the dictionary length is 1 as this is for a default column (i.e. newly added
+          // column). For existing columns it is not possible to create the text index without forward index
+          Preconditions.checkState(dictionary.length() == 1, String.format("Creating text index for forward index "

Review Comment:
   I don't follow the second approach. For a default column, where all the 
values are the same, we should create a sorted index and a single-entry 
dictionary. This is the same behavior as generating a new segment with a 
column whose values are all the same.
   IMO the second approach will actually cause inconsistent behavior because 
it will have an inverted index with only a single entry, and the segment 
generated during the reload, or from the raw data without the column, will be 
different.
   I strongly suggest generating a sorted index + dictionary for the default 
column.
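
   To make the suggestion concrete, here is a minimal, self-contained sketch of 
what "sorted index + single-entry dictionary" means for a default column. It 
is a toy model only: the class `DefaultColumnSketch` and its methods are 
hypothetical names made up for illustration and are not Pinot APIs. It assumes 
the sorted index can be modeled as one inclusive docId range per dictId.

```java
/**
 * Toy model (NOT Pinot code) of a default column in which every document
 * holds the same value. All names here are hypothetical.
 */
public class DefaultColumnSketch {

  /** Single-entry dictionary: dictId 0 maps to the default value. */
  public static String[] buildDictionary(String defaultValue) {
    return new String[] {defaultValue};
  }

  /**
   * Sorted index modeled as one inclusive [startDocId, endDocId] range per
   * dictId. With a single dictionary entry, the one range covers every doc.
   */
  public static int[][] buildSortedIndex(int numDocs) {
    if (numDocs <= 0) {
      throw new IllegalArgumentException("numDocs must be positive");
    }
    return new int[][] {{0, numDocs - 1}};
  }

  /** Looks up the value of a document without needing a forward index. */
  public static String valueAt(String[] dictionary, int[][] sortedIndex, int docId) {
    for (int dictId = 0; dictId < sortedIndex.length; dictId++) {
      if (docId >= sortedIndex[dictId][0] && docId <= sortedIndex[dictId][1]) {
        return dictionary[dictId];
      }
    }
    throw new IllegalArgumentException("docId out of range: " + docId);
  }

  public static void main(String[] args) {
    String[] dictionary = buildDictionary("null");
    int[][] sortedIndex = buildSortedIndex(5);
    // Every docId resolves to the same default value.
    for (int docId = 0; docId < 5; docId++) {
      System.out.println(docId + " -> " + valueAt(dictionary, sortedIndex, docId));
    }
  }
}
```

   In this model the range in the sorted index is enough to resolve any docId 
to its value, so the layout matches what a freshly built segment would contain 
for a column whose values are all the same.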



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


