[incubator-pinot] branch upsert-pr-land updated (5c31201 -> a0f4fba)
This is an automated email from the ASF dual-hosted git repository. jamesshao pushed a change to branch upsert-pr-land in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git. from 5c31201 frst update per cr add a0f4fba adjust how schema work for upsert No new revisions were added by this update. Summary of changes: .../org/apache/pinot/common/data/SchemaTest.java | 138 - .../apache/pinot/spi/data/IngestionModeConfig.java | 114 + .../java/org/apache/pinot/spi/data/Schema.java | 95 +++--- 3 files changed, 292 insertions(+), 55 deletions(-) create mode 100644 pinot-spi/src/main/java/org/apache/pinot/spi/data/IngestionModeConfig.java - To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org
[GitHub] [incubator-pinot] fx19880617 commented on issue #5190: Nightly publish to bintray
fx19880617 commented on issue #5190: Nightly publish to bintray URL: https://github.com/apache/incubator-pinot/pull/5190#issuecomment-606407522 overall lgtm This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org
[incubator-pinot] branch master updated (8dfa51a -> 9cb716f)
This is an automated email from the ASF dual-hosted git repository. jlli pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git. from 8dfa51a Lucene DocId to PinotDocId cache to improve performance (#5177) add 9cb716f Nightly publish to bintray (#5190) No new revisions were added by this update. Summary of changes: .travis.yml| 38 +++--- .../.ci.settings.xml | 24 ++ .travis_install.sh => .travis/.travis_install.sh | 6 ++-- .../.travis_nightly_build.sh | 23 +++-- .../.travis_quickstart.sh | 0 .../.travis_quickstart_openjdk.sh | 4 +-- .../.travis_set_deploy_build_opts.sh | 14 +++- .travis_test.sh => .travis/.travis_test.sh | 6 ++-- pinot-broker/pom.xml | 6 ++-- pinot-clients/pinot-java-client/pom.xml| 6 ++-- pinot-clients/pom.xml | 7 ++-- pinot-common/pom.xml | 6 ++-- pinot-controller/pom.xml | 6 ++-- pinot-core/pom.xml | 6 ++-- pinot-distribution/pinot-source-assembly.xml | 2 +- pinot-distribution/pom.xml | 8 +++-- pinot-integration-tests/pom.xml| 6 ++-- pinot-minion/pom.xml | 6 ++-- pinot-perf/pom.xml | 6 ++-- .../pinot-batch-ingestion-common/pom.xml | 9 +++-- .../pinot-batch-ingestion-hadoop/pom.xml | 9 +++-- .../pinot-batch-ingestion-spark/pom.xml| 9 +++-- .../pinot-batch-ingestion-standalone/pom.xml | 9 +++-- pinot-plugins/pinot-batch-ingestion/pom.xml| 7 ++-- .../v0_deprecated/pinot-hadoop/pom.xml | 8 +++-- .../v0_deprecated/pinot-ingestion-common/pom.xml | 9 +++-- .../v0_deprecated/pinot-spark/pom.xml | 8 +++-- .../pinot-batch-ingestion/v0_deprecated/pom.xml| 7 ++-- pinot-plugins/pinot-file-system/pinot-adls/pom.xml | 6 ++-- pinot-plugins/pinot-file-system/pinot-gcs/pom.xml | 7 ++-- pinot-plugins/pinot-file-system/pinot-hdfs/pom.xml | 6 ++-- pinot-plugins/pinot-file-system/pom.xml| 7 ++-- .../pinot-input-format/pinot-avro-base/pom.xml | 8 +++-- .../pinot-input-format/pinot-avro/pom.xml | 8 +++-- .../pinot-confluent-avro/pom.xml | 8 +++-- pinot-plugins/pinot-input-format/pinot-csv/pom.xml | 8 +++-- 
.../pinot-input-format/pinot-json/pom.xml | 8 +++-- pinot-plugins/pinot-input-format/pinot-orc/pom.xml | 9 +++-- .../pinot-input-format/pinot-parquet/pom.xml | 8 +++-- .../pinot-input-format/pinot-thrift/pom.xml| 8 +++-- pinot-plugins/pinot-input-format/pom.xml | 7 ++-- .../pinot-stream-ingestion/pinot-kafka-0.9/pom.xml | 10 -- .../pinot-stream-ingestion/pinot-kafka-2.0/pom.xml | 10 -- .../pinot-kafka-base/pom.xml | 10 -- pinot-plugins/pinot-stream-ingestion/pom.xml | 7 ++-- pinot-plugins/pom.xml | 9 +++-- pinot-server/pom.xml | 6 ++-- pinot-spi/pom.xml | 6 ++-- pinot-tools/pom.xml| 6 ++-- pom.xml| 20 ++-- 50 files changed, 279 insertions(+), 158 deletions(-) copy pinot-common/src/main/resources/log4j2-fatal-only.xml => .travis/.ci.settings.xml (65%) rename .travis_install.sh => .travis/.travis_install.sh (91%) copy .travis_quickstart_openjdk.sh => .travis/.travis_nightly_build.sh (64%) rename .travis_quickstart.sh => .travis/.travis_quickstart.sh (100%) rename .travis_quickstart_openjdk.sh => .travis/.travis_quickstart_openjdk.sh (95%) copy docker/images/pinot-thirdeye/docker-build-and-push.sh => .travis/.travis_set_deploy_build_opts.sh (79%) rename .travis_test.sh => .travis/.travis_test.sh (91%)
[GitHub] [incubator-pinot] siddharthteotia opened a new pull request #5199: Remove the construction of second bitmap in text index reader to improve performance
siddharthteotia opened a new pull request #5199: Remove the construction of second bitmap in text index reader to improve performance URL: https://github.com/apache/incubator-pinot/pull/5199

This is a follow-up to the optimization implemented in PR https://github.com/apache/incubator-pinot/pull/5177. Since we now have a pre-built mapping of luceneDocId to pinotDocId, we can build the result bitmap directly with pinotDocIds. This PR removes the construction of the second bitmap. Earlier we had to build the result in two phases: (1) run the search query to get the luceneDocIds in a bitmap, then (2) iterate over this bitmap and build a second bitmap with the corresponding pinotDocIds. Now the Lucene collector callback builds the final bitmap directly. This change, along with the previous PR, provides significant performance improvements.

Ran an increasing-QPS test on real data (a single segment with 10 million docs and a text index). QPS was increased from 1 to 40.

`REPORT FOR TARGET QPS: 1.0 Current Target QPS: 1.0, Time Passed: 30084ms, Queries Executed: 30, Average QPS: 0.997207818109294, Average Broker Time: 44.6ms, Average Client Time: 50.1ms, Queries Queued: 0.
REPORT FOR TARGET QPS: 3.0 Current Target QPS: 3.0, Time Passed: 30255ms, Queries Executed: 90, Average QPS: 2.974714923153198, Average Broker Time: 50.3ms, Average Client Time: 53.25ms, Queries Queued: 0.
REPORT FOR TARGET QPS: 5.0 Current Target QPS: 5.0, Time Passed: 30412ms, Queries Executed: 150, Average QPS: 4.932263580165724, Average Broker Time: 41.44ms, Average Client Time: 44.1ms, Queries Queued: 0.
REPORT FOR TARGET QPS: 7.0 Current Target QPS: 7.0, Time Passed: 30489ms, Queries Executed: 210, Average QPS: 6.88773983961, Average Broker Time: 41.82857142857143ms, Average Client Time: 44.00476190476191ms, Queries Queued: 0.
REPORT FOR TARGET QPS: 9.0 Current Target QPS: 9.0, Time Passed: 30868ms, Queries Executed: 270, Average QPS: 8.746922379162887, Average Broker Time: 43.385185185185186ms, Average Client Time: 45.27037037037037ms, Queries Queued: 0.
REPORT FOR TARGET QPS: 25.0 Current Target QPS: 25.0, Time Passed: 30233ms, Queries Executed: 694, Average QPS: 22.955049118512882, Average Broker Time: 37.27089337175793ms, Average Client Time: 38.53746397694525ms, Queries Queued: 0.
REPORT FOR TARGET QPS: 27.0 Current Target QPS: 27.0, Time Passed: 30254ms, Queries Executed: 740, Average QPS: 24.459575593309975, Average Broker Time: 39.71351351351351ms, Average Client Time: 40.87837837837838ms, Queries Queued: 0.
REPORT FOR TARGET QPS: 29.0 Current Target QPS: 29.0, Time Passed: 30147ms, Queries Executed: 798, Average QPS: 26.4702955517962, Average Broker Time: 37.06516290726817ms, Average Client Time: 38.22431077694235ms, Queries Queued: 0.
REPORT FOR TARGET QPS: 31.0 Current Target QPS: 31.0, Time Passed: 30160ms, Queries Executed: 843, Average QPS: 27.950928381962864, Average Broker Time: 37.79359430604982ms, Average Client Time: 38.86476868327402ms, Queries Queued: 0.
FINAL REPORT: Current Target QPS: 39.0, Time Passed: 27344ms, Queries Executed: 947, Average QPS: 34.632826214160325, Average Broker Time: 36.91024287222809ms, Average Client Time: 37.91129883843717ms. 10th percentile: 4.0 25th percentile: 26.0 50th percentile: 37.0 90th percentile: 68.0 95th percentile: 75.0 99th percentile: 129.0 99.9th percentile: 150.0`

The optimizations implemented in this and the previous PR are not directly applicable to realtime (which has the exact same performance overhead), since we can't have a pre-built mapping there. We mostly need to build a cache on the fly as queries are processed on the realtime Lucene index. A solution is in progress; will put up a PR soon.
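The difference between the two-phase and single-phase approaches described in this pull request can be illustrated with a minimal sketch. It is not the code from the PR: `java.util.BitSet` stands in for the Roaring bitmap, a plain `int[]` stands in for the pre-built luceneDocId to pinotDocId mapping, an `IntConsumer` stands in for the Lucene collector callback, and the class and method names are hypothetical.

```java
import java.util.BitSet;
import java.util.function.IntConsumer;

public class DirectCollectSketch {
  // Old shape: collect Lucene doc IDs into one bitmap, then translate
  // them into a second bitmap of Pinot doc IDs.
  public static BitSet twoPhase(int[] luceneHits, int[] luceneToPinot) {
    BitSet luceneDocIds = new BitSet();
    for (int hit : luceneHits) {
      luceneDocIds.set(hit);
    }
    BitSet pinotDocIds = new BitSet();
    for (int id = luceneDocIds.nextSetBit(0); id >= 0; id = luceneDocIds.nextSetBit(id + 1)) {
      pinotDocIds.set(luceneToPinot[id]);
    }
    return pinotDocIds;
  }

  // New shape: the callback invoked for each hit translates the ID
  // immediately, so only the final bitmap is ever built.
  public static BitSet singlePhase(int[] luceneHits, int[] luceneToPinot) {
    BitSet pinotDocIds = new BitSet();
    IntConsumer collector = luceneDocId -> pinotDocIds.set(luceneToPinot[luceneDocId]);
    for (int hit : luceneHits) {
      collector.accept(hit); // stand-in for the search invoking the collector
    }
    return pinotDocIds;
  }

  public static void main(String[] args) {
    int[] luceneToPinot = {3, 0, 2, 1}; // index = Lucene doc ID, value = Pinot doc ID
    int[] hits = {0, 2};
    // Both approaches yield the same set of Pinot doc IDs.
    System.out.println(twoPhase(hits, luceneToPinot).equals(singlePhase(hits, luceneToPinot)));
  }
}
```

Both methods return the same result; the single-phase version simply skips the intermediate bitmap and the second iteration.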
[GitHub] [incubator-pinot] jackjlli merged pull request #5190: Nightly publish to bintray
jackjlli merged pull request #5190: Nightly publish to bintray URL: https://github.com/apache/incubator-pinot/pull/5190
[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400574972 ## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java ## @@ -169,5 +178,57 @@ public void close() throws IOException { _indexReader.close(); _indexDirectory.close(); +_docIdReaderWriter.close(); + } + + private class DocIdReaderWriter implements Closeable { Review comment: done
[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400574761 ## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/loader/invertedindex/TextIndexHandler.java ## @@ -154,6 +152,11 @@ private void createTextIndexForColumn(ColumnMetadata columnMetadata) int numDocs = columnMetadata.getTotalDocs(); LOGGER.info("Creating new text index for column: {} in segment: {}", column, _segmentName); File segmentIndexDir = SegmentDirectoryPaths.segmentDirectoryFor(_indexDir, _segmentVersion); +// The handlers are always invoked by the preprocessor. Before this ImmutableSegmentLoader would have already Review comment: Done. Wasn't aware of this.
[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400574843 ## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java ## @@ -169,5 +178,57 @@ public void close() throws IOException { _indexReader.close(); _indexDirectory.close(); +_docIdReaderWriter.close(); + } + + private class DocIdReaderWriter implements Closeable { Review comment: done
[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400574849 ## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java ## @@ -169,5 +178,57 @@ public void close() throws IOException { _indexReader.close(); _indexDirectory.close(); +_docIdReaderWriter.close(); + } + + private class DocIdReaderWriter implements Closeable { +private PinotDataBuffer _buffer; +private final boolean _mappingExists; + +DocIdReaderWriter(File segmentIndexDir, String column, int numDocs) +throws Exception { + int length = Integer.BYTES * numDocs; + File docIdMappingFile = new File(SegmentDirectoryPaths.findSegmentDirectory(segmentIndexDir), + column + LUCENE_TEXT_INDEX_DOCID_MAPPING_FILE_EXTENSION); + // The mapping is local to a segment. It is created on the server during segment load. + // Unless we are running Pinot on Solaris/SPARC, the underlying architecture is + // LITTLE_ENDIAN (Linux/x86). So use that as byte order. 
+ if (docIdMappingFile.exists()) { +// we will be here for segment reload and server restart +// for refresh, we will not be here since segment is deleted/replaced +// TODO: see if we can prefetch the pages +_mappingExists = true; +_buffer = PinotDataBuffer.mapFile(docIdMappingFile, /* readOnly */ true, 0, length, ByteOrder.LITTLE_ENDIAN, +_column + getClass().getSimpleName()); + } else { +_mappingExists = false; +_buffer = PinotDataBuffer.mapFile(docIdMappingFile, /* readOnly */ false, 0, length, ByteOrder.LITTLE_ENDIAN, +_column + getClass().getSimpleName()); + } +} + +public void buildDocIdMapping(int numDocs) { + if (!_mappingExists) { +for (int i = 0; i < numDocs; i++) { + try { +Document document = _indexSearcher.doc(i); +int pinotDocId = Integer.valueOf(document.get(LuceneTextIndexCreator.LUCENE_INDEX_DOC_ID_COLUMN_NAME)); Review comment: done
[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400574865 ## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java ## @@ -169,5 +178,57 @@ public void close() throws IOException { _indexReader.close(); _indexDirectory.close(); +_docIdReaderWriter.close(); + } + + private class DocIdReaderWriter implements Closeable { +private PinotDataBuffer _buffer; +private final boolean _mappingExists; + +DocIdReaderWriter(File segmentIndexDir, String column, int numDocs) +throws Exception { + int length = Integer.BYTES * numDocs; + File docIdMappingFile = new File(SegmentDirectoryPaths.findSegmentDirectory(segmentIndexDir), + column + LUCENE_TEXT_INDEX_DOCID_MAPPING_FILE_EXTENSION); + // The mapping is local to a segment. It is created on the server during segment load. + // Unless we are running Pinot on Solaris/SPARC, the underlying architecture is + // LITTLE_ENDIAN (Linux/x86). So use that as byte order. + if (docIdMappingFile.exists()) { +// we will be here for segment reload and server restart +// for refresh, we will not be here since segment is deleted/replaced +// TODO: see if we can prefetch the pages +_mappingExists = true; +_buffer = PinotDataBuffer.mapFile(docIdMappingFile, /* readOnly */ true, 0, length, ByteOrder.LITTLE_ENDIAN, +_column + getClass().getSimpleName()); + } else { +_mappingExists = false; +_buffer = PinotDataBuffer.mapFile(docIdMappingFile, /* readOnly */ false, 0, length, ByteOrder.LITTLE_ENDIAN, +_column + getClass().getSimpleName()); + } +} + +public void buildDocIdMapping(int numDocs) { Review comment: done
[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400574877 ## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java ## @@ -60,7 +66,7 @@ * @param column column name * @param segmentIndexDir segment index directory */ - public LuceneTextIndexReader(String column, File segmentIndexDir) { + public LuceneTextIndexReader(String column, File segmentIndexDir, int numDocs) { Review comment: done
[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400574901 ## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java ## @@ -121,7 +131,7 @@ public MutableRoaringBitmap getDocIds(Object value) { try { Query query = _queryParser.parse(searchQuery); _indexSearcher.search(query, docIDCollector); - return getPinotDocIds(docIDs); + return getPinotDocIdsFromMappingFile(docIDs); Review comment: ohh yes, I was initially thinking of implementing two solutions. Reverted
[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400574916 ## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java ## @@ -121,7 +131,7 @@ public MutableRoaringBitmap getDocIds(Object value) { try { Query query = _queryParser.parse(searchQuery); _indexSearcher.search(query, docIDCollector); - return getPinotDocIds(docIDs); + return getPinotDocIdsFromMappingFile(docIDs); } catch (Exception e) { LOGGER.error("Failed while searching the text index for column {}, search query {}, exception {}", _column, Review comment: Removed the e.getMessage() from log. The log is still needed though since I want to capture column name and search expression for the failed query
[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400574934 ## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java ## @@ -142,18 +152,17 @@ public MutableRoaringBitmap getDocIds(Object value) { * * TODO: Explore optimizing this path to avoid building the second bitmap */ - private MutableRoaringBitmap getPinotDocIds(MutableRoaringBitmap luceneDocIds) { + private MutableRoaringBitmap getPinotDocIdsFromMappingFile(MutableRoaringBitmap luceneDocIds) { IntIterator luceneDocIDIterator = luceneDocIds.getIntIterator(); MutableRoaringBitmap actualDocIDs = new MutableRoaringBitmap(); try { while (luceneDocIDIterator.hasNext()) { int luceneDocId = luceneDocIDIterator.next(); -Document document = _indexSearcher.doc(luceneDocId); -int pinotDocId = Integer.valueOf(document.get(LuceneTextIndexCreator.LUCENE_INDEX_DOC_ID_COLUMN_NAME)); +int pinotDocId = _docIdReaderWriter.getInt(luceneDocId * Integer.BYTES); actualDocIDs.add(pinotDocId); } } catch (Exception e) { - throw new RuntimeException("Error: failed while retrieving document from index: " + e); + throw new RuntimeException("Error: failed while retrieving pinot doc id from mapping file: " + e); Review comment: done
[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400574955 ## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java ## @@ -169,5 +178,57 @@ public void close() throws IOException { _indexReader.close(); _indexDirectory.close(); +_docIdReaderWriter.close(); + } + + private class DocIdReaderWriter implements Closeable { +private PinotDataBuffer _buffer; +private final boolean _mappingExists; + +DocIdReaderWriter(File segmentIndexDir, String column, int numDocs) +throws Exception { + int length = Integer.BYTES * numDocs; + File docIdMappingFile = new File(SegmentDirectoryPaths.findSegmentDirectory(segmentIndexDir), + column + LUCENE_TEXT_INDEX_DOCID_MAPPING_FILE_EXTENSION); + // The mapping is local to a segment. It is created on the server during segment load. + // Unless we are running Pinot on Solaris/SPARC, the underlying architecture is + // LITTLE_ENDIAN (Linux/x86). So use that as byte order. 
+ if (docIdMappingFile.exists()) { +// we will be here for segment reload and server restart +// for refresh, we will not be here since segment is deleted/replaced +// TODO: see if we can prefetch the pages +_mappingExists = true; +_buffer = PinotDataBuffer.mapFile(docIdMappingFile, /* readOnly */ true, 0, length, ByteOrder.LITTLE_ENDIAN, +_column + getClass().getSimpleName()); + } else { +_mappingExists = false; +_buffer = PinotDataBuffer.mapFile(docIdMappingFile, /* readOnly */ false, 0, length, ByteOrder.LITTLE_ENDIAN, +_column + getClass().getSimpleName()); + } +} + +public void buildDocIdMapping(int numDocs) { + if (!_mappingExists) { +for (int i = 0; i < numDocs; i++) { + try { +Document document = _indexSearcher.doc(i); +int pinotDocId = Integer.valueOf(document.get(LuceneTextIndexCreator.LUCENE_INDEX_DOC_ID_COLUMN_NAME)); +_buffer.putInt(i * Integer.BYTES, pinotDocId); + } catch (Exception e) { +throw new RuntimeException("Failed to build doc id mapping during segment load: " + e); + } +} + } +} + +int getInt(int offset) { Review comment: done
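The mapping-file idea in the hunk above (a fixed-width, little-endian array of ints indexed by Lucene doc ID, built once and reused on reload or restart) can be sketched with plain `java.nio` instead of `PinotDataBuffer`. This is an illustration under assumptions, not the PR's implementation: the class name, method names, and file name are hypothetical, and the Pinot doc IDs are passed in as an `int[]` rather than read from stored Lucene documents.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class DocIdMappingSketch {
  // Opens the mapping file read-only if it already exists (reload/restart);
  // otherwise creates it and writes one int per Lucene doc ID.
  public static MappedByteBuffer openMapping(Path file, int[] luceneToPinot) throws IOException {
    long length = (long) Integer.BYTES * luceneToPinot.length;
    boolean exists = Files.exists(file);
    try (FileChannel channel = exists
        ? FileChannel.open(file, StandardOpenOption.READ)
        : FileChannel.open(file, StandardOpenOption.CREATE_NEW,
            StandardOpenOption.READ, StandardOpenOption.WRITE)) {
      MappedByteBuffer buffer = channel.map(
          exists ? FileChannel.MapMode.READ_ONLY : FileChannel.MapMode.READ_WRITE, 0, length);
      buffer.order(ByteOrder.LITTLE_ENDIAN); // fixed byte order, as in the hunk above
      if (!exists) {
        for (int i = 0; i < luceneToPinot.length; i++) {
          buffer.putInt(i * Integer.BYTES, luceneToPinot[i]);
        }
        buffer.force(); // flush the freshly built mapping to disk
      }
      return buffer; // the mapping stays valid after the channel is closed
    }
  }

  // Lookup is a single absolute read at luceneDocId * 4 bytes.
  public static int pinotDocId(ByteBuffer mapping, int luceneDocId) {
    return mapping.getInt(luceneDocId * Integer.BYTES);
  }

  public static void main(String[] args) throws IOException {
    Path file = Files.createTempDirectory("docid-mapping").resolve("column.mapping");
    MappedByteBuffer built = openMapping(file, new int[] {3, 0, 2, 1});
    MappedByteBuffer reopened = openMapping(file, new int[] {3, 0, 2, 1});
    System.out.println(pinotDocId(built, 0) + " " + pinotDocId(reopened, 0));
  }
}
```

The second call to `openMapping` takes the "file exists" branch and maps the file read-only without rebuilding it, which mirrors the reload/restart path noted in the code comments of the hunk.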
[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400574944 ## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java ## @@ -169,5 +178,57 @@ public void close() throws IOException { _indexReader.close(); _indexDirectory.close(); +_docIdReaderWriter.close(); + } + + private class DocIdReaderWriter implements Closeable { +private PinotDataBuffer _buffer; +private final boolean _mappingExists; + +DocIdReaderWriter(File segmentIndexDir, String column, int numDocs) +throws Exception { + int length = Integer.BYTES * numDocs; + File docIdMappingFile = new File(SegmentDirectoryPaths.findSegmentDirectory(segmentIndexDir), + column + LUCENE_TEXT_INDEX_DOCID_MAPPING_FILE_EXTENSION); + // The mapping is local to a segment. It is created on the server during segment load. + // Unless we are running Pinot on Solaris/SPARC, the underlying architecture is + // LITTLE_ENDIAN (Linux/x86). So use that as byte order. + if (docIdMappingFile.exists()) { +// we will be here for segment reload and server restart +// for refresh, we will not be here since segment is deleted/replaced +// TODO: see if we can prefetch the pages +_mappingExists = true; +_buffer = PinotDataBuffer.mapFile(docIdMappingFile, /* readOnly */ true, 0, length, ByteOrder.LITTLE_ENDIAN, +_column + getClass().getSimpleName()); Review comment: done
[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400574735 ## File path: pinot-core/src/test/java/org/apache/pinot/queries/TextSearchQueriesTest.java ## @@ -18,6 +18,7 @@ */ package org.apache.pinot.queries; +import com.google.common.base.Stopwatch; Review comment: Done
[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400574832 ## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java ## @@ -169,5 +178,57 @@ public void close() throws IOException { _indexReader.close(); _indexDirectory.close(); +_docIdReaderWriter.close(); + } + + private class DocIdReaderWriter implements Closeable { +private PinotDataBuffer _buffer; Review comment: done
[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400574811 ## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/loader/invertedindex/TextIndexHandler.java ## @@ -61,8 +61,26 @@ import static org.apache.pinot.core.segment.creator.impl.V1Constants.MetadataKeys.Column.getKeyFor; +/** + * Helper class for text indexes used by {@link org.apache.pinot.core.segment.index.loader.SegmentPreProcessor}. + * to create text index for column during segment load time. Currently text index is always + * created (if enabled on a column) during segment generation + * + * (1) A new segment with text index is created/refreshed. Server loads the segment. The handler + * detects the existence of text index and returns. + * + * (2) A reload is issued on an existing segment with existing text index. The handler + * detects the existence of text index and returns. + * + * (3) A reload is issued on an existing segment after text index is enabled on an existing + * column. Read the forward index to create text index. + * + * (4) A reload is issued on an existing segment after text index is enabled on a newly + * added column. In this case, the default column handler would have taken care of adding + * forward index for the new column. Read the forward index to create text index. + */ public class TextIndexHandler { Review comment: done
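The four scenarios listed in the Javadoc above reduce to one check at segment load: if the text index already exists there is nothing to do, otherwise it is built from the forward index. A minimal sketch of that decision follows; the class, enum, and method names are hypothetical and do not come from the Pinot code base.

```java
public class TextIndexHandlerSketch {
  public enum Action { NOTHING_TO_DO, BUILD_FROM_FORWARD_INDEX }

  // Scenarios (1) and (2): the index exists, so the handler returns.
  // Scenarios (3) and (4): the index is enabled but missing, so it is
  // created by reading the column's forward index.
  public static Action onSegmentLoad(boolean textIndexEnabled, boolean textIndexExists) {
    if (!textIndexEnabled || textIndexExists) {
      return Action.NOTHING_TO_DO;
    }
    return Action.BUILD_FROM_FORWARD_INDEX;
  }

  public static void main(String[] args) {
    System.out.println(onSegmentLoad(true, true));  // scenarios (1) and (2)
    System.out.println(onSegmentLoad(true, false)); // scenarios (3) and (4)
  }
}
```

Scenario (4) differs from (3) only in that, per the Javadoc, the forward index for the new column has already been added by the default column handler before this check runs.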
[GitHub] [incubator-pinot] jackjlli commented on a change in pull request #5190: Nightly publish to bintray
jackjlli commented on a change in pull request #5190: Nightly publish to bintray URL: https://github.com/apache/incubator-pinot/pull/5190#discussion_r400659602 ## File path: pom.xml ## @@ -96,8 +97,18 @@ 2018 + + + bintray-linkedin-maven + linkedin-maven + https://api.bintray.com/maven/linkedin/maven/pinot/;publish=1 + + + ${basedir} +0.4.0 Review comment: Since the version now consists of revision and sha1, you can set the version number and then publish the release. Instructions can be found at the link below: https://techluminary.com/discard-maven-release-plugin-with-a-new-approach/
[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400578038 ## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java ## @@ -169,5 +178,57 @@ public void close() throws IOException { _indexReader.close(); _indexDirectory.close(); +_docIdReaderWriter.close(); + } + + private class DocIdReaderWriter implements Closeable { +private PinotDataBuffer _buffer; +private final boolean _mappingExists; + +DocIdReaderWriter(File segmentIndexDir, String column, int numDocs) +throws Exception { + int length = Integer.BYTES * numDocs; + File docIdMappingFile = new File(SegmentDirectoryPaths.findSegmentDirectory(segmentIndexDir), + column + LUCENE_TEXT_INDEX_DOCID_MAPPING_FILE_EXTENSION); + // The mapping is local to a segment. It is created on the server during segment load. + // Unless we are running Pinot on Solaris/SPARC, the underlying architecture is + // LITTLE_ENDIAN (Linux/x86). So use that as byte order. 
+ if (docIdMappingFile.exists()) { +// we will be here for segment reload and server restart +// for refresh, we will not be here since segment is deleted/replaced +// TODO: see if we can prefetch the pages +_mappingExists = true; +_buffer = PinotDataBuffer.mapFile(docIdMappingFile, /* readOnly */ true, 0, length, ByteOrder.LITTLE_ENDIAN, +_column + getClass().getSimpleName()); + } else { +_mappingExists = false; +_buffer = PinotDataBuffer.mapFile(docIdMappingFile, /* readOnly */ false, 0, length, ByteOrder.LITTLE_ENDIAN, +_column + getClass().getSimpleName()); + } +} + +public void buildDocIdMapping(int numDocs) { + if (!_mappingExists) { +for (int i = 0; i < numDocs; i++) { + try { +Document document = _indexSearcher.doc(i); +int pinotDocId = Integer.valueOf(document.get(LuceneTextIndexCreator.LUCENE_INDEX_DOC_ID_COLUMN_NAME)); +_buffer.putInt(i * Integer.BYTES, pinotDocId); + } catch (Exception e) { +throw new RuntimeException("Failed to build doc id mapping during segment load: " + e); Review comment: done
[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400578010 ## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java ## @@ -169,5 +178,57 @@ public void close() throws IOException { _indexReader.close(); _indexDirectory.close(); +_docIdReaderWriter.close(); + } + + private class DocIdReaderWriter implements Closeable { +private PinotDataBuffer _buffer; +private final boolean _mappingExists; + +DocIdReaderWriter(File segmentIndexDir, String column, int numDocs) +throws Exception { + int length = Integer.BYTES * numDocs; + File docIdMappingFile = new File(SegmentDirectoryPaths.findSegmentDirectory(segmentIndexDir), + column + LUCENE_TEXT_INDEX_DOCID_MAPPING_FILE_EXTENSION); + // The mapping is local to a segment. It is created on the server during segment load. + // Unless we are running Pinot on Solaris/SPARC, the underlying architecture is + // LITTLE_ENDIAN (Linux/x86). So use that as byte order. + if (docIdMappingFile.exists()) { +// we will be here for segment reload and server restart +// for refresh, we will not be here since segment is deleted/replaced +// TODO: see if we can prefetch the pages +_mappingExists = true; +_buffer = PinotDataBuffer.mapFile(docIdMappingFile, /* readOnly */ true, 0, length, ByteOrder.LITTLE_ENDIAN, Review comment: I explored the existing readers/writers but did not use them, for the following reasons: (1) FixedByteChunkReaderWriter uses direct memory (ByteBuffer.allocateDirect); memory mapping is better here. (2) FixedBitIntReaderWriter (used for the dictionary-encoded forward index) has additional bells and whistles due to bit-packing. The return type is an array. I wanted to keep this simple since we use it in exactly one place in the Lucene code.
[incubator-pinot] branch master updated: Lucene DocId to PinotDocId cache to improve performance (#5177)
This is an automated email from the ASF dual-hosted git repository. siddteotia pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git The following commit(s) were added to refs/heads/master by this push: new 8dfa51a Lucene DocId to PinotDocId cache to improve performance (#5177) 8dfa51a is described below commit 8dfa51af244f911d52824f94e95f983ffc50b5fb Author: Sidd AuthorDate: Mon Mar 30 21:47:22 2020 -0700 Lucene DocId to PinotDocId cache to improve performance (#5177) Co-authored-by: Siddharth Teotia --- .../index/column/PhysicalColumnIndexContainer.java | 2 +- .../converter/SegmentV1V2ToV3FormatConverter.java | 18 + .../loader/invertedindex/TextIndexHandler.java | 59 .../index/readers/text/LuceneTextIndexReader.java | 79 ++ .../core/segment/store/SegmentDirectoryPaths.java | 9 +++ .../core/segment/index/loader/LoaderTest.java | 57 ++-- ...archQueries.java => TextSearchQueriesTest.java} | 3 +- 7 files changed, 177 insertions(+), 50 deletions(-) diff --git a/pinot-core/src/main/java/org/apache/pinot/core/segment/index/column/PhysicalColumnIndexContainer.java b/pinot-core/src/main/java/org/apache/pinot/core/segment/index/column/PhysicalColumnIndexContainer.java index 33ba360..76ff19e 100644 --- a/pinot-core/src/main/java/org/apache/pinot/core/segment/index/column/PhysicalColumnIndexContainer.java +++ b/pinot-core/src/main/java/org/apache/pinot/core/segment/index/column/PhysicalColumnIndexContainer.java @@ -132,7 +132,7 @@ public final class PhysicalColumnIndexContainer implements ColumnIndexContainer _dictionary = null; _bloomFilterReader = null; if (loadTextIndex) { -_invertedIndex = new LuceneTextIndexReader(columnName, segmentIndexDir); +_invertedIndex = new LuceneTextIndexReader(columnName, segmentIndexDir, metadata.getTotalDocs()); } else { _invertedIndex = null; } diff --git a/pinot-core/src/main/java/org/apache/pinot/core/segment/index/converter/SegmentV1V2ToV3FormatConverter.java 
b/pinot-core/src/main/java/org/apache/pinot/core/segment/index/converter/SegmentV1V2ToV3FormatConverter.java index 71c56ae..f534fe9 100644 --- a/pinot-core/src/main/java/org/apache/pinot/core/segment/index/converter/SegmentV1V2ToV3FormatConverter.java +++ b/pinot-core/src/main/java/org/apache/pinot/core/segment/index/converter/SegmentV1V2ToV3FormatConverter.java @@ -35,6 +35,7 @@ import org.apache.pinot.core.indexsegment.generator.SegmentVersion; import org.apache.pinot.core.segment.creator.impl.V1Constants; import org.apache.pinot.core.segment.creator.impl.inv.text.LuceneTextIndexCreator; import org.apache.pinot.core.segment.index.metadata.SegmentMetadataImpl; +import org.apache.pinot.core.segment.index.readers.text.LuceneTextIndexReader; import org.apache.pinot.core.segment.memory.PinotDataBuffer; import org.apache.pinot.core.segment.store.ColumnIndexType; import org.apache.pinot.core.segment.store.SegmentDirectory; @@ -225,6 +226,7 @@ public class SegmentV1V2ToV3FormatConverter implements SegmentFormatConverter { private void copyLuceneTextIndexIfExists(File segmentDirectory, File v3Dir) throws IOException { +// TODO: see if this can be done by reusing some existing methods String suffix = LuceneTextIndexCreator.LUCENE_TEXT_INDEX_FILE_EXTENSION; File[] textIndexFiles = segmentDirectory.listFiles(new FilenameFilter() { @Override @@ -241,6 +243,22 @@ public class SegmentV1V2ToV3FormatConverter implements SegmentFormatConverter { Files.copy(indexFile.toPath(), v3LuceneIndexFile.toPath()); } } +// if segment reload is issued asking for up-conversion of +// on-disk segment format from v1/v2 to v3, then in addition +// to moving the lucene text index files, we need to move the +// docID mapping/cache file created by us in v1/v2 during an earlier +// load of the segment. 
+String docIDFileSuffix = LuceneTextIndexReader.LUCENE_TEXT_INDEX_DOCID_MAPPING_FILE_EXTENSION; +File[] textIndexDocIdMappingFiles = segmentDirectory.listFiles(new FilenameFilter() { + @Override + public boolean accept(File dir, String name) { +return name.endsWith(docIDFileSuffix); + } +}); +for (File docIdMappingFile : textIndexDocIdMappingFiles) { + File v3DocIdMappingFile = new File(v3Dir, docIdMappingFile.getName()); + Files.copy(docIdMappingFile.toPath(), v3DocIdMappingFile.toPath()); +} } private void deleteStaleConversionDirectories(File segmentDirectory) { diff --git a/pinot-core/src/main/java/org/apache/pinot/core/segment/index/loader/invertedindex/TextIndexHandler.java b/pinot-core/src/main/java/org/apache/pinot/core/segment/index/loader/invertedindex/TextIndexHandler.java index a501596..1c1786f 100644 ---
[GitHub] [incubator-pinot] siddharthteotia merged pull request #5177: Lucene DocId to PinotDocId cache
siddharthteotia merged pull request #5177: Lucene DocId to PinotDocId cache URL: https://github.com/apache/incubator-pinot/pull/5177
[GitHub] [incubator-pinot] fx19880617 commented on a change in pull request #5190: Nightly publish to bintray
fx19880617 commented on a change in pull request #5190: Nightly publish to bintray URL: https://github.com/apache/incubator-pinot/pull/5190#discussion_r400649681 ## File path: pom.xml ## @@ -96,8 +97,18 @@ 2018 + + + bintray-linkedin-maven + linkedin-maven + https://api.bintray.com/maven/linkedin/maven/pinot/;publish=1 + + + ${basedir} +0.4.0 Review comment: In this case, how will the `mvn release` command behave? How is the version updated there?
[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400576876 ## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/converter/SegmentV1V2ToV3FormatConverter.java ## @@ -241,6 +242,22 @@ public boolean accept(File dir, String name) { Files.copy(indexFile.toPath(), v3LuceneIndexFile.toPath()); } } +// if segment reload is issued asking for up-conversion of Review comment: Lucene index files (multiple files in a single directory created by Lucene) can't be copied the way copyForwardIndex does it: `PinotDataBuffer oldBuffer = reader.getIndexFor(column, indexType); long oldBufferSize = oldBuffer.size(); PinotDataBuffer newBuffer = writer.newIndexFor(column, indexType, oldBufferSize); oldBuffer.copyTo(0, newBuffer, 0, oldBufferSize);` The only thing that can be copied as a buffer is the mapping file that we create. I just wanted to keep everything related to Lucene in a single method. I can explore how to clean this up.
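The suffix-based copy of the docID mapping file that the comment above refers to can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in the pull request: the class name `SuffixFileCopier`, the method `copyFilesWithSuffix`, and whatever suffix the caller passes are hypothetical, and only `java.io`/`java.nio.file` calls are used rather than any Pinot API.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

/** Hypothetical helper; it only mirrors the list-then-copy pattern shown in the diff. */
final class SuffixFileCopier {
  private SuffixFileCopier() {
  }

  /**
   * Copies every regular file in srcDir whose name ends with the given suffix
   * into destDir, keeping the file name. Returns the number of files copied.
   */
  static int copyFilesWithSuffix(File srcDir, File destDir, String suffix) throws IOException {
    // listFiles returns null if srcDir is not a directory or cannot be read
    File[] matches = srcDir.listFiles((dir, name) -> name.endsWith(suffix));
    if (matches == null) {
      return 0;
    }
    int copied = 0;
    for (File src : matches) {
      if (src.isFile()) {
        // Files.copy fails if the target already exists, which surfaces stale state early
        Files.copy(src.toPath(), new File(destDir, src.getName()).toPath());
        copied++;
      }
    }
    return copied;
  }
}
```

A single flat file such as the mapping file fits this pattern directly, whereas a Lucene index directory would need a recursive copy of its contents.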
[GitHub] [incubator-pinot] harleyjj opened a new pull request #5198: [TE] frontend - harleyjj/home - use duration param to set date range
harleyjj opened a new pull request #5198: [TE] frontend - harleyjj/home - use duration param to set date range URL: https://github.com/apache/incubator-pinot/pull/5198 1) If the duration param is present, it will override the startDate and endDate params on the home page and share dashboard 2) Fixes a bug where the share dashboard depends on config from the backend (not always present)
[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400565538 ## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java ## @@ -169,5 +178,57 @@ public void close() throws IOException { _indexReader.close(); _indexDirectory.close(); +_docIdReaderWriter.close(); + } + + private class DocIdReaderWriter implements Closeable { +private PinotDataBuffer _buffer; +private final boolean _mappingExists; + +DocIdReaderWriter(File segmentIndexDir, String column, int numDocs) +throws Exception { + int length = Integer.BYTES * numDocs; + File docIdMappingFile = new File(SegmentDirectoryPaths.findSegmentDirectory(segmentIndexDir), + column + LUCENE_TEXT_INDEX_DOCID_MAPPING_FILE_EXTENSION); + // The mapping is local to a segment. It is created on the server during segment load. + // Unless we are running Pinot on Solaris/SPARC, the underlying architecture is + // LITTLE_ENDIAN (Linux/x86). So use that as byte order. + if (docIdMappingFile.exists()) { +// we will be here for segment reload and server restart +// for refresh, we will not be here since segment is deleted/replaced +// TODO: see if we can prefetch the pages +_mappingExists = true; +_buffer = PinotDataBuffer.mapFile(docIdMappingFile, /* readOnly */ true, 0, length, ByteOrder.LITTLE_ENDIAN, +_column + getClass().getSimpleName()); + } else { +_mappingExists = false; +_buffer = PinotDataBuffer.mapFile(docIdMappingFile, /* readOnly */ false, 0, length, ByteOrder.LITTLE_ENDIAN, +_column + getClass().getSimpleName()); + } +} + +public void buildDocIdMapping(int numDocs) { Review comment: Merge this into the constructor, no need to track an extra boolean.
[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400564728 ## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java ## @@ -169,5 +178,57 @@ public void close() throws IOException { _indexReader.close(); _indexDirectory.close(); +_docIdReaderWriter.close(); + } + + private class DocIdReaderWriter implements Closeable { Review comment: `private static class`
[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400565872 ## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java ## @@ -60,7 +66,7 @@ * @param column column name * @param segmentIndexDir segment index directory */ - public LuceneTextIndexReader(String column, File segmentIndexDir) { + public LuceneTextIndexReader(String column, File segmentIndexDir, int numDocs) { Review comment: Recommend naming the second argument `indexDir` to denote that it is the top-level dir
[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400559121 ## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/loader/invertedindex/TextIndexHandler.java ## @@ -61,8 +61,26 @@ import static org.apache.pinot.core.segment.creator.impl.V1Constants.MetadataKeys.Column.getKeyFor; +/** + * Helper class for text indexes used by {@link org.apache.pinot.core.segment.index.loader.SegmentPreProcessor}. + * to create text index for column during segment load time. Currently text index is always + * created (if enabled on a column) during segment generation + * + * (1) A new segment with text index is created/refreshed. Server loads the segment. The handler + * detects the existence of text index and returns. + * + * (2) A reload is issued on an existing segment with existing text index. The handler + * detects the existence of text index and returns. + * + * (3) A reload is issued on an existing segment after text index is enabled on an existing + * column. Read the forward index to create text index. + * + * (4) A reload is issued on an existing segment after text index is enabled on a newly + * added column. In this case, the default column handler would have taken care of adding + * forward index for the new column. Read the forward index to create text index. + */ public class TextIndexHandler { Review comment: Not related to this PR, but it seems this file needs reformatting
[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400564673 ## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java ## @@ -169,5 +178,57 @@ public void close() throws IOException { _indexReader.close(); _indexDirectory.close(); +_docIdReaderWriter.close(); + } + + private class DocIdReaderWriter implements Closeable { +private PinotDataBuffer _buffer; Review comment: `final PinotDataBuffer _buffer` (class itself is private)
[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400557528 ## File path: pinot-core/src/test/java/org/apache/pinot/queries/TextSearchQueriesTest.java ## @@ -18,6 +18,7 @@ */ package org.apache.pinot.queries; +import com.google.common.base.Stopwatch; Review comment: Remove?
[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400568072 ## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java ## @@ -169,5 +178,57 @@ public void close() throws IOException { _indexReader.close(); _indexDirectory.close(); +_docIdReaderWriter.close(); + } + + private class DocIdReaderWriter implements Closeable { +private PinotDataBuffer _buffer; +private final boolean _mappingExists; + +DocIdReaderWriter(File segmentIndexDir, String column, int numDocs) +throws Exception { + int length = Integer.BYTES * numDocs; + File docIdMappingFile = new File(SegmentDirectoryPaths.findSegmentDirectory(segmentIndexDir), + column + LUCENE_TEXT_INDEX_DOCID_MAPPING_FILE_EXTENSION); + // The mapping is local to a segment. It is created on the server during segment load. + // Unless we are running Pinot on Solaris/SPARC, the underlying architecture is + // LITTLE_ENDIAN (Linux/x86). So use that as byte order. + if (docIdMappingFile.exists()) { +// we will be here for segment reload and server restart +// for refresh, we will not be here since segment is deleted/replaced +// TODO: see if we can prefetch the pages +_mappingExists = true; +_buffer = PinotDataBuffer.mapFile(docIdMappingFile, /* readOnly */ true, 0, length, ByteOrder.LITTLE_ENDIAN, +_column + getClass().getSimpleName()); Review comment: `"Text index docId mapping buffer: " + _column`
[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400566118 ## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java ## @@ -121,7 +131,7 @@ public MutableRoaringBitmap getDocIds(Object value) { try { Query query = _queryParser.parse(searchQuery); _indexSearcher.search(query, docIDCollector); - return getPinotDocIds(docIDs); + return getPinotDocIdsFromMappingFile(docIDs); Review comment: I prefer the original name
[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400568967 ## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java ## @@ -169,5 +178,57 @@ public void close() throws IOException { _indexReader.close(); _indexDirectory.close(); +_docIdReaderWriter.close(); + } + + private class DocIdReaderWriter implements Closeable { Review comment: I would recommend renaming it to `DocIdTranslator`
[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400566758 ## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java ## @@ -142,18 +152,17 @@ public MutableRoaringBitmap getDocIds(Object value) { * * TODO: Explore optimizing this path to avoid building the second bitmap */ - private MutableRoaringBitmap getPinotDocIds(MutableRoaringBitmap luceneDocIds) { + private MutableRoaringBitmap getPinotDocIdsFromMappingFile(MutableRoaringBitmap luceneDocIds) { IntIterator luceneDocIDIterator = luceneDocIds.getIntIterator(); MutableRoaringBitmap actualDocIDs = new MutableRoaringBitmap(); try { while (luceneDocIDIterator.hasNext()) { int luceneDocId = luceneDocIDIterator.next(); -Document document = _indexSearcher.doc(luceneDocId); -int pinotDocId = Integer.valueOf(document.get(LuceneTextIndexCreator.LUCENE_INDEX_DOC_ID_COLUMN_NAME)); +int pinotDocId = _docIdReaderWriter.getInt(luceneDocId * Integer.BYTES); actualDocIDs.add(pinotDocId); } } catch (Exception e) { - throw new RuntimeException("Error: failed while retrieving document from index: " + e); + throw new RuntimeException("Error: failed while retrieving pinot doc id from mapping file: " + e); Review comment: No need to catch, you can directly throw the exception
[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400568405

## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java ##
@@ -169,5 +178,57 @@ public void close()
       throws IOException {
     _indexReader.close();
     _indexDirectory.close();
+    _docIdReaderWriter.close();
+  }
+
+  private class DocIdReaderWriter implements Closeable {
+    private PinotDataBuffer _buffer;
+    private final boolean _mappingExists;
+
+    DocIdReaderWriter(File segmentIndexDir, String column, int numDocs)
+        throws Exception {
+      int length = Integer.BYTES * numDocs;
+      File docIdMappingFile = new File(SegmentDirectoryPaths.findSegmentDirectory(segmentIndexDir),
+          column + LUCENE_TEXT_INDEX_DOCID_MAPPING_FILE_EXTENSION);
+      // The mapping is local to a segment. It is created on the server during segment load.
+      // Unless we are running Pinot on Solaris/SPARC, the underlying architecture is
+      // LITTLE_ENDIAN (Linux/x86). So use that as byte order.
+      if (docIdMappingFile.exists()) {
+        // we will be here for segment reload and server restart
+        // for refresh, we will not be here since segment is deleted/replaced
+        // TODO: see if we can prefetch the pages
+        _mappingExists = true;
+        _buffer = PinotDataBuffer.mapFile(docIdMappingFile, /* readOnly */ true, 0, length, ByteOrder.LITTLE_ENDIAN,
+            _column + getClass().getSimpleName());
+      } else {
+        _mappingExists = false;
+        _buffer = PinotDataBuffer.mapFile(docIdMappingFile, /* readOnly */ false, 0, length, ByteOrder.LITTLE_ENDIAN,
+            _column + getClass().getSimpleName());
+      }
+    }
+
+    public void buildDocIdMapping(int numDocs) {
+      if (!_mappingExists) {
+        for (int i = 0; i < numDocs; i++) {
+          try {
+            Document document = _indexSearcher.doc(i);
+            int pinotDocId = Integer.valueOf(document.get(LuceneTextIndexCreator.LUCENE_INDEX_DOC_ID_COLUMN_NAME));
+            _buffer.putInt(i * Integer.BYTES, pinotDocId);
+          } catch (Exception e) {
+            throw new RuntimeException("Failed to build doc id mapping during segment load: " + e);
+          }
+        }
+      }
+    }
+
+    int getInt(int offset) {

Review comment:
`int getPinotDocId(int luceneDocId)` for better abstraction
[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400566628

## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java ##
@@ -121,7 +131,7 @@ public MutableRoaringBitmap getDocIds(Object value) {
     try {
       Query query = _queryParser.parse(searchQuery);
       _indexSearcher.search(query, docIDCollector);
-      return getPinotDocIds(docIDs);
+      return getPinotDocIdsFromMappingFile(docIDs);
     } catch (Exception e) {
       LOGGER.error("Failed while searching the text index for column {}, search query {}, exception {}", _column,

Review comment:
You don't really need to log the error if you are going to throw out the exception. The catcher will log it with the stack trace
[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400565038

## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java ##
@@ -169,5 +178,57 @@ public void close()
       throws IOException {
     _indexReader.close();
     _indexDirectory.close();
+    _docIdReaderWriter.close();
+  }
+
+  private class DocIdReaderWriter implements Closeable {
+    private PinotDataBuffer _buffer;
+    private final boolean _mappingExists;
+
+    DocIdReaderWriter(File segmentIndexDir, String column, int numDocs)
+        throws Exception {
+      int length = Integer.BYTES * numDocs;
+      File docIdMappingFile = new File(SegmentDirectoryPaths.findSegmentDirectory(segmentIndexDir),
+          column + LUCENE_TEXT_INDEX_DOCID_MAPPING_FILE_EXTENSION);
+      // The mapping is local to a segment. It is created on the server during segment load.
+      // Unless we are running Pinot on Solaris/SPARC, the underlying architecture is
+      // LITTLE_ENDIAN (Linux/x86). So use that as byte order.
+      if (docIdMappingFile.exists()) {
+        // we will be here for segment reload and server restart
+        // for refresh, we will not be here since segment is deleted/replaced
+        // TODO: see if we can prefetch the pages
+        _mappingExists = true;
+        _buffer = PinotDataBuffer.mapFile(docIdMappingFile, /* readOnly */ true, 0, length, ByteOrder.LITTLE_ENDIAN,
+            _column + getClass().getSimpleName());
+      } else {
+        _mappingExists = false;
+        _buffer = PinotDataBuffer.mapFile(docIdMappingFile, /* readOnly */ false, 0, length, ByteOrder.LITTLE_ENDIAN,
+            _column + getClass().getSimpleName());
+      }
+    }
+
+    public void buildDocIdMapping(int numDocs) {
+      if (!_mappingExists) {
+        for (int i = 0; i < numDocs; i++) {
+          try {
+            Document document = _indexSearcher.doc(i);
+            int pinotDocId = Integer.valueOf(document.get(LuceneTextIndexCreator.LUCENE_INDEX_DOC_ID_COLUMN_NAME));

Review comment:
`Integer.parseInt()`
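For context on the `Integer.parseInt()` suggestion above: `Integer.valueOf(String)` returns a boxed `Integer` that is auto-unboxed when assigned to an `int`, whereas `Integer.parseInt(String)` returns the primitive directly. The following is only a minimal sketch of that difference; the class and method names are illustrative and do not appear in the PR.

```java
final class ParseIntSketch {
  // Returns a primitive int directly; no Integer object is involved.
  static int viaParseInt(String storedDocId) {
    return Integer.parseInt(storedDocId);
  }

  // Same numeric result, but goes through a boxed Integer that is unboxed on return.
  static int viaValueOf(String storedDocId) {
    return Integer.valueOf(storedDocId);
  }

  public static void main(String[] args) {
    // Both paths agree on the value; only the intermediate type differs.
    System.out.println(viaParseInt("42") == viaValueOf("42"));
  }
}
```

Since the mapping is built in a loop over every document in the segment, skipping the intermediate object is the cheaper choice, although the JVM may well optimize the boxing away in practice.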
[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400560559

## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/converter/SegmentV1V2ToV3FormatConverter.java ##
@@ -241,6 +242,22 @@ public boolean accept(File dir, String name) {
         Files.copy(indexFile.toPath(), v3LuceneIndexFile.toPath());
       }
     }
+    // if segment reload is issued asking for up-conversion of

Review comment:
Not related to this pr, but can you make this method similar to `copyForwardIndex()` and call it in `copyIndexData()` based on the metadata?
[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400558663

## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/loader/invertedindex/TextIndexHandler.java ##
@@ -154,6 +152,11 @@ private void createTextIndexForColumn(ColumnMetadata columnMetadata)
     int numDocs = columnMetadata.getTotalDocs();
     LOGGER.info("Creating new text index for column: {} in segment: {}", column, _segmentName);
     File segmentIndexDir = SegmentDirectoryPaths.segmentDirectoryFor(_indexDir, _segmentVersion);
+    // The handlers are always invoked by the preprocessor. Before this ImmutableSegmentLoader would have already

Review comment:
By convention, we use `indexDir` for top level directory, and `segmentDirectory` for the direct directory (`indexDir` for `v1`, or `v3` for `v3`). If you follow the same naming convention, it will be clearer.
[GitHub] [incubator-pinot] siddharthteotia commented on issue #5177: Lucene DocId to PinotDocId cache
siddharthteotia commented on issue #5177: Lucene DocId to PinotDocId cache
URL: https://github.com/apache/incubator-pinot/pull/5177#issuecomment-606329890

@Jackie-Jiang, @kishoreg Addressed the review comments.
[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400578880

## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/converter/SegmentV1V2ToV3FormatConverter.java ##
@@ -241,6 +242,22 @@ public boolean accept(File dir, String name) {
         Files.copy(indexFile.toPath(), v3LuceneIndexFile.toPath());
       }
     }
+    // if segment reload is issued asking for up-conversion of

Review comment:
Added a TODO. I have a couple more PRs coming up shortly for text and will address this in one of them.
[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400588813

## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java ##
@@ -123,8 +133,7 @@ public MutableRoaringBitmap getDocIds(Object value) {
       _indexSearcher.search(query, docIDCollector);
       return getPinotDocIds(docIDs);
     } catch (Exception e) {
-      LOGGER.error("Failed while searching the text index for column {}, search query {}, exception {}", _column,
-          searchQuery, e.getMessage());
+      LOGGER.error("Failed while searching the text index for column {}, search query {},", _column, searchQuery);

Review comment:
used in message of RuntimeException
[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400588799

## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java ##
@@ -169,5 +173,50 @@ public void close()
       throws IOException {
     _indexReader.close();
     _indexDirectory.close();
+    _docIdTranslator.close();
+  }
+
+  private static class DocIdTranslator implements Closeable {
+    final PinotDataBuffer _buffer;
+
+    DocIdTranslator(File segmentIndexDir, String column, int numDocs, IndexSearcher indexSearcher)
+        throws Exception {
+      int length = Integer.BYTES * numDocs;
+      File docIdMappingFile = new File(SegmentDirectoryPaths.findSegmentDirectory(segmentIndexDir),
+          column + LUCENE_TEXT_INDEX_DOCID_MAPPING_FILE_EXTENSION);
+      // The mapping is local to a segment. It is created on the server during segment load.
+      // Unless we are running Pinot on Solaris/SPARC, the underlying architecture is
+      // LITTLE_ENDIAN (Linux/x86). So use that as byte order.
+      String desc = "Text index docId mapping buffer: " + column;
+      if (docIdMappingFile.exists()) {
+        // we will be here for segment reload and server restart
+        // for refresh, we will not be here since segment is deleted/replaced
+        // TODO: see if we can prefetch the pages
+        _buffer =
+            PinotDataBuffer.mapFile(docIdMappingFile, /* readOnly */ true, 0, length, ByteOrder.LITTLE_ENDIAN, desc);
+      } else {
+        _buffer =
+            PinotDataBuffer.mapFile(docIdMappingFile, /* readOnly */ false, 0, length, ByteOrder.LITTLE_ENDIAN, desc);
+        for (int i = 0; i < numDocs; i++) {
+          try {
+            Document document = indexSearcher.doc(i);
+            int pinotDocId = Integer.parseInt(document.get(LuceneTextIndexCreator.LUCENE_INDEX_DOC_ID_COLUMN_NAME));
+            _buffer.putInt(i * Integer.BYTES, pinotDocId);
+          } catch (Exception e) {
+            throw new RuntimeException("Failed to build doc id mapping during segment load, " + e);

Review comment:
done
[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400582841

## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java ##
@@ -123,8 +133,7 @@ public MutableRoaringBitmap getDocIds(Object value) {
       _indexSearcher.search(query, docIDCollector);
       return getPinotDocIds(docIDs);
     } catch (Exception e) {
-      LOGGER.error("Failed while searching the text index for column {}, search query {}, exception {}", _column,
-          searchQuery, e.getMessage());
+      LOGGER.error("Failed while searching the text index for column {}, search query {},", _column, searchQuery);
       throw new RuntimeException(e);

Review comment:
```suggestion
      throw new RuntimeException("Caught exception while searching the text index column: " + _column + " with query: " + searchQuery, e);
```
[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400580427

## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/converter/SegmentV1V2ToV3FormatConverter.java ##
@@ -241,6 +242,22 @@ public boolean accept(File dir, String name) {
         Files.copy(indexFile.toPath(), v3LuceneIndexFile.toPath());
       }
     }
+    // if segment reload is issued asking for up-conversion of

Review comment:
I was referring to the usage (check metadata and copy instead of filtering on file names). Not critical, you can address later
[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400581886

## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java ##
@@ -169,5 +173,50 @@ public void close()
       throws IOException {
     _indexReader.close();
     _indexDirectory.close();
+    _docIdTranslator.close();
+  }
+
+  private static class DocIdTranslator implements Closeable {
+    final PinotDataBuffer _buffer;
+
+    DocIdTranslator(File segmentIndexDir, String column, int numDocs, IndexSearcher indexSearcher)
+        throws Exception {
+      int length = Integer.BYTES * numDocs;
+      File docIdMappingFile = new File(SegmentDirectoryPaths.findSegmentDirectory(segmentIndexDir),
+          column + LUCENE_TEXT_INDEX_DOCID_MAPPING_FILE_EXTENSION);
+      // The mapping is local to a segment. It is created on the server during segment load.
+      // Unless we are running Pinot on Solaris/SPARC, the underlying architecture is
+      // LITTLE_ENDIAN (Linux/x86). So use that as byte order.
+      String desc = "Text index docId mapping buffer: " + column;
+      if (docIdMappingFile.exists()) {
+        // we will be here for segment reload and server restart
+        // for refresh, we will not be here since segment is deleted/replaced
+        // TODO: see if we can prefetch the pages
+        _buffer =
+            PinotDataBuffer.mapFile(docIdMappingFile, /* readOnly */ true, 0, length, ByteOrder.LITTLE_ENDIAN, desc);
+      } else {
+        _buffer =
+            PinotDataBuffer.mapFile(docIdMappingFile, /* readOnly */ false, 0, length, ByteOrder.LITTLE_ENDIAN, desc);
+        for (int i = 0; i < numDocs; i++) {
+          try {
+            Document document = indexSearcher.doc(i);
+            int pinotDocId = Integer.parseInt(document.get(LuceneTextIndexCreator.LUCENE_INDEX_DOC_ID_COLUMN_NAME));
+            _buffer.putInt(i * Integer.BYTES, pinotDocId);
+          } catch (Exception e) {
+            throw new RuntimeException("Failed to build doc id mapping during segment load, " + e);

Review comment:
```suggestion
            throw new RuntimeException("Caught exception while building doc id mapping for text index column: " + column, e);
```
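The difference between `"..." + e` and passing `e` as a second constructor argument, which the suggestion above turns on, can be illustrated with a small standalone sketch. The class and method names here are hypothetical and only the `RuntimeException(String, Throwable)` constructor is standard Java; this is not code from the PR.

```java
import java.io.IOException;

final class ExceptionCauseSketch {
  // Concatenation keeps only e.toString() inside the message; no cause is recorded,
  // so the original stack trace is lost.
  static RuntimeException wrapByConcatenation(String column, Exception e) {
    return new RuntimeException("Failed to build doc id mapping for column: " + column + ", " + e);
  }

  // Passing the exception as the cause preserves it, including its stack trace,
  // for whoever eventually catches and logs the wrapper.
  static RuntimeException wrapWithCause(String column, Exception e) {
    return new RuntimeException("Caught exception while building doc id mapping for text index column: " + column, e);
  }

  public static void main(String[] args) {
    Exception original = new IOException("disk error");
    System.out.println(wrapByConcatenation("col", original).getCause() == null);
    System.out.println(wrapWithCause("col", original).getCause() == original);
  }
}
```

This is also why logging before rethrowing is redundant when the cause is chained: the outer catcher can log the wrapper once and still see the full trace.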
[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400581254

## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java ##
@@ -123,8 +133,7 @@ public MutableRoaringBitmap getDocIds(Object value) {
       _indexSearcher.search(query, docIDCollector);
       return getPinotDocIds(docIDs);
     } catch (Exception e) {
-      LOGGER.error("Failed while searching the text index for column {}, search query {}, exception {}", _column,
-          searchQuery, e.getMessage());
+      LOGGER.error("Failed while searching the text index for column {}, search query {},", _column, searchQuery);

Review comment:
If you want to keep the context of the exception, put it into the RuntimeException
[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400583323

## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java ##
@@ -169,5 +173,50 @@ public void close()
       throws IOException {
     _indexReader.close();
     _indexDirectory.close();
+    _docIdTranslator.close();
+  }
+
+  private static class DocIdTranslator implements Closeable {
+    final PinotDataBuffer _buffer;
+
+    DocIdTranslator(File segmentIndexDir, String column, int numDocs, IndexSearcher indexSearcher)
+        throws Exception {
+      int length = Integer.BYTES * numDocs;
+      File docIdMappingFile = new File(SegmentDirectoryPaths.findSegmentDirectory(segmentIndexDir),
+          column + LUCENE_TEXT_INDEX_DOCID_MAPPING_FILE_EXTENSION);
+      // The mapping is local to a segment. It is created on the server during segment load.
+      // Unless we are running Pinot on Solaris/SPARC, the underlying architecture is
+      // LITTLE_ENDIAN (Linux/x86). So use that as byte order.
+      String desc = "Text index docId mapping buffer: " + column;
+      if (docIdMappingFile.exists()) {
+        // we will be here for segment reload and server restart
+        // for refresh, we will not be here since segment is deleted/replaced
+        // TODO: see if we can prefetch the pages
+        _buffer =
+            PinotDataBuffer.mapFile(docIdMappingFile, /* readOnly */ true, 0, length, ByteOrder.LITTLE_ENDIAN, desc);
+      } else {
+        _buffer =
+            PinotDataBuffer.mapFile(docIdMappingFile, /* readOnly */ false, 0, length, ByteOrder.LITTLE_ENDIAN, desc);
+        for (int i = 0; i < numDocs; i++) {
+          try {
+            Document document = indexSearcher.doc(i);
+            int pinotDocId = Integer.parseInt(document.get(LuceneTextIndexCreator.LUCENE_INDEX_DOC_ID_COLUMN_NAME));
+            _buffer.putInt(i * Integer.BYTES, pinotDocId);
+          } catch (Exception e) {
+            throw new RuntimeException("Failed to build doc id mapping during segment load, " + e);
+          }
+        }
+      }
+    }
+
+    int getPinotDocId(int offset) {

Review comment:
Pass in `luceneDocId` and wrap the `_buffer.getInt(luceneDocId * Integer.BYTES)` logic inside
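One way to read the request above is that callers should deal only in Lucene doc ids, with the byte-offset arithmetic hidden inside the translator. The sketch below is an assumption-laden illustration rather than the PR's code: it uses a heap `java.nio.ByteBuffer` as a stand-in for the memory-mapped `PinotDataBuffer`, and the class name `DocIdTranslatorSketch` and its array-based constructor are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class DocIdTranslatorSketch {
  // Stand-in for the mapped buffer: one little-endian int per Lucene doc id.
  private final ByteBuffer _buffer;

  // pinotDocIds[luceneDocId] holds the Pinot doc id for that Lucene doc id.
  DocIdTranslatorSketch(int[] pinotDocIds) {
    _buffer = ByteBuffer.allocate(Integer.BYTES * pinotDocIds.length).order(ByteOrder.LITTLE_ENDIAN);
    for (int luceneDocId = 0; luceneDocId < pinotDocIds.length; luceneDocId++) {
      _buffer.putInt(luceneDocId * Integer.BYTES, pinotDocIds[luceneDocId]);
    }
  }

  // Callers pass a Lucene doc id; the offset computation stays private to this class.
  int getPinotDocId(int luceneDocId) {
    return _buffer.getInt(luceneDocId * Integer.BYTES);
  }

  public static void main(String[] args) {
    DocIdTranslatorSketch translator = new DocIdTranslatorSketch(new int[]{7, 3, 9});
    System.out.println(translator.getPinotDocId(1)); // looks up the entry for Lucene doc id 1
  }
}
```

With this shape, a caller such as the bitmap-translation loop can write `translator.getPinotDocId(luceneDocId)` and never multiply by `Integer.BYTES` itself.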
[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400588824

## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java ##
@@ -169,5 +173,50 @@ public void close()
       throws IOException {
     _indexReader.close();
     _indexDirectory.close();
+    _docIdTranslator.close();
+  }
+
+  private static class DocIdTranslator implements Closeable {
+    final PinotDataBuffer _buffer;
+
+    DocIdTranslator(File segmentIndexDir, String column, int numDocs, IndexSearcher indexSearcher)
+        throws Exception {
+      int length = Integer.BYTES * numDocs;
+      File docIdMappingFile = new File(SegmentDirectoryPaths.findSegmentDirectory(segmentIndexDir),
+          column + LUCENE_TEXT_INDEX_DOCID_MAPPING_FILE_EXTENSION);
+      // The mapping is local to a segment. It is created on the server during segment load.
+      // Unless we are running Pinot on Solaris/SPARC, the underlying architecture is
+      // LITTLE_ENDIAN (Linux/x86). So use that as byte order.
+      String desc = "Text index docId mapping buffer: " + column;
+      if (docIdMappingFile.exists()) {
+        // we will be here for segment reload and server restart
+        // for refresh, we will not be here since segment is deleted/replaced
+        // TODO: see if we can prefetch the pages
+        _buffer =
+            PinotDataBuffer.mapFile(docIdMappingFile, /* readOnly */ true, 0, length, ByteOrder.LITTLE_ENDIAN, desc);
+      } else {
+        _buffer =
+            PinotDataBuffer.mapFile(docIdMappingFile, /* readOnly */ false, 0, length, ByteOrder.LITTLE_ENDIAN, desc);
+        for (int i = 0; i < numDocs; i++) {
+          try {
+            Document document = indexSearcher.doc(i);
+            int pinotDocId = Integer.parseInt(document.get(LuceneTextIndexCreator.LUCENE_INDEX_DOC_ID_COLUMN_NAME));
+            _buffer.putInt(i * Integer.BYTES, pinotDocId);
+          } catch (Exception e) {
+            throw new RuntimeException("Failed to build doc id mapping during segment load, " + e);
+          }
+        }
+      }
+    }
+
+    int getPinotDocId(int offset) {

Review comment:
Good catch. Done
[GitHub] [incubator-pinot] codecov-io commented on issue #5177: Lucene DocId to PinotDocId cache
codecov-io commented on issue #5177: Lucene DocId to PinotDocId cache
URL: https://github.com/apache/incubator-pinot/pull/5177#issuecomment-606391678

# [Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/5177?src=pr=h1) Report
> Merging [#5177](https://codecov.io/gh/apache/incubator-pinot/pull/5177?src=pr=desc) into [master](https://codecov.io/gh/apache/incubator-pinot/commit/b8ed426b6ac47d38279be4edad6956d7e8f00c51=desc) will **decrease** coverage by `9.55%`.
> The diff coverage is `85.10%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/incubator-pinot/pull/5177/graphs/tree.svg?width=650=150=pr=4ibza2ugkz)](https://codecov.io/gh/apache/incubator-pinot/pull/5177?src=pr=tree)

```diff
@@             Coverage Diff             @@
##             master    #5177     +/-   ##
- Coverage     66.04%   56.49%   -9.56%
  Complexity       12       12
  Files          1052     1055       +3
  Lines         54170    54121      -49
  Branches       8078     8050      -28
- Hits          35779    30576    -5203
- Misses        15736    21117    +5381
+ Partials       2655     2428     -227
```

| [Impacted Files](https://codecov.io/gh/apache/incubator-pinot/pull/5177?src=pr=tree) | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| [...t/index/loader/invertedindex/TextIndexHandler.java](https://codecov.io/gh/apache/incubator-pinot/pull/5177/diff?src=pr=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9zZWdtZW50L2luZGV4L2xvYWRlci9pbnZlcnRlZGluZGV4L1RleHRJbmRleEhhbmRsZXIuamF2YQ==) | `84.31% <57.14%> (ø)` | `0.00 <0.00> (ø)` | |
| [...ment/index/readers/text/LuceneTextIndexReader.java](https://codecov.io/gh/apache/incubator-pinot/pull/5177/diff?src=pr=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9zZWdtZW50L2luZGV4L3JlYWRlcnMvdGV4dC9MdWNlbmVUZXh0SW5kZXhSZWFkZXIuamF2YQ==) | `80.95% <87.09%> (+9.21%)` | `0.00 <0.00> (ø)` | |
| [...ent/index/column/PhysicalColumnIndexContainer.java](https://codecov.io/gh/apache/incubator-pinot/pull/5177/diff?src=pr=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9zZWdtZW50L2luZGV4L2NvbHVtbi9QaHlzaWNhbENvbHVtbkluZGV4Q29udGFpbmVyLmphdmE=) | `87.83% <100.00%> (-5.41%)` | `0.00 <0.00> (ø)` | |
| [...ndex/converter/SegmentV1V2ToV3FormatConverter.java](https://codecov.io/gh/apache/incubator-pinot/pull/5177/diff?src=pr=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9zZWdtZW50L2luZGV4L2NvbnZlcnRlci9TZWdtZW50VjFWMlRvVjNGb3JtYXRDb252ZXJ0ZXIuamF2YQ==) | `76.97% <100.00%> (+1.03%)` | `0.00 <0.00> (ø)` | |
| [...inot/core/segment/store/SegmentDirectoryPaths.java](https://codecov.io/gh/apache/incubator-pinot/pull/5177/diff?src=pr=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9zZWdtZW50L3N0b3JlL1NlZ21lbnREaXJlY3RvcnlQYXRocy5qYXZh) | `81.48% <100.00%> (+1.48%)` | `0.00 <0.00> (ø)` | |
| [...a/org/apache/pinot/minion/metrics/MinionMeter.java](https://codecov.io/gh/apache/incubator-pinot/pull/5177/diff?src=pr=tree#diff-cGlub3QtbWluaW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9taW5pb24vbWV0cmljcy9NaW5pb25NZXRlci5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
| [.../apache/pinot/common/metrics/BrokerQueryPhase.java](https://codecov.io/gh/apache/incubator-pinot/pull/5177/diff?src=pr=tree#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vbWV0cmljcy9Ccm9rZXJRdWVyeVBoYXNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
| [.../apache/pinot/minion/metrics/MinionQueryPhase.java](https://codecov.io/gh/apache/incubator-pinot/pull/5177/diff?src=pr=tree#diff-cGlub3QtbWluaW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9taW5pb24vbWV0cmljcy9NaW5pb25RdWVyeVBoYXNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
| [...he/pinot/core/query/reduce/ComparisonFunction.java](https://codecov.io/gh/apache/incubator-pinot/pull/5177/diff?src=pr=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9xdWVyeS9yZWR1Y2UvQ29tcGFyaXNvbkZ1bmN0aW9uLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
| [...pinot/minion/exception/TaskCancelledException.java](https://codecov.io/gh/apache/incubator-pinot/pull/5177/diff?src=pr=tree#diff-cGlub3QtbWluaW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9taW5pb24vZXhjZXB0aW9uL1Rhc2tDYW5jZWxsZWRFeGNlcHRpb24uamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
| ... and [400 more](https://codecov.io/gh/apache/incubator-pinot/pull/5177/diff?src=pr=tree-more) | |

--

[Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/5177?src=pr=continue).
> **Legend** - [Click here to learn
[GitHub] [incubator-pinot] kishoreg commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
kishoreg commented on a change in pull request #5177: Lucene DocId to PinotDocId cache URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400572893 ## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java ## @@ -169,5 +178,57 @@ public void close() throws IOException { _indexReader.close(); _indexDirectory.close(); +_docIdReaderWriter.close(); + } + + private class DocIdReaderWriter implements Closeable { +private PinotDataBuffer _buffer; +private final boolean _mappingExists; + +DocIdReaderWriter(File segmentIndexDir, String column, int numDocs) +throws Exception { + int length = Integer.BYTES * numDocs; + File docIdMappingFile = new File(SegmentDirectoryPaths.findSegmentDirectory(segmentIndexDir), + column + LUCENE_TEXT_INDEX_DOCID_MAPPING_FILE_EXTENSION); + // The mapping is local to a segment. It is created on the server during segment load. + // Unless we are running Pinot on Solaris/SPARC, the underlying architecture is + // LITTLE_ENDIAN (Linux/x86). So use that as byte order. 
+ if (docIdMappingFile.exists()) { +// we will be here for segment reload and server restart +// for refresh, we will not be here since segment is deleted/replaced +// TODO: see if we can prefetch the pages +_mappingExists = true; +_buffer = PinotDataBuffer.mapFile(docIdMappingFile, /* readOnly */ true, 0, length, ByteOrder.LITTLE_ENDIAN, +_column + getClass().getSimpleName()); + } else { +_mappingExists = false; +_buffer = PinotDataBuffer.mapFile(docIdMappingFile, /* readOnly */ false, 0, length, ByteOrder.LITTLE_ENDIAN, +_column + getClass().getSimpleName()); + } +} + +public void buildDocIdMapping(int numDocs) { + if (!_mappingExists) { +for (int i = 0; i < numDocs; i++) { + try { +Document document = _indexSearcher.doc(i); +int pinotDocId = Integer.valueOf(document.get(LuceneTextIndexCreator.LUCENE_INDEX_DOC_ID_COLUMN_NAME)); +_buffer.putInt(i * Integer.BYTES, pinotDocId); + } catch (Exception e) { +throw new RuntimeException("Failed to build doc id mapping during segment load: " + e); Review comment: , e This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org
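The terse review note ", e" above most likely asks for the caught exception to be passed to `RuntimeException` as the cause rather than concatenated into the message string. A minimal sketch of the difference follows; the class `CauseChainingSketch` and its method names are hypothetical and are not part of the Pinot code base.

```java
final class CauseChainingSketch {
  // Concatenating the exception keeps only its toString() text;
  // the original stack trace is not attached to the new exception.
  static RuntimeException wrapByConcatenation(Exception e) {
    return new RuntimeException("Failed to build doc id mapping during segment load: " + e);
  }

  // Passing the exception as the second constructor argument preserves it as the cause,
  // so the full "Caused by:" chain shows up in logs.
  static RuntimeException wrapWithCause(Exception e) {
    return new RuntimeException("Failed to build doc id mapping during segment load", e);
  }

  public static void main(String[] args) {
    try {
      // Stand-in for parsing the stored Pinot doc id out of a Lucene document.
      Integer.parseInt("not-a-number");
    } catch (Exception e) {
      System.out.println(wrapByConcatenation(e).getCause());   // prints "null"
      System.out.println(wrapWithCause(e).getCause() == e);    // prints "true"
    }
  }
}
```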
[GitHub] [incubator-pinot] kishoreg commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
kishoreg commented on a change in pull request #5177: Lucene DocId to PinotDocId cache URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400572762 ## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java ## @@ -169,5 +178,57 @@ public void close() throws IOException { _indexReader.close(); _indexDirectory.close(); +_docIdReaderWriter.close(); + } + + private class DocIdReaderWriter implements Closeable { +private PinotDataBuffer _buffer; +private final boolean _mappingExists; + +DocIdReaderWriter(File segmentIndexDir, String column, int numDocs) +throws Exception { + int length = Integer.BYTES * numDocs; + File docIdMappingFile = new File(SegmentDirectoryPaths.findSegmentDirectory(segmentIndexDir), + column + LUCENE_TEXT_INDEX_DOCID_MAPPING_FILE_EXTENSION); + // The mapping is local to a segment. It is created on the server during segment load. + // Unless we are running Pinot on Solaris/SPARC, the underlying architecture is + // LITTLE_ENDIAN (Linux/x86). So use that as byte order. + if (docIdMappingFile.exists()) { +// we will be here for segment reload and server restart +// for refresh, we will not be here since segment is deleted/replaced +// TODO: see if we can prefetch the pages +_mappingExists = true; +_buffer = PinotDataBuffer.mapFile(docIdMappingFile, /* readOnly */ true, 0, length, ByteOrder.LITTLE_ENDIAN, Review comment: Can we reuse FixedByteReaderWriter? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org
[GitHub] [incubator-pinot] Jackie-Jiang opened a new pull request #5197: Shuffle the segments when rebalancing the table to avoid creating hotspot servers
Jackie-Jiang opened a new pull request #5197: Shuffle the segments when rebalancing the table to avoid creating hotspot servers
URL: https://github.com/apache/incubator-pinot/pull/5197

When new servers are added to an existing replica-group based table and a rebalance is triggered, the current behavior assigns segments in alphabetical order, which might move only the new segments to the newly added servers. Because queries tend to hit the most recent segments, the newly added servers can then become hotspots. To address this, we shuffle the segments so that old and new segments are assigned evenly. We use the hash of the table name as the random seed for the shuffle so that the result is deterministic.

It is a little tricky to write a test case for this. Since the change is straightforward and the existing tests already have good coverage, no new test is added; the expected behavior was verified manually.
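The seeded shuffle described in this pull request can be illustrated roughly as follows. This is only a sketch under assumptions: the class `DeterministicShuffleSketch`, the method `shuffleSegments`, and the initial sort are hypothetical and are not taken from the actual change; only the idea of seeding the shuffle with the table name's hash comes from the description.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

final class DeterministicShuffleSketch {
  // Returns the segments in a pseudo-random order that depends only on the table name
  // and the set of segment names, so repeated rebalances see the same order.
  static List<String> shuffleSegments(String tableName, List<String> segments) {
    List<String> shuffled = new ArrayList<>(segments);
    // Assumption: sort first so the input order does not influence the result.
    Collections.sort(shuffled);
    // Seed the generator with the table name hash, as the PR description states.
    Collections.shuffle(shuffled, new Random(tableName.hashCode()));
    return shuffled;
  }

  public static void main(String[] args) {
    List<String> segments = Arrays.asList("seg_0", "seg_1", "seg_2", "seg_3", "seg_4");
    List<String> first = shuffleSegments("myTable_OFFLINE", segments);
    List<String> second = shuffleSegments("myTable_OFFLINE", segments);
    System.out.println(first);
    System.out.println(first.equals(second)); // same seed and input, so the same order
  }
}
```

Because the seed is derived from the table name rather than from the clock, two controllers (or two runs) computing the assignment for the same table should agree on the shuffled order.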
[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400552298 ## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java ## @@ -70,6 +76,10 @@ public LuceneTextIndexReader(String column, File segmentIndexDir) { // Disable Lucene query result cache. While it helps a lot with performance for // repeated queries, on the downside it cause heap issues. _indexSearcher.setQueryCache(null); + // TODO: consider using a threshold of num docs per segment to decide between building + // mapping file upfront on segment load v/s on-the-fly during query processing + _docIdReaderWriter = new DocIdReaderWriter(segmentIndexDir, _column, numDocs); + _docIdReaderWriter.buildDocIdMapping(numDocs); Review comment: We check for existence and return. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org
[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400553372 ## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java ## @@ -169,5 +178,49 @@ public void close() throws IOException { _indexReader.close(); _indexDirectory.close(); +_docIdReaderWriter.close(); + } + + private class DocIdReaderWriter implements Closeable { +private PinotDataBuffer _buffer; + +DocIdReaderWriter(File segmentIndexDir, String column, int numDocs) throws Exception { + int length = Integer.BYTES * numDocs; + File docIdMappingFile = new File(SegmentDirectoryPaths.findSegmentDirectory(segmentIndexDir), + column + LUCENE_TEXT_INDEX_DOCID_MAPPING_FILE_EXTENSION); + // For newly added segments, this file will not exist. + // For segment refresh, segment reload and server restart, file will exist, + // but we don't know if we are here for refresh v/s reload v/s restart. + // In case of refresh, we have to build the mapping again, but in case of Review comment: Yes, that's what I stated in the comment. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org
[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400553327 ## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java ## @@ -51,6 +54,9 @@ private final IndexSearcher _indexSearcher; private final QueryParser _queryParser; private final String _column; + private DocIdReaderWriter _docIdReaderWriter; Review comment: done This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org
[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400552708 ## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java ## @@ -70,6 +76,10 @@ public LuceneTextIndexReader(String column, File segmentIndexDir) { // Disable Lucene query result cache. While it helps a lot with performance for // repeated queries, on the downside it cause heap issues. _indexSearcher.setQueryCache(null); + // TODO: consider using a threshold of num docs per segment to decide between building + // mapping file upfront on segment load v/s on-the-fly during query processing + _docIdReaderWriter = new DocIdReaderWriter(segmentIndexDir, _column, numDocs); + _docIdReaderWriter.buildDocIdMapping(numDocs); Review comment: Also, I explored doing this in TextIndexHandler. But that is not good since it requires to open the lucene index, create searcher twice (both in handler and here anyway for query processing). I think it is better to avoid that and just open the lucene index reader and search just once per index. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org
[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400553711 ## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java ## @@ -169,5 +178,49 @@ public void close() throws IOException { _indexReader.close(); _indexDirectory.close(); +_docIdReaderWriter.close(); + } + + private class DocIdReaderWriter implements Closeable { +private PinotDataBuffer _buffer; + +DocIdReaderWriter(File segmentIndexDir, String column, int numDocs) throws Exception { + int length = Integer.BYTES * numDocs; + File docIdMappingFile = new File(SegmentDirectoryPaths.findSegmentDirectory(segmentIndexDir), + column + LUCENE_TEXT_INDEX_DOCID_MAPPING_FILE_EXTENSION); + // For newly added segments, this file will not exist. + // For segment refresh, segment reload and server restart, file will exist, + // but we don't know if we are here for refresh v/s reload v/s restart. + // In case of refresh, we have to build the mapping again, but in case of + // reload and restart, we don't. Also, reload has a sub-case where this text index + // was indeed created during reload (user enabled on existing or newly added column). + // Since there is no way to distinguish why we are here, we build the mapping again + // regardless. + // TODO: see if we can prefetch the pages + _buffer = + PinotDataBuffer.mapFile(docIdMappingFile, false, 0, length, ByteOrder.BIG_ENDIAN, getClass().getSimpleName()); +} + +public void buildDocIdMapping(int numDocs) { + for (int i = 0; i < numDocs; i++) { +try { + Document document = _indexSearcher.doc(i); + int pinotDocId = Integer.valueOf(document.get(LuceneTextIndexCreator.LUCENE_INDEX_DOC_ID_COLUMN_NAME)); + _buffer.putInt(i * Integer.BYTES, pinotDocId); +} catch (Exception e) { + LOGGER.error("Failed to build doc id mapping during segment load for column:{},docID:{},error:{}. 
Will continue and build mapping on the fly", Review comment: done This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org
[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400553657 ## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java ## @@ -169,5 +178,49 @@ public void close() throws IOException { _indexReader.close(); _indexDirectory.close(); +_docIdReaderWriter.close(); + } + + private class DocIdReaderWriter implements Closeable { +private PinotDataBuffer _buffer; + +DocIdReaderWriter(File segmentIndexDir, String column, int numDocs) throws Exception { + int length = Integer.BYTES * numDocs; + File docIdMappingFile = new File(SegmentDirectoryPaths.findSegmentDirectory(segmentIndexDir), + column + LUCENE_TEXT_INDEX_DOCID_MAPPING_FILE_EXTENSION); + // For newly added segments, this file will not exist. + // For segment refresh, segment reload and server restart, file will exist, + // but we don't know if we are here for refresh v/s reload v/s restart. + // In case of refresh, we have to build the mapping again, but in case of + // reload and restart, we don't. Also, reload has a sub-case where this text index + // was indeed created during reload (user enabled on existing or newly added column). + // Since there is no way to distinguish why we are here, we build the mapping again + // regardless. + // TODO: see if we can prefetch the pages + _buffer = + PinotDataBuffer.mapFile(docIdMappingFile, false, 0, length, ByteOrder.BIG_ENDIAN, getClass().getSimpleName()); Review comment: Yes, as discussed using Little Endian and indicated in the comments too. We are unlikely to run on Solaris/Sparc so using LE is fine which is the case on Linux/x86 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org
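The byte-order point in this thread can be shown with plain `java.nio`. The sketch below is not the `PinotDataBuffer` API; it uses a heap `ByteBuffer` as a stand-in, and the class `DocIdMappingBufferSketch` and its methods are hypothetical. It stores one 4-byte Pinot doc id per Lucene doc id at offset `luceneDocId * Integer.BYTES`, with an explicitly fixed little-endian order.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class DocIdMappingBufferSketch {
  // Builds a fixed-width mapping: entry i holds the Pinot doc id for Lucene doc id i.
  static ByteBuffer buildMapping(int[] pinotDocIds) {
    ByteBuffer buffer =
        ByteBuffer.allocate(Integer.BYTES * pinotDocIds.length).order(ByteOrder.LITTLE_ENDIAN);
    for (int luceneDocId = 0; luceneDocId < pinotDocIds.length; luceneDocId++) {
      // Absolute put: does not move the buffer position.
      buffer.putInt(luceneDocId * Integer.BYTES, pinotDocIds[luceneDocId]);
    }
    return buffer;
  }

  // Looks up the Pinot doc id for a Lucene doc id with a single absolute read.
  static int getPinotDocId(ByteBuffer buffer, int luceneDocId) {
    return buffer.getInt(luceneDocId * Integer.BYTES);
  }

  public static void main(String[] args) {
    ByteBuffer mapping = buildMapping(new int[] {1, 7, 42});
    System.out.println(getPinotDocId(mapping, 2)); // prints 42

    // Reading the same bytes with the other byte order yields a different value,
    // which is why writer and reader must agree on the order.
    ByteBuffer bigEndianView = mapping.duplicate().order(ByteOrder.BIG_ENDIAN);
    System.out.println(bigEndianView.getInt(0)); // prints 16777216 (0x01000000)
  }
}
```

Fixing the order explicitly, instead of relying on the platform's native order, keeps the on-disk layout the same wherever the file is read back.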
[GitHub] [incubator-pinot] siddharthteotia commented on issue #5177: Lucene DocId to PinotDocId cache
siddharthteotia commented on issue #5177: Lucene DocId to PinotDocId cache URL: https://github.com/apache/incubator-pinot/pull/5177#issuecomment-606304224 @Jackie-Jiang , I have addressed the review comments. Please take another look This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org
[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400553681 ## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java ## @@ -169,5 +178,49 @@ public void close() throws IOException { _indexReader.close(); _indexDirectory.close(); +_docIdReaderWriter.close(); + } + + private class DocIdReaderWriter implements Closeable { +private PinotDataBuffer _buffer; + +DocIdReaderWriter(File segmentIndexDir, String column, int numDocs) throws Exception { + int length = Integer.BYTES * numDocs; + File docIdMappingFile = new File(SegmentDirectoryPaths.findSegmentDirectory(segmentIndexDir), + column + LUCENE_TEXT_INDEX_DOCID_MAPPING_FILE_EXTENSION); + // For newly added segments, this file will not exist. + // For segment refresh, segment reload and server restart, file will exist, + // but we don't know if we are here for refresh v/s reload v/s restart. + // In case of refresh, we have to build the mapping again, but in case of + // reload and restart, we don't. Also, reload has a sub-case where this text index + // was indeed created during reload (user enabled on existing or newly added column). + // Since there is no way to distinguish why we are here, we build the mapping again + // regardless. + // TODO: see if we can prefetch the pages + _buffer = + PinotDataBuffer.mapFile(docIdMappingFile, false, 0, length, ByteOrder.BIG_ENDIAN, getClass().getSimpleName()); Review comment: done This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org
[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400552708 ## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java ## @@ -70,6 +76,10 @@ public LuceneTextIndexReader(String column, File segmentIndexDir) { // Disable Lucene query result cache. While it helps a lot with performance for // repeated queries, on the downside it cause heap issues. _indexSearcher.setQueryCache(null); + // TODO: consider using a threshold of num docs per segment to decide between building + // mapping file upfront on segment load v/s on-the-fly during query processing + _docIdReaderWriter = new DocIdReaderWriter(segmentIndexDir, _column, numDocs); + _docIdReaderWriter.buildDocIdMapping(numDocs); Review comment: Also, I explored doing this in TextIndexHandler. But that is not good since it requires to open the lucene index, create searcher twice (both in handler and here anyway for query processing). I think it is better to avoid that and just open the lucene index reader and searcher just once per index. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org
[GitHub] [incubator-pinot] xiaohui-sun commented on a change in pull request #5196: [TE] fix the merger issue that it can't merge historical anomaly generated by multiple rules
xiaohui-sun commented on a change in pull request #5196: [TE] fix the merger issue that it can't merge historical anomaly generated by multiple rules URL: https://github.com/apache/incubator-pinot/pull/5196#discussion_r400560348 ## File path: thirdeye/thirdeye-pinot/src/main/java/org/apache/pinot/thirdeye/detection/wrapper/ChildKeepingMergeWrapper.java ## @@ -48,9 +50,23 @@ public ChildKeepingMergeWrapper(DataProvider provider, DetectionConfigDTO config } @Override - // does not fetch any anomalies from database Review comment: @jihaozh Do you know a reason why we had this logic? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org
[GitHub] [incubator-pinot] xiaohui-sun commented on a change in pull request #5196: [TE] fix the merger issue that it can't merge historical anomaly generated by multiple rules
xiaohui-sun commented on a change in pull request #5196: [TE] fix the merger issue that it can't merge historical anomaly generated by multiple rules
URL: https://github.com/apache/incubator-pinot/pull/5196#discussion_r400561252

## File path: thirdeye/thirdeye-pinot/src/main/java/org/apache/pinot/thirdeye/detection/wrapper/ChildKeepingMergeWrapper.java ##
@@ -48,9 +50,23 @@ public ChildKeepingMergeWrapper(DataProvider provider, DetectionConfigDTO config
   }

   @Override
-  // does not fetch any anomalies from database
+  // retrieve the anomalies that are detected by multiple detectors
   protected List retrieveAnomaliesFromDatabase(List generated) {
-    return Collections.emptyList();
+    AnomalySlice effectiveSlice = this.slice.withDetectionId(this.config.getId())
+        .withStart(this.getStartTime(generated) - this.maxGap - 1)
+        .withEnd(this.getEndTime(generated) + this.maxGap + 1);
+
+    Collection anomalies =
+        this.provider.fetchAnomalies(Collections.singleton(effectiveSlice)).get(effectiveSlice);
+
+    return anomalies.stream()
+        .filter(anomaly -> !anomaly.isChild() && isDetectedByMultipleComponents(anomaly))

Review comment: Why do we need a filter on multiple components here?
[GitHub] [incubator-pinot] xiaohui-sun commented on a change in pull request #5196: [TE] fix the merger issue that it can't merge historical anomaly generated by multiple rules
xiaohui-sun commented on a change in pull request #5196: [TE] fix the merger issue that it can't merge historical anomaly generated by multiple rules
URL: https://github.com/apache/incubator-pinot/pull/5196#discussion_r400561152

## File path: thirdeye/thirdeye-pinot/src/main/java/org/apache/pinot/thirdeye/detection/wrapper/ChildKeepingMergeWrapper.java ##
@@ -48,9 +50,23 @@ public ChildKeepingMergeWrapper(DataProvider provider, DetectionConfigDTO config
   }

   @Override
-  // does not fetch any anomalies from database
+  // retrieve the anomalies that are detected by multiple detectors
   protected List retrieveAnomaliesFromDatabase(List generated) {
-    return Collections.emptyList();
+    AnomalySlice effectiveSlice = this.slice.withDetectionId(this.config.getId())
+        .withStart(this.getStartTime(generated) - this.maxGap - 1)
+        .withEnd(this.getEndTime(generated) + this.maxGap + 1);
+
+    Collection anomalies =
+        this.provider.fetchAnomalies(Collections.singleton(effectiveSlice)).get(effectiveSlice);
+
+    return anomalies.stream()
+        .filter(anomaly -> !anomaly.isChild() && isDetectedByMultipleComponents(anomaly))
+        .collect(Collectors.toList());
+  }
+
+  private boolean isDetectedByMultipleComponents(MergedAnomalyResultDTO anomaly) {
+    String componentName = anomaly.getProperties().getOrDefault(PROP_DETECTOR_COMPONENT_NAME, "");
+    return componentName.contains(",");

Review comment: This is too hacky. We can't depend on a string match to tell whether there are multiple components. If we later change the delimiter, this becomes a bug that is hard to debug. Let's add a helper function that parses and assembles the component name in one centralized place.
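One way to read the suggestion above is a small utility that owns the delimiter and offers both directions (assemble and parse), so that callers never inspect the raw string. The following is a sketch only; the class `ComponentNamesSketch` and its methods are hypothetical and do not exist in ThirdEye as far as this thread shows.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class ComponentNamesSketch {
  // The delimiter is defined in exactly one place; changing it here changes join and parse together.
  private static final String DELIMITER = ",";

  // Assembles the stored property value from individual component names.
  static String join(List<String> componentNames) {
    return String.join(DELIMITER, componentNames);
  }

  // Parses the stored property value back into component names, ignoring blanks.
  static List<String> parse(String joined) {
    if (joined == null || joined.isEmpty()) {
      return Collections.emptyList();
    }
    return Arrays.stream(joined.split(Pattern.quote(DELIMITER)))
        .map(String::trim)
        .filter(name -> !name.isEmpty())
        .collect(Collectors.toList());
  }

  // Replaces the ad-hoc componentName.contains(",") check.
  static boolean isDetectedByMultipleComponents(String joined) {
    return parse(joined).size() > 1;
  }

  public static void main(String[] args) {
    String stored = join(Arrays.asList("ruleA", "ruleB"));
    System.out.println(isDetectedByMultipleComponents(stored));  // prints "true"
    System.out.println(isDetectedByMultipleComponents("ruleA")); // prints "false"
  }
}
```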
[GitHub] [incubator-pinot] ChethanUK commented on a change in pull request #5185: Pinot website [WIP]
ChethanUK commented on a change in pull request #5185: Pinot website [WIP] URL: https://github.com/apache/incubator-pinot/pull/5185#discussion_r400510725 ## File path: website/docusaurus.config.js ## @@ -0,0 +1,168 @@ +module.exports = { + title: 'Apache Pinot™ (Incubating)', + tagline: 'Realtime distributed OLAP datastore', + url: 'https://pinot.apache.com', + baseUrl: '/', + favicon: 'img/favicon.ico', + organizationName: 'apache', + projectName: 'pinot', + themeConfig: { +navbar: { + hideOnScroll: true, + title: 'Pinot™ (Incubating)', + logo: { +alt: 'Pinot', +src: 'img/logo.svg', + }, + links: [ +{to: 'https://apache-pinot.gitbook.io/apache-pinot-cookbook/', label: 'Docs', position: 'right'}, +{to: 'https://issues.apache.org/jira/projects/PINOT/issues', label: 'Jira', position: 'right'}, +{to: 'https://cwiki.apache.org/confluence/display/PINOT', label: 'Wiki', position: 'right'}, +{ + href: 'https://github.com/apache/incubator-pinot', + label: 'GitHub', + position: 'right', +}, + ], +}, +prism: { + theme: require('prism-react-renderer/themes/github'), + darkTheme: require('prism-react-renderer/themes/dracula'), +}, +footer: { + style: 'light', + links: [ +{ + title: 'About', + items: [ +{ + label: 'What is Pinot?', + to: 'https://docs.pinot.apache.org/', +}, +{ + label: 'Components', + to: 'https://docs.pinot.apache.org/pinot-components', +}, +{ + label: 'Architecture', + to: 'https://docs.pinot.apache.org/concepts/architecture', +}, +{ + label: 'PluginsArchitecture', + to: 'https://docs.pinot.apache.org/plugins/plugin-architecture', +}, + ], +}, +{ + title: 'Components', + items: [ +{ + label: 'Presto', + to: 'https://docs.pinot.apache.org/integrations/presto', +}, +{ + label: 'PQL', + to: 'docs/components/sources', +}, +{ + label: 'ThirdEye', + to: 'https://docs.pinot.apache.org/integrations/thirdeye', +}, +{ + label: 'PowerBI', + to: 'docs/components/sinks', +}, + ], +}, +{ + title: 'Docs', + items: [ +{ + label: 'GettingStarted', + to: 
'https://docs.pinot.apache.org/getting-started', +}, +{ + label: 'PinotComponents', + to: 'https://docs.pinot.apache.org/pinot-components', +}, +{ + label: 'UserGuide', + to: 'https://docs.pinot.apache.org/pinot-user-guide', +}, +{ + label: 'Administration', + to: 'https://docs.pinot.apache.org/operating-pinot', +}, + ], +}, +{ + title: 'Community', + items: [ +{ + label: 'Slack', + to: 'https://communityinviter.com/apps/apache-pinot/apache-pinot', +}, +{ + label: 'Github', + to: 'https://github.com/apache/incubator-pinot', +}, +{ + label: 'Twitter', + to: 'https://twitter.com/ApachePinot', +}, +{ + label: 'Mailing List', + to: 'mailto:dev-subscr...@pinot.apache.org?Subject=SubscribeToPinot', +}, + ], +}, + ], + logo: { +alt: 'Apache Pinot™ - Incubating', +src: 'img/logo.svg', +href: 'https://pinot.apache.org/', + }, + copyright: `Copyright © ${new Date().getFullYear()} The Apache Software Foundation.`, +}, +googleAnalytics: { + // TODO + trackingID: 'TEMP', +}, +algolia: { + apiKey: 'f3cde09979e469ad62eaea4e115c21ea', + indexName: 'apache_pinot', + algoliaOptions: {}, // Optional, if provided by Algolia +}, Review comment: Will add this later. Search not required now.. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org
[GitHub] [incubator-pinot-site] fx19880617 closed pull request #22: Update document links to docs.pinot.apache.org
fx19880617 closed pull request #22: Update document links to docs.pinot.apache.org URL: https://github.com/apache/incubator-pinot-site/pull/22 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org
[incubator-pinot-site] 01/01: update documentation links
This is an automated email from the ASF dual-hosted git repository. xiangfu pushed a commit to branch update_doc_links_new in repository https://gitbox.apache.org/repos/asf/incubator-pinot-site.git commit 08ccf01e648191f1828de2db1354c44c6de7af4b Author: Xiang Fu AuthorDate: Mon Mar 30 00:26:08 2020 -0700 update documentation links --- content/404.html| 2 +- content/blog/hello-world/index.html | 2 +- content/blog/hola/index.html| 2 +- content/blog/index.html | 2 +- content/blog/tags/docusaurus/index.html | 2 +- content/blog/tags/facebook/index.html | 2 +- content/blog/tags/hello/index.html | 2 +- content/blog/tags/hola/index.html | 2 +- content/blog/tags/index.html| 2 +- content/blog/welcome/index.html | 2 +- content/c4f5d8e4.d784d618.js| 2 +- content/docs/about/features_of_pinot/index.html | 4 ++-- content/docs/about/index.html | 2 +- content/docs/about/what_is_pinot/index.html | 4 ++-- content/docs/about/who_use_pinot/index.html | 2 +- .../guides/troubleshooting/index.html | 2 +- content/docs/administration/index.html | 2 +- .../installation/cloud/aws/index.html | 4 ++-- .../installation/cloud/azure/index.html | 4 ++-- .../installation/cloud/gcp/index.html | 2 +- .../installation/cloud/on-premises/index.html | 4 ++-- .../installation/containers/docker/index.html | 4 ++-- .../installation/containers/index.html | 2 +- .../installation/operating-systems/macos/index.html | 2 +- .../operating-systems/ubuntu/index.html | 2 +- .../docs/administration/running_locally/index.html | 4 ++-- content/docs/components/broker/index.html | 2 +- content/docs/components/cluster/index.html | 2 +- content/docs/components/controller/index.html | 2 +- content/docs/components/index.html | 2 +- content/docs/components/minion/index.html | 2 +- content/docs/components/schema/index.html | 2 +- content/docs/components/segments/index.html | 2 +- content/docs/components/server/index.html | 2 +- content/docs/components/tables/index.html | 2 +- content/docs/components/tenants/index.html | 2 +- 
content/docs/concepts/index.html| 2 +- content/docs/concepts/pinot-architecture/index.html | 4 ++-- content/docs/how-to/index.html | 2 +- content/docs/misc/build-docker/index.html | 4 ++-- content/docs/misc/index.html| 2 +- content/docs/user-guide/clients/golang/index.html | 4 ++-- content/docs/user-guide/clients/java/index.html | 4 ++-- content/docs/user-guide/index.html | 2 +- content/docs/user-guide/pql/index.html | 4 ++-- content/docs/user-guide/query-pinot/index.html | 4 ++-- content/docs/user-guide/response-format/index.html | 4 ++-- .../docs/user-guide/rest-admin-interface/index.html | 2 +- {download => content/download}/index.html | 20 content/hello/index.html| 2 +- content/index.html | 2 +- content/main.076a6be3.js| 2 +- download/index.html | 20 index.html | 21 + 54 files changed, 90 insertions(+), 101 deletions(-) diff --git a/content/404.html b/content/404.html index 3ae9a6c..0690a49 100644 --- a/content/404.html +++ b/content/404.html @@ -30,7 +30,7 @@ !function(){function t(t){document.documentElement.setAttribute("data-theme",t)}var e=window.matchMedia("(prefers-color-scheme: dark)"),n=function(){var t=null;try{t=localStorage.getItem("theme")}catch(t){}return t}();null!==n?t(n):e.matches&&t("dark")}() -http://www.w3.org/2000/svg; width="30" height="30" viewBox="0 0 30 30" role="img" focusable="false">Menuhttp://www.w3.org/2000/svg; width="30" height="30" viewBox="0 0 30 30" role="img" focusable="false">Menu diff --git a/content/blog/hello-world/index.html b/content/blog/hello-world/index.html index ecffc11..283728f 100644 --- a/content/blog/hello-world/index.html +++ b/content/blog/hello-world/index.html @@ -40,7 +40,7 @@ !function(){function t(t){document.documentElement.setAttribute("data-theme",t)}var e=window.matchMedia("(prefers-color-scheme: dark)"),n=function(){var t=null;try{t=localStorage.getItem("theme")}catch(t){}return t}();null!==n?t(n):e.matches&&t("dark")}() -http://www.w3.org/2000/svg; width="30" height="30" viewBox="0 0 30 30" 
role="img"
[incubator-pinot-site] 01/01: update documentation links
This is an automated email from the ASF dual-hosted git repository. xiangfu pushed a commit to branch update_doc_links_new in repository https://gitbox.apache.org/repos/asf/incubator-pinot-site.git commit 23e5e11a541cbee872894e4a86e52af06f2cccae Author: Xiang Fu AuthorDate: Mon Mar 30 00:26:08 2020 -0700 update documentation links --- content/404.html| 2 +- content/blog/hello-world/index.html | 2 +- content/blog/hola/index.html| 2 +- content/blog/index.html | 2 +- content/blog/tags/docusaurus/index.html | 2 +- content/blog/tags/facebook/index.html | 2 +- content/blog/tags/hello/index.html | 2 +- content/blog/tags/hola/index.html | 2 +- content/blog/tags/index.html| 2 +- content/blog/welcome/index.html | 2 +- content/c4f5d8e4.d784d618.js| 2 +- content/docs/about/features_of_pinot/index.html | 4 ++-- content/docs/about/index.html | 2 +- content/docs/about/what_is_pinot/index.html | 4 ++-- content/docs/about/who_use_pinot/index.html | 2 +- .../guides/troubleshooting/index.html | 2 +- content/docs/administration/index.html | 2 +- .../installation/cloud/aws/index.html | 4 ++-- .../installation/cloud/azure/index.html | 4 ++-- .../installation/cloud/gcp/index.html | 2 +- .../installation/cloud/on-premises/index.html | 4 ++-- .../installation/containers/docker/index.html | 4 ++-- .../installation/containers/index.html | 2 +- .../installation/operating-systems/macos/index.html | 2 +- .../operating-systems/ubuntu/index.html | 2 +- .../docs/administration/running_locally/index.html | 4 ++-- content/docs/components/broker/index.html | 2 +- content/docs/components/cluster/index.html | 2 +- content/docs/components/controller/index.html | 2 +- content/docs/components/index.html | 2 +- content/docs/components/minion/index.html | 2 +- content/docs/components/schema/index.html | 2 +- content/docs/components/segments/index.html | 2 +- content/docs/components/server/index.html | 2 +- content/docs/components/tables/index.html | 2 +- content/docs/components/tenants/index.html | 2 +- 
content/docs/concepts/index.html| 2 +- content/docs/concepts/pinot-architecture/index.html | 4 ++-- content/docs/how-to/index.html | 2 +- content/docs/misc/build-docker/index.html | 4 ++-- content/docs/misc/index.html| 2 +- content/docs/user-guide/clients/golang/index.html | 4 ++-- content/docs/user-guide/clients/java/index.html | 4 ++-- content/docs/user-guide/index.html | 2 +- content/docs/user-guide/pql/index.html | 4 ++-- content/docs/user-guide/query-pinot/index.html | 4 ++-- content/docs/user-guide/response-format/index.html | 4 ++-- .../docs/user-guide/rest-admin-interface/index.html | 2 +- {download => content/download}/index.html | 20 content/hello/index.html| 2 +- content/index.html | 2 +- content/main.076a6be3.js| 2 +- download/index.html | 20 index.html | 21 + 54 files changed, 90 insertions(+), 101 deletions(-) diff --git a/content/404.html b/content/404.html index 3ae9a6c..e6bc9c2 100644 --- a/content/404.html +++ b/content/404.html @@ -30,7 +30,7 @@ !function(){function t(t){document.documentElement.setAttribute("data-theme",t)}var e=window.matchMedia("(prefers-color-scheme: dark)"),n=function(){var t=null;try{t=localStorage.getItem("theme")}catch(t){}return t}();null!==n?t(n):e.matches&&t("dark")}() -http://www.w3.org/2000/svg; width="30" height="30" viewBox="0 0 30 30" role="img" focusable="false">Menuhttp://www.w3.org/2000/svg; width="30" height="30" viewBox="0 0 30 30" role="img" focusable="false">Menu diff --git a/content/blog/hello-world/index.html b/content/blog/hello-world/index.html index ecffc11..91292f5 100644 --- a/content/blog/hello-world/index.html +++ b/content/blog/hello-world/index.html @@ -40,7 +40,7 @@ !function(){function t(t){document.documentElement.setAttribute("data-theme",t)}var e=window.matchMedia("(prefers-color-scheme: dark)"),n=function(){var t=null;try{t=localStorage.getItem("theme")}catch(t){}return t}();null!==n?t(n):e.matches&&t("dark")}() -http://www.w3.org/2000/svg; width="30" height="30" viewBox="0 0 30 30" 
role="img"
[incubator-pinot-site] branch update_doc_links_new created (now 08ccf01)
This is an automated email from the ASF dual-hosted git repository. xiangfu pushed a change to branch update_doc_links_new in repository https://gitbox.apache.org/repos/asf/incubator-pinot-site.git. at 08ccf01 update documentation links This branch includes the following new commits: new 08ccf01 update documentation links The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. - To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org
[incubator-pinot-site] branch update_doc_links_new updated (08ccf01 -> 23e5e11)
This is an automated email from the ASF dual-hosted git repository.

xiangfu pushed a change to branch update_doc_links_new
in repository https://gitbox.apache.org/repos/asf/incubator-pinot-site.git.

 discard 08ccf01  update documentation links
     new 23e5e11  update documentation links

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version. This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (08ccf01)
            \
             N -- N -- N   refs/heads/update_doc_links_new (23e5e11)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them. Any revisions marked "discard" are gone forever.

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.

Summary of changes: content/404.html| 2 +- content/blog/hello-world/index.html | 2 +- content/blog/hola/index.html| 2 +- content/blog/index.html | 2 +- content/blog/tags/docusaurus/index.html | 2 +- content/blog/tags/facebook/index.html | 2 +- content/blog/tags/hello/index.html | 2 +- content/blog/tags/hola/index.html | 2 +- content/blog/tags/index.html| 2 +- content/blog/welcome/index.html | 2 +- content/docs/about/features_of_pinot/index.html | 2 +- content/docs/about/index.html | 2 +- content/docs/about/what_is_pinot/index.html | 2 +- content/docs/about/who_use_pinot/index.html | 2 +- content/docs/administration/guides/troubleshooting/index.html | 2 +- content/docs/administration/index.html | 2 +- content/docs/administration/installation/cloud/aws/index.html | 2 +- content/docs/administration/installation/cloud/azure/index.html | 2 +- content/docs/administration/installation/cloud/gcp/index.html | 2 +- content/docs/administration/installation/cloud/on-premises/index.html | 2 +- content/docs/administration/installation/containers/docker/index.html | 2 +- content/docs/administration/installation/containers/index.html | 2 +- .../docs/administration/installation/operating-systems/macos/index.html | 2 +- .../administration/installation/operating-systems/ubuntu/index.html | 2 +- content/docs/administration/running_locally/index.html | 2 +- content/docs/components/broker/index.html | 2 +- content/docs/components/cluster/index.html | 2 +- content/docs/components/controller/index.html | 2 +- content/docs/components/index.html | 2 +- content/docs/components/minion/index.html | 2 +- content/docs/components/schema/index.html | 2 +- content/docs/components/segments/index.html | 2 +- content/docs/components/server/index.html | 2 +- content/docs/components/tables/index.html | 2 +- content/docs/components/tenants/index.html | 2 +- content/docs/concepts/index.html| 2 +- content/docs/concepts/pinot-architecture/index.html | 2 +- content/docs/how-to/index.html | 2 +- 
content/docs/misc/build-docker/index.html | 2 +- content/docs/misc/index.html| 2 +- content/docs/user-guide/clients/golang/index.html | 2 +- content/docs/user-guide/clients/java/index.html | 2 +- content/docs/user-guide/index.html | 2 +- content/docs/user-guide/pql/index.html | 2 +- content/docs/user-guide/query-pinot/index.html | 2 +- content/docs/user-guide/response-format/index.html | 2 +- content/docs/user-guide/rest-admin-interface/index.html
[GitHub] [incubator-pinot-site] fx19880617 opened a new pull request #23: update documentation links
fx19880617 opened a new pull request #23: update documentation links
URL: https://github.com/apache/incubator-pinot-site/pull/23

Updated old links for jira, blogs, etc
[GitHub] [incubator-pinot] siddharthteotia closed pull request #5075: Fix bug in DISTINCT for queries that return empty response
siddharthteotia closed pull request #5075: Fix bug in DISTINCT for queries that return empty response
URL: https://github.com/apache/incubator-pinot/pull/5075
[incubator-pinot-site] branch asf-site updated (242a3c2 -> 6c44fcb)
This is an automated email from the ASF dual-hosted git repository. xiangfu pushed a change to branch asf-site in repository https://gitbox.apache.org/repos/asf/incubator-pinot-site.git. from 242a3c2 Update index.html add 23e5e11 update documentation links new 6c44fcb Merge pull request #23 from apache/update_doc_links_new The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: content/404.html| 2 +- content/blog/hello-world/index.html | 2 +- content/blog/hola/index.html| 2 +- content/blog/index.html | 2 +- content/blog/tags/docusaurus/index.html | 2 +- content/blog/tags/facebook/index.html | 2 +- content/blog/tags/hello/index.html | 2 +- content/blog/tags/hola/index.html | 2 +- content/blog/tags/index.html| 2 +- content/blog/welcome/index.html | 2 +- content/c4f5d8e4.d784d618.js| 2 +- content/docs/about/features_of_pinot/index.html | 4 ++-- content/docs/about/index.html | 2 +- content/docs/about/what_is_pinot/index.html | 4 ++-- content/docs/about/who_use_pinot/index.html | 2 +- .../guides/troubleshooting/index.html | 2 +- content/docs/administration/index.html | 2 +- .../installation/cloud/aws/index.html | 4 ++-- .../installation/cloud/azure/index.html | 4 ++-- .../installation/cloud/gcp/index.html | 2 +- .../installation/cloud/on-premises/index.html | 4 ++-- .../installation/containers/docker/index.html | 4 ++-- .../installation/containers/index.html | 2 +- .../installation/operating-systems/macos/index.html | 2 +- .../operating-systems/ubuntu/index.html | 2 +- .../docs/administration/running_locally/index.html | 4 ++-- content/docs/components/broker/index.html | 2 +- content/docs/components/cluster/index.html | 2 +- content/docs/components/controller/index.html | 2 +- content/docs/components/index.html | 2 +- content/docs/components/minion/index.html | 2 +- 
content/docs/components/schema/index.html | 2 +- content/docs/components/segments/index.html | 2 +- content/docs/components/server/index.html | 2 +- content/docs/components/tables/index.html | 2 +- content/docs/components/tenants/index.html | 2 +- content/docs/concepts/index.html| 2 +- content/docs/concepts/pinot-architecture/index.html | 4 ++-- content/docs/how-to/index.html | 2 +- content/docs/misc/build-docker/index.html | 4 ++-- content/docs/misc/index.html| 2 +- content/docs/user-guide/clients/golang/index.html | 4 ++-- content/docs/user-guide/clients/java/index.html | 4 ++-- content/docs/user-guide/index.html | 2 +- content/docs/user-guide/pql/index.html | 4 ++-- content/docs/user-guide/query-pinot/index.html | 4 ++-- content/docs/user-guide/response-format/index.html | 4 ++-- .../docs/user-guide/rest-admin-interface/index.html | 2 +- {download => content/download}/index.html | 20 content/hello/index.html| 2 +- content/index.html | 2 +- content/main.076a6be3.js| 2 +- download/index.html | 20 index.html | 21 + 54 files changed, 90 insertions(+), 101 deletions(-) copy {download => content/download}/index.html (89%) - To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org
[incubator-pinot-site] 01/01: Merge pull request #23 from apache/update_doc_links_new
This is an automated email from the ASF dual-hosted git repository. xiangfu pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/incubator-pinot-site.git commit 6c44fcb382a39cdf8e34ac8796101d39a93fddda Merge: 242a3c2 23e5e11 Author: Xiang Fu AuthorDate: Mon Mar 30 10:19:26 2020 -0700 Merge pull request #23 from apache/update_doc_links_new update documentation links content/404.html| 2 +- content/blog/hello-world/index.html | 2 +- content/blog/hola/index.html| 2 +- content/blog/index.html | 2 +- content/blog/tags/docusaurus/index.html | 2 +- content/blog/tags/facebook/index.html | 2 +- content/blog/tags/hello/index.html | 2 +- content/blog/tags/hola/index.html | 2 +- content/blog/tags/index.html| 2 +- content/blog/welcome/index.html | 2 +- content/c4f5d8e4.d784d618.js| 2 +- content/docs/about/features_of_pinot/index.html | 4 ++-- content/docs/about/index.html | 2 +- content/docs/about/what_is_pinot/index.html | 4 ++-- content/docs/about/who_use_pinot/index.html | 2 +- .../guides/troubleshooting/index.html | 2 +- content/docs/administration/index.html | 2 +- .../installation/cloud/aws/index.html | 4 ++-- .../installation/cloud/azure/index.html | 4 ++-- .../installation/cloud/gcp/index.html | 2 +- .../installation/cloud/on-premises/index.html | 4 ++-- .../installation/containers/docker/index.html | 4 ++-- .../installation/containers/index.html | 2 +- .../installation/operating-systems/macos/index.html | 2 +- .../operating-systems/ubuntu/index.html | 2 +- .../docs/administration/running_locally/index.html | 4 ++-- content/docs/components/broker/index.html | 2 +- content/docs/components/cluster/index.html | 2 +- content/docs/components/controller/index.html | 2 +- content/docs/components/index.html | 2 +- content/docs/components/minion/index.html | 2 +- content/docs/components/schema/index.html | 2 +- content/docs/components/segments/index.html | 2 +- content/docs/components/server/index.html | 2 +- 
content/docs/components/tables/index.html | 2 +- content/docs/components/tenants/index.html | 2 +- content/docs/concepts/index.html| 2 +- content/docs/concepts/pinot-architecture/index.html | 4 ++-- content/docs/how-to/index.html | 2 +- content/docs/misc/build-docker/index.html | 4 ++-- content/docs/misc/index.html| 2 +- content/docs/user-guide/clients/golang/index.html | 4 ++-- content/docs/user-guide/clients/java/index.html | 4 ++-- content/docs/user-guide/index.html | 2 +- content/docs/user-guide/pql/index.html | 4 ++-- content/docs/user-guide/query-pinot/index.html | 4 ++-- content/docs/user-guide/response-format/index.html | 4 ++-- .../docs/user-guide/rest-admin-interface/index.html | 2 +- {download => content/download}/index.html | 20 content/hello/index.html| 2 +- content/index.html | 2 +- content/main.076a6be3.js| 2 +- download/index.html | 20 index.html | 21 + 54 files changed, 90 insertions(+), 101 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org
[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400346677

## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java ##
@@ -148,8 +153,12 @@ private MutableRoaringBitmap getPinotDocIds(MutableRoaringBitmap luceneDocIds) {
     try {
       while (luceneDocIDIterator.hasNext()) {
         int luceneDocId = luceneDocIDIterator.next();
-        Document document = _indexSearcher.doc(luceneDocId);
-        int pinotDocId = Integer.valueOf(document.get(LuceneTextIndexCreator.LUCENE_INDEX_DOC_ID_COLUMN_NAME));
+        Integer pinotDocId = _luceneDocIDToPinotDocIDCache.get(luceneDocId);

Review comment:
I am not using this method anymore. The mapping is built once during segment load and memory mapped.
[GitHub] [incubator-pinot-site] fx19880617 merged pull request #23: update documentation links
fx19880617 merged pull request #23: update documentation links
URL: https://github.com/apache/incubator-pinot-site/pull/23
[GitHub] [incubator-pinot] siddharthteotia commented on issue #5177: Lucene DocId to PinotDocId cache
siddharthteotia commented on issue #5177: Lucene DocId to PinotDocId cache
URL: https://github.com/apache/incubator-pinot/pull/5177#issuecomment-606128156

@Jackie-Jiang, @kishoreg please take a look at this. I'd like to provide a new build for ongoing internal perf testing asap.
[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400358308

## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java ##
@@ -148,8 +153,12 @@ private MutableRoaringBitmap getPinotDocIds(MutableRoaringBitmap luceneDocIds) {
     try {
       while (luceneDocIDIterator.hasNext()) {
         int luceneDocId = luceneDocIDIterator.next();
-        Document document = _indexSearcher.doc(luceneDocId);
-        int pinotDocId = Integer.valueOf(document.get(LuceneTextIndexCreator.LUCENE_INDEX_DOC_ID_COLUMN_NAME));
+        Integer pinotDocId = _luceneDocIDToPinotDocIDCache.get(luceneDocId);

Review comment:
Also added a TODO to consider building mapping on-the-fly during query processing -- we can have two methods and make it configurable based on number of docs per segment.
[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400359941

## File path: pinot-core/src/test/java/org/apache/pinot/queries/TestTextSearchQueries.java ##
@@ -89,7 +91,7 @@
   private RecordReader _recordReader;
   Schema _schema;
-  private List _indexSegments = new ArrayList<>(1);
+  private static List _indexSegments = new ArrayList<>(1);

Review comment:
Revert the change in this file? Also, I suggest renaming the test to TextSearchQueriesTest to follow the naming convention.
[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400360304

## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java ##
@@ -51,6 +54,9 @@
   private final IndexSearcher _indexSearcher;
   private final QueryParser _queryParser;
   private final String _column;
+  private DocIdReaderWriter _docIdReaderWriter;

Review comment:
`private final`
[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400371993

## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java ##
@@ -169,5 +178,49 @@ public void close() throws IOException {
     _indexReader.close();
     _indexDirectory.close();
+    _docIdReaderWriter.close();
+  }
+
+  private class DocIdReaderWriter implements Closeable {
+    private PinotDataBuffer _buffer;
+
+    DocIdReaderWriter(File segmentIndexDir, String column, int numDocs) throws Exception {
+      int length = Integer.BYTES * numDocs;
+      File docIdMappingFile = new File(SegmentDirectoryPaths.findSegmentDirectory(segmentIndexDir),
+          column + LUCENE_TEXT_INDEX_DOCID_MAPPING_FILE_EXTENSION);
+      // For newly added segments, this file will not exist.
+      // For segment refresh, segment reload and server restart, file will exist,
+      // but we don't know if we are here for refresh v/s reload v/s restart.
+      // In case of refresh, we have to build the mapping again, but in case of
+      // reload and restart, we don't. Also, reload has a sub-case where this text index
+      // was indeed created during reload (user enabled on existing or newly added column).
+      // Since there is no way to distinguish why we are here, we build the mapping again
+      // regardless.
+      // TODO: see if we can prefetch the pages
+      _buffer =
+          PinotDataBuffer.mapFile(docIdMappingFile, false, 0, length, ByteOrder.BIG_ENDIAN, getClass().getSimpleName());

Review comment:
Please include the column name into the description (last argument) to distinguish different columns
[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400369668

## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java ##
@@ -169,5 +178,49 @@ public void close() throws IOException {
     _indexReader.close();
     _indexDirectory.close();
+    _docIdReaderWriter.close();
+  }
+
+  private class DocIdReaderWriter implements Closeable {
+    private PinotDataBuffer _buffer;
+
+    DocIdReaderWriter(File segmentIndexDir, String column, int numDocs) throws Exception {
+      int length = Integer.BYTES * numDocs;
+      File docIdMappingFile = new File(SegmentDirectoryPaths.findSegmentDirectory(segmentIndexDir),
+          column + LUCENE_TEXT_INDEX_DOCID_MAPPING_FILE_EXTENSION);
+      // For newly added segments, this file will not exist.
+      // For segment refresh, segment reload and server restart, file will exist,
+      // but we don't know if we are here for refresh v/s reload v/s restart.
+      // In case of refresh, we have to build the mapping again, but in case of

Review comment:
For segment refresh, this file should not exist as we delete the old segment
[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400367374

## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java ##
@@ -70,6 +76,10 @@ public LuceneTextIndexReader(String column, File segmentIndexDir) {
       // Disable Lucene query result cache. While it helps a lot with performance for
       // repeated queries, on the downside it cause heap issues.
       _indexSearcher.setQueryCache(null);
+      // TODO: consider using a threshold of num docs per segment to decide between building
+      // mapping file upfront on segment load v/s on-the-fly during query processing
+      _docIdReaderWriter = new DocIdReaderWriter(segmentIndexDir, _column, numDocs);
+      _docIdReaderWriter.buildDocIdMapping(numDocs);

Review comment:
We build the mapping every time we load the index? You saved the mapping into a file right?
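
The question in this review comment (why rebuild a mapping that is already persisted?) can be illustrated with a small sketch. It is not the code from the PR: it uses plain `java.nio` instead of `PinotDataBuffer`, the class `PersistedDocIdMapping` and its methods are hypothetical, and the "file exists with the expected length" check is only an assumed way to decide whether a previously built mapping can be reused. The `lookup` function stands in for reading the Pinot doc id out of a Lucene document.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.util.function.IntUnaryOperator;

// Hypothetical sketch: build the docId mapping file only when it is missing
// or has the wrong size; otherwise reuse what is already on disk.
final class PersistedDocIdMapping {

  // Assumed reuse criterion: one 4-byte int per document, nothing more.
  static boolean isReusable(File mappingFile, int numDocs) {
    return mappingFile.isFile() && mappingFile.length() == (long) Integer.BYTES * numDocs;
  }

  // Returns true if the mapping was (re)built, false if the existing file was reused.
  static boolean ensureBuilt(File mappingFile, int numDocs, IntUnaryOperator lookup) {
    if (isReusable(mappingFile, numDocs)) {
      return false;
    }
    long length = (long) Integer.BYTES * numDocs;
    try (RandomAccessFile raf = new RandomAccessFile(mappingFile, "rw");
        FileChannel channel = raf.getChannel()) {
      raf.setLength(length);
      MappedByteBuffer buffer = channel.map(FileChannel.MapMode.READ_WRITE, 0, length);
      for (int luceneDocId = 0; luceneDocId < numDocs; luceneDocId++) {
        buffer.putInt(luceneDocId * Integer.BYTES, lookup.applyAsInt(luceneDocId));
      }
      buffer.force(); // flush the mapped pages to the file
      return true;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Reads one entry back; the mapped buffer above uses the default big-endian
  // order, which matches RandomAccessFile.readInt().
  static int get(File mappingFile, int luceneDocId) {
    try (RandomAccessFile raf = new RandomAccessFile(mappingFile, "r")) {
      raf.seek((long) luceneDocId * Integer.BYTES);
      return raf.readInt();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    File file = new File(System.getProperty("java.io.tmpdir"), "docid-mapping-demo.tmp");
    file.delete();
    System.out.println("built on first load:  " + ensureBuilt(file, 4, i -> 100 + i));
    System.out.println("built on second load: " + ensureBuilt(file, 4, i -> 100 + i));
    file.delete();
  }
}
```

A length check alone cannot detect a build that was interrupted after the file was sized, so a real implementation would likely write to a temporary file and rename it, or store a marker once the build completes.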
[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400372543

## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java
## @@ -169,5 +178,49 @@ public void close() throws IOException {
     _indexReader.close();
     _indexDirectory.close();
+    _docIdReaderWriter.close();
+  }
+
+  private class DocIdReaderWriter implements Closeable {
+    private PinotDataBuffer _buffer;
+
+    DocIdReaderWriter(File segmentIndexDir, String column, int numDocs) throws Exception {
+      int length = Integer.BYTES * numDocs;
+      File docIdMappingFile = new File(SegmentDirectoryPaths.findSegmentDirectory(segmentIndexDir),
+          column + LUCENE_TEXT_INDEX_DOCID_MAPPING_FILE_EXTENSION);
+      // For newly added segments, this file will not exist.
+      // For segment refresh, segment reload and server restart, file will exist,
+      // but we don't know if we are here for refresh v/s reload v/s restart.
+      // In case of refresh, we have to build the mapping again, but in case of
+      // reload and restart, we don't. Also, reload has a sub-case where this text index
+      // was indeed created during reload (user enabled on existing or newly added column).
+      // Since there is no way to distinguish why we are here, we build the mapping again
+      // regardless.
+      // TODO: see if we can prefetch the pages
+      _buffer =
+          PinotDataBuffer.mapFile(docIdMappingFile, false, 0, length, ByteOrder.BIG_ENDIAN, getClass().getSimpleName());
+    }
+
+    public void buildDocIdMapping(int numDocs) {
+      for (int i = 0; i < numDocs; i++) {
+        try {
+          Document document = _indexSearcher.doc(i);
+          int pinotDocId = Integer.valueOf(document.get(LuceneTextIndexCreator.LUCENE_INDEX_DOC_ID_COLUMN_NAME));
+          _buffer.putInt(i * Integer.BYTES, pinotDocId);
+        } catch (Exception e) {
+          LOGGER.error("Failed to build doc id mapping during segment load for column:{},docID:{},error:{}. Will continue and build mapping on the fly",

Review comment:
Throw this exception instead of logging an ERROR. If this step fails, the JVM will crash when reading the buffer.
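The review comment above asks for fail-fast behavior while the mapping is built. A minimal sketch of that idea follows, using a plain `java.nio.ByteBuffer` as a stand-in for `PinotDataBuffer`. The class `DocIdMappingSketch`, its method names, and the `String[]` input are hypothetical illustrations, not code from the pull request.

```java
import java.nio.ByteBuffer;

public class DocIdMappingSketch {
  // Hypothetical stand-in for the mapping build: storedDocIds[i] plays the role of
  // the Pinot docId string stored in Lucene document i.
  public static ByteBuffer buildDocIdMapping(String[] storedDocIds) {
    ByteBuffer buffer = ByteBuffer.allocate(Integer.BYTES * storedDocIds.length);
    for (int i = 0; i < storedDocIds.length; i++) {
      try {
        int pinotDocId = Integer.parseInt(storedDocIds[i]);
        buffer.putInt(i * Integer.BYTES, pinotDocId);
      } catch (Exception e) {
        // Propagate instead of logging and continuing, so that no caller
        // ever reads from a partially built mapping.
        throw new RuntimeException("Failed to build doc id mapping for luceneDocId: " + i, e);
      }
    }
    return buffer;
  }

  // Looks up the Pinot docId recorded for the given Lucene docId.
  public static int getPinotDocId(ByteBuffer buffer, int luceneDocId) {
    return buffer.getInt(luceneDocId * Integer.BYTES);
  }

  public static void main(String[] args) {
    ByteBuffer mapping = buildDocIdMapping(new String[]{"7", "3", "5"});
    System.out.println(getPinotDocId(mapping, 1));
    try {
      buildDocIdMapping(new String[]{"7", "oops"});
    } catch (RuntimeException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

In this sketch a single bad entry aborts the whole build, which is the trade-off the reviewer prefers over continuing with an incomplete buffer.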
[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
Jackie-Jiang commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400370429

## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/text/LuceneTextIndexReader.java
## @@ -169,5 +178,49 @@ public void close() throws IOException {
     _indexReader.close();
     _indexDirectory.close();
+    _docIdReaderWriter.close();
+  }
+
+  private class DocIdReaderWriter implements Closeable {
+    private PinotDataBuffer _buffer;
+
+    DocIdReaderWriter(File segmentIndexDir, String column, int numDocs) throws Exception {
+      int length = Integer.BYTES * numDocs;
+      File docIdMappingFile = new File(SegmentDirectoryPaths.findSegmentDirectory(segmentIndexDir),
+          column + LUCENE_TEXT_INDEX_DOCID_MAPPING_FILE_EXTENSION);
+      // For newly added segments, this file will not exist.
+      // For segment refresh, segment reload and server restart, file will exist,
+      // but we don't know if we are here for refresh v/s reload v/s restart.
+      // In case of refresh, we have to build the mapping again, but in case of
+      // reload and restart, we don't. Also, reload has a sub-case where this text index
+      // was indeed created during reload (user enabled on existing or newly added column).
+      // Since there is no way to distinguish why we are here, we build the mapping again
+      // regardless.
+      // TODO: see if we can prefetch the pages
+      _buffer =
+          PinotDataBuffer.mapFile(docIdMappingFile, false, 0, length, ByteOrder.BIG_ENDIAN, getClass().getSimpleName());

Review comment:
For better performance, I would probably choose native order as this index is always local to one instance?
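The suggestion above is to use the platform's native byte order for a buffer that is only ever read on the machine that wrote it. A minimal sketch of what that means follows, using the standard `java.nio.ByteOrder.nativeOrder()` with a direct `ByteBuffer` in place of `PinotDataBuffer`. The class `NativeOrderSketch` and its methods are hypothetical, and any actual performance gain is an assumption that depends on the platform.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class NativeOrderSketch {
  // Writes one int per docId into a direct buffer using the requested byte order.
  public static ByteBuffer writeDocIds(int[] docIds, ByteOrder order) {
    ByteBuffer buffer = ByteBuffer.allocateDirect(Integer.BYTES * docIds.length).order(order);
    for (int i = 0; i < docIds.length; i++) {
      buffer.putInt(i * Integer.BYTES, docIds[i]);
    }
    return buffer;
  }

  // Reads the int at the given index using the buffer's current byte order.
  public static int readDocId(ByteBuffer buffer, int index) {
    return buffer.getInt(index * Integer.BYTES);
  }

  public static void main(String[] args) {
    ByteOrder nativeOrder = ByteOrder.nativeOrder();
    ByteBuffer buffer = writeDocIds(new int[]{1, 2, 3}, nativeOrder);
    System.out.println("native order: " + nativeOrder);
    System.out.println("value at index 2: " + readDocId(buffer, 2));

    // Reading the same bytes with the opposite order yields byte-swapped values,
    // which is why a non-native order is only needed when files move between machines.
    ByteOrder other = nativeOrder == ByteOrder.BIG_ENDIAN ? ByteOrder.LITTLE_ENDIAN : ByteOrder.BIG_ENDIAN;
    ByteBuffer swappedView = buffer.duplicate().order(other);
    System.out.println("value at index 0 in the other order: " + readDocId(swappedView, 0));
  }
}
```

Because the writer and reader agree on the order, values round-trip unchanged; the fixed `BIG_ENDIAN` choice in the diff matters only if the mapping file were shared across machines with different endianness.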
[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
siddharthteotia commented on a change in pull request #5177: Lucene DocId to PinotDocId cache
URL: https://github.com/apache/incubator-pinot/pull/5177#discussion_r400381863

## File path: pinot-core/src/test/java/org/apache/pinot/queries/TestTextSearchQueries.java
## @@ -89,7 +91,7 @@
   private RecordReader _recordReader;
   Schema _schema;
-  private List _indexSegments = new ArrayList<>(1);
+  private static List _indexSegments = new ArrayList<>(1);

Review comment:
oops. Sorry, this was due to local debugging. Will remove and also reformat
[GitHub] [incubator-pinot-site] fx19880617 opened a new pull request #25: Move assets to make download page work
fx19880617 opened a new pull request #25: Move assets to make download page work URL: https://github.com/apache/incubator-pinot-site/pull/25
[incubator-pinot-site] branch adding_download_link created (now 980ffed)
xiangfu pushed a change to branch adding_download_link in repository https://gitbox.apache.org/repos/asf/incubator-pinot-site.git.

      at 980ffed Move assets to make download page work

This branch includes the following new commits:

     new 980ffed Move assets to make download page work

The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference.
[incubator-pinot-site] branch adding_download_link updated (980ffed -> 95f3ddd)
xiangfu pushed a change to branch adding_download_link in repository https://gitbox.apache.org/repos/asf/incubator-pinot-site.git.

 discard 980ffed Move assets to make download page work
     add d395e83 Merge pull request #24 from apache/adding_download_link
     new 95f3ddd Move assets to make download page work

This update added new revisions after undoing existing revisions. That is to say, some revisions that were in the old version of the branch are not in the new version. This situation occurs when a user --force pushes a change and generates a repository containing something like this:

 * -- * -- B -- O -- O -- O   (980ffed)
            \
             N -- N -- N   refs/heads/adding_download_link (95f3ddd)

You should already have received notification emails for all of the O revisions, and so the following emails describe only the N revisions from the common base, B.

Any revisions marked "omit" are not gone; other references still refer to them. Any revisions marked "discard" are gone forever.

The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference.

Summary of changes:
[incubator-pinot] branch jfrog-bintray updated (2e991ef -> 86c4c17)
jlli pushed a change to branch jfrog-bintray in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git.

 discard 2e991ef Finalize the PR
 discard 6966dfd Add logic
    omit 90c1e82 Fix pom files
    omit 56500c7 Change versions before deploying to bintray
    omit e31571e Add relative path
    omit 8d7d03d Testing
    omit 26beead Fix the build
    omit 02352ce Deploy pinot to bintray
     add 48fb505 Change readme link to gitbook for kafka plugins readme. (#5191)
     add f1e2086 [TE] frontend - harleyjj/validation - surface errors in dom for create and edit alert (#5187)
     add 8f0ed55 Update travis scripts to test quickstart over jdk 10-15 (#5182)
     add 772f51e [TE] frontend - harleyjj/alert-details - show bounds for minute granularity again (#5192)
     add 1f1baf8 Adding missing license files for jquery-requestAnimationFrame and jquery-sizzle, requested in Issue #5183 (#5195)
     add 00fcb1d Move table config into pinot-spi (#5194)
     add ab32e0b Deploy pinot to bintray
     add e403e67 Fix the build
     add 06c34c3 Testing
     add ef10234 Add relative path
     add 83eadbb Change versions before deploying to bintray
     add 44ee854 Fix pom files
     add f36c66f Add logic
     add ead2ca6 Finalize the PR
     add 86c4c17 Rebase with master branch

This update added new revisions after undoing existing revisions. That is to say, some revisions that were in the old version of the branch are not in the new version. This situation occurs when a user --force pushes a change and generates a repository containing something like this:

 * -- * -- B -- O -- O -- O   (2e991ef)
            \
             N -- N -- N   refs/heads/jfrog-bintray (86c4c17)

You should already have received notification emails for all of the O revisions, and so the following emails describe only the N revisions from the common base, B.

Any revisions marked "omit" are not gone; other references still refer to them. Any revisions marked "discard" are gone forever.

No new revisions were added by this update.

Summary of changes:
 .gitignore | 1 +
 .travis.yml | 49 +-
 .ci.settings.xml => .travis/.ci.settings.xml | 0
 .travis_install.sh => .travis/.travis_install.sh | 3 +
 .../.travis_nightly_build.sh | 2 +-
 .travis/.travis_quickstart.sh | 136
 .../.travis_quickstart_openjdk.sh | 23 +-
 .../.travis_set_deploy_build_opts.sh | 0
 .travis_test.sh => .travis/.travis_test.sh | 0
 .travis_quickstart.sh | 125
 .../LICENSE-jquery-requestAnimationFrame.txt | 22 +
 licenses-binary/LICENSE-jquery-sizzle.txt | 36 +
 licenses/LICENSE-jquery-requestAnimationFrame.txt | 22 +
 licenses/LICENSE-jquery-sizzle.txt | 36 +
 .../broker/api/resources/PinotBrokerDebug.java | 4 +-
 ...okerResourceOnlineOfflineStateModelFactory.java | 2 +-
 .../broker/broker/helix/HelixBrokerStarter.java | 2 +-
 .../HelixExternalViewBasedQueryQuotaManager.java | 54 +-
 .../requesthandler/BaseBrokerRequestHandler.java | 9 +-
 .../SingleConnectionBrokerRequestHandler.java | 2 +-
 .../pinot/broker/routing/RoutingManager.java | 6 +-
 .../instanceselector/InstanceSelectorFactory.java | 6 +-
 .../segmentpruner/SegmentPrunerFactory.java | 10 +-
 .../segmentselector/SegmentSelectorFactory.java | 5 +-
 .../routing/timeboundary/TimeBoundaryManager.java | 4 +-
 .../broker/broker/HelixBrokerStarterTest.java | 15 +-
 ...elixExternalViewBasedQueryQuotaManagerTest.java | 107 ++-
 .../instanceselector/InstanceSelectorTest.java | 10 +-
 .../routing/segmentpruner/SegmentPrunerTest.java | 12 +-
 .../segmentselector/SegmentSelectorTest.java | 4 +-
 .../timeboundary/TimeBoundaryManagerTest.java | 9 +-
 .../InstanceAssignmentConfigUtils.java | 17 +-
 .../common/assignment/InstancePartitionsUtils.java | 9 +-
 .../apache/pinot/common/config/TableConfig.java | 733 -
 .../common/config/TextIndexConfigValidator.java | 46 --
 .../pinot/common/metadata/ZKMetadataProvider.java | 19 +-
 .../metadata/instance/InstanceZKMetadata.java | 2 +-
 .../apache/pinot/common/utils/CommonConstants.java | 15 -
 .../pinot/common/utils/config/InstanceUtils.java | 80 +++
 .../common/utils/config/TableConfigUtils.java | 192 ++
 .../common/{ => utils}/config/TagNameUtils.java | 12 +-
 .../pinot/common/utils/helix/HelixHelper.java | 2 +-
 .../pinot/common/utils/helix/TableCache.java | 9 +-
 .../pinot/common/config/QuotaConfigTest.java | 120
 .../pinot/common/config/TableConfigTest.java | 409
 .../apache/pinot/common/utils/DataSizeTest.java | 45 --
[GitHub] [incubator-pinot-site] fx19880617 opened a new pull request #24: Adding Download link into menu bar
fx19880617 opened a new pull request #24: Adding Download link into menu bar
URL: https://github.com/apache/incubator-pinot-site/pull/24

- Adding Download link into menu bar
- Fixing assets link in download page
[incubator-pinot-site] 01/01: Adding Download link into menu bar
xiangfu pushed a commit to branch adding_download_link in repository https://gitbox.apache.org/repos/asf/incubator-pinot-site.git

commit 2bfb77cdae4bbcada3beacdeb9e82e3ab23436db
Author: Xiang Fu
AuthorDate: Mon Mar 30 11:13:17 2020 -0700

    Adding Download link into menu bar
---
 content/404.html | 2 +-
 content/blog/hello-world/index.html | 2 +-
 content/blog/hola/index.html | 2 +-
 content/blog/index.html | 2 +-
 content/blog/tags/docusaurus/index.html | 2 +-
 content/blog/tags/facebook/index.html | 2 +-
 content/blog/tags/hello/index.html | 2 +-
 content/blog/tags/hola/index.html | 2 +-
 content/blog/tags/index.html | 2 +-
 content/blog/welcome/index.html | 2 +-
 content/docs/about/features_of_pinot/index.html | 2 +-
 content/docs/about/index.html | 2 +-
 content/docs/about/what_is_pinot/index.html | 2 +-
 content/docs/about/who_use_pinot/index.html | 2 +-
 content/docs/administration/guides/troubleshooting/index.html | 2 +-
 content/docs/administration/index.html | 2 +-
 content/docs/administration/installation/cloud/aws/index.html | 2 +-
 content/docs/administration/installation/cloud/azure/index.html | 2 +-
 content/docs/administration/installation/cloud/gcp/index.html | 2 +-
 .../docs/administration/installation/cloud/on-premises/index.html | 2 +-
 .../docs/administration/installation/containers/docker/index.html | 2 +-
 content/docs/administration/installation/containers/index.html | 2 +-
 .../installation/operating-systems/macos/index.html | 2 +-
 .../installation/operating-systems/ubuntu/index.html | 2 +-
 content/docs/administration/running_locally/index.html | 2 +-
 content/docs/components/broker/index.html | 2 +-
 content/docs/components/cluster/index.html | 2 +-
 content/docs/components/controller/index.html | 2 +-
 content/docs/components/index.html | 2 +-
 content/docs/components/minion/index.html | 2 +-
 content/docs/components/schema/index.html | 2 +-
 content/docs/components/segments/index.html | 2 +-
 content/docs/components/server/index.html | 2 +-
 content/docs/components/tables/index.html | 2 +-
 content/docs/components/tenants/index.html | 2 +-
 content/docs/concepts/index.html | 2 +-
 content/docs/concepts/pinot-architecture/index.html | 2 +-
 content/docs/how-to/index.html | 2 +-
 content/docs/misc/build-docker/index.html | 2 +-
 content/docs/misc/index.html | 2 +-
 content/docs/user-guide/clients/golang/index.html | 2 +-
 content/docs/user-guide/clients/java/index.html | 2 +-
 content/docs/user-guide/index.html | 2 +-
 content/docs/user-guide/pql/index.html | 2 +-
 content/docs/user-guide/query-pinot/index.html | 2 +-
 content/docs/user-guide/response-format/index.html | 2 +-
 content/docs/user-guide/rest-admin-interface/index.html | 2 +-
 content/download/index.html | 8
 content/hello/index.html | 2 +-
 content/index.html | 2 +-
 content/main.076a6be3.js | 2 +-
 index.html | 2 --
 52 files changed, 54 insertions(+), 56 deletions(-)

diff --git a/content/404.html b/content/404.html
index e6bc9c2..e874728 100644
--- a/content/404.html
+++ b/content/404.html
@@ -30,7 +30,7 @@
 !function(){function t(t){document.documentElement.setAttribute("data-theme",t)}var e=window.matchMedia("(prefers-color-scheme: dark)"),n=function(){var t=null;try{t=localStorage.getItem("theme")}catch(t){}return t}();null!==n?t(n):e.matches&&t("dark")}()
-http://www.w3.org/2000/svg; width="30" height="30" viewBox="0 0 30 30" role="img" focusable="false">Menuhttp://www.w3.org/2000/svg; width="30" height="30" viewBox="0 0 30 30" role="img" focusable="false">Menu
diff --git a/content/blog/hello-world/index.html b/content/blog/hello-world/index.html
index 91292f5..b677722
[incubator-pinot-site] branch adding_download_link created (now 2bfb77c)
xiangfu pushed a change to branch adding_download_link in repository https://gitbox.apache.org/repos/asf/incubator-pinot-site.git.

      at 2bfb77c Adding Download link into menu bar

This branch includes the following new commits:

     new 2bfb77c Adding Download link into menu bar

The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference.