uchenily commented on code in PR #56954:
URL: https://github.com/apache/doris/pull/56954#discussion_r2431946797
##########
be/src/olap/rowset/segment_v2/ann_index/faiss_ann_index.cpp:
##########
@@ -151,11 +151,24 @@ struct FaissIndexReader : faiss::IOReader {
lucene::store::IndexInput* _input = nullptr;
};
-void FaissVectorIndex::train(vectorized::Int64 n, const float* x) {
+doris::Status FaissVectorIndex::train(vectorized::Int64 n, const float* x) {
DCHECK(x != nullptr);
DCHECK(_index != nullptr);
+
+ // For PQ index, check if we have enough training data
+ if (_params.quantizer == FaissBuildParameter::Quantizer::PQ) {
+ int k = 1 << _params.pq_nbits;
+ if (n < k) {
+ // Not enough training data
+ LOG(WARNING) << "Not enough training data for PQ index: " << n << " < " << k;
+ return doris::Status::RuntimeError("Not enough training data for PQ index");
+ }
+ }
+
omp_set_num_threads(config::omp_threads_limit);
_index->train(n, x);
Review Comment:
Yes, both the train and add functions may throw exceptions. So far, though,
the only case that actually throws is when there is not enough training data
for the PQ index.
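
For illustration, here is a minimal standalone sketch of how the size check and a catch-all for the remaining exceptions could be combined so that callers only ever see a status value. The `Status` class and `MockPqIndex` below are hypothetical stand-ins written for this example; they are not the real `doris::Status` or the faiss index, and `train_checked` is an assumed helper name, not code from this PR.

```cpp
#include <cstdint>
#include <exception>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for doris::Status, for illustration only.
class Status {
public:
    static Status OK() { return Status(true, ""); }
    static Status RuntimeError(std::string msg) { return Status(false, std::move(msg)); }
    bool ok() const { return _ok; }
    const std::string& message() const { return _msg; }

private:
    Status(bool ok, std::string msg) : _ok(ok), _msg(std::move(msg)) {}
    bool _ok;
    std::string _msg;
};

// Hypothetical stand-in for the underlying index. It throws when given fewer
// than 2^nbits vectors, mimicking the case described in the comment above.
struct MockPqIndex {
    int nbits = 8;
    bool trained = false;

    void train(std::int64_t n, const float* /*x*/) {
        if (n < (std::int64_t{1} << nbits)) {
            throw std::runtime_error("too few training points");
        }
        trained = true;
    }
};

// Check the known precondition up front, then convert any other exception
// from train() into an error status instead of letting it propagate.
Status train_checked(MockPqIndex& index, std::int64_t n, const float* x) {
    const std::int64_t k = std::int64_t{1} << index.nbits;
    if (n < k) {
        return Status::RuntimeError("Not enough training data for PQ index: " +
                                    std::to_string(n) + " < " + std::to_string(k));
    }
    try {
        index.train(n, x);
    } catch (const std::exception& e) {
        return Status::RuntimeError(std::string("train failed: ") + e.what());
    }
    return Status::OK();
}
```

With `nbits = 8` the threshold is 256 vectors, so a call with 100 vectors returns an error status without reaching `train`, while a call with 256 vectors succeeds. Whether the extra try/catch is worth it depends on how likely the other throw sites are in practice.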
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]