Hi, I am trying to use NaiveBayesClassifier in my Solr project, and I am currently looking at its test case, ClassificationTestBase.java.
From the code below, it looks like the classifier reads the whole index to train the model every time an input document is classified. Or am I misunderstanding something here? If I had a large index, would this hurt performance?

    protected void checkCorrectClassification(Classifier<T> classifier, String inputDoc,
                                              T expectedResult, Analyzer analyzer,
                                              String textFieldName, String classFieldName,
                                              Query query) throws Exception {
      AtomicReader atomicReader = null;
      try {
        populateSampleIndex(analyzer);
        atomicReader = SlowCompositeReaderWrapper.wrap(indexWriter.getReader());
        // Training reads from the index through atomicReader
        classifier.train(atomicReader, textFieldName, classFieldName, analyzer, query);
        // Classification of the single input document happens here
        ClassificationResult<T> classificationResult = classifier.assignClass(inputDoc);
        assertNotNull(classificationResult.getAssignedClass());
        assertEquals("got an assigned class of " + classificationResult.getAssignedClass(),
            expectedResult, classificationResult.getAssignedClass());
        assertTrue("got a not positive score " + classificationResult.getScore(),
            classificationResult.getScore() > 0);
      } finally {
        if (atomicReader != null) {
          atomicReader.close();
        }
      }
    }
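
To make the question concrete, here is a minimal sketch of the usage pattern I would hope for: one training pass up front, then many cheap classification calls. ToyClassifier, its word-overlap scoring, and the sample documents are hypothetical stand-ins I made up for illustration; this is not the Lucene API and not Naive Bayes. Whether the real classifier can be reused this way after a single train(...) call is exactly what I am asking.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names come from Lucene.
public class TrainOnceSketch {

    /** Toy classifier: remembers how often each word occurs per class. */
    static class ToyClassifier {
        private final Map<String, Map<String, Integer>> wordCountsByClass = new HashMap<>();
        int trainCalls = 0; // counts full passes over the training data

        /** Expensive step: one full pass over every labelled document. */
        void train(List<String[]> labelledDocs) {
            trainCalls++;
            wordCountsByClass.clear();
            for (String[] doc : labelledDocs) {
                // doc[0] is the class label, doc[1] is the text
                Map<String, Integer> counts =
                    wordCountsByClass.computeIfAbsent(doc[0], k -> new HashMap<>());
                for (String word : doc[1].toLowerCase().split("\\s+")) {
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }

        /** Cheap step: picks the class whose training words overlap most with the input. */
        String assignClass(String text) {
            String best = null;
            int bestScore = -1;
            for (Map.Entry<String, Map<String, Integer>> e : wordCountsByClass.entrySet()) {
                int score = 0;
                for (String word : text.toLowerCase().split("\\s+")) {
                    score += e.getValue().getOrDefault(word, 0);
                }
                if (score > bestScore) {
                    bestScore = score;
                    best = e.getKey();
                }
            }
            return best;
        }
    }

    public static void main(String[] args) {
        List<String[]> sampleIndex = List.of(
            new String[] {"technology", "solr lucene search index"},
            new String[] {"politics", "election vote parliament"});

        ToyClassifier classifier = new ToyClassifier();
        classifier.train(sampleIndex); // train once, outside the loop

        for (String input : List.of("lucene index tuning", "vote counting")) {
            System.out.println(input + " -> " + classifier.assignClass(input));
        }
        System.out.println("train calls: " + classifier.trainCalls);
    }
}
```

In the test above, train(...) and assignClass(...) sit in the same method, so every call to checkCorrectClassification retrains. In the sketch, the training call is hoisted out of the loop so its cost is paid only once.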