Hi guys!
We are building a system using only Lucene (no MySQL). The system
analyzes A LOT of input files and inserts a dozen types of records
into the index. Each record has at least a PK and a TYPE field, and
together these make it unique. Our problem is that the same record can
appear in more than one file, so each time we need to insert a document
we first have to check whether the record is already in the index
(using PK and TYPE). This check is VERY slow (the way I did it, at
least). Here's the code I use to search the index to verify whether my
record is already there:
//*****
// Prepare filters
//*****
// Initialize
Filter *cluCF = NULL;
Filter *cluFilters[3]; // 2+1
// Type
strcpy(sFieldName, "TYPE");
strcpy(sFieldValue, sTmpType);
STRCPY_AtoT(tField, sFieldName, sizeof(sFieldName));
STRCPY_AtoT(tValue, sFieldValue, sizeof(sFieldValue));
cluFilters[0] = _CLNEW QueryFilter(QueryParser::parse(tValue,
tField, &cluKwdAn));
// PK
strcpy(sFieldName, "PK");
strcpy(sFieldValue, sTmpPK);
STRCPY_AtoT(tField, sFieldName, sizeof(sFieldName));
STRCPY_AtoT(tValue, sFieldValue, sizeof(sFieldValue));
cluFilters[1] = _CLNEW QueryFilter(QueryParser::parse(tValue,
tField, &cluKwdAn));
// Null terminator
cluFilters[2] = NULL;
// Combine filters
cluCF = _CLNEW ChainedFilter(cluFilters, ChainedFilter::AND);
//*****
// Find document
//*****
// Prepare query: "tag:tag" matches every record in the index
strcpy(sTmp, "tag:tag");
STRCPY_AtoT(tField, sTmp, sizeof(sTmp));
// Search
cluQuery = QueryParser::parse(tField, _T("content"), &cluStdAn);
cluHits = cluSearcher->search(cluQuery, cluCF);
// Document found in the index?
if ( cluHits->length() == 0 )
{
// Insert document...
...
All records have a "tag" field containing the word "tag"; that's why I
search for tag:tag (to match all records and then apply the filters).
Is there a better way to do this? As it stands, the check is so slow
that we can't consider using the system in production.
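
For reference, the rule being enforced above (at most one document per
TYPE plus PK pair) can be written down independently of the index as a
membership test on a combined key. The sketch below is only an
illustration under assumptions: SeenRecords and markIfNew are
hypothetical names that are not part of CLucene, the 0x1F separator is
assumed never to occur in TYPE or PK values, and the set only knows
about pairs offered during the current run, not about documents already
stored in an existing index.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>

// Hypothetical helper (not part of CLucene): remembers which (TYPE, PK)
// pairs have already been offered during the current import run.
class SeenRecords {
public:
    // Returns true the first time a (type, pk) pair is offered and
    // false on every later call with the same pair.
    bool markIfNew(const std::string& type, const std::string& pk) {
        return seen_.insert(makeKey(type, pk)).second;
    }

    // Number of distinct (TYPE, PK) pairs recorded so far.
    std::size_t size() const { return seen_.size(); }

private:
    // Assumption: neither TYPE nor PK contains the ASCII unit separator
    // (0x1F), so joining on it cannot make two different pairs collide.
    static std::string makeKey(const std::string& type,
                               const std::string& pk) {
        std::string key;
        key.reserve(type.size() + 1 + pk.size());
        key += type;
        key += '\x1F';
        key += pk;
        return key;
    }

    std::unordered_set<std::string> seen_;
};
```

With something like this, the "Insert document..." branch would run only
when markIfNew(sTmpType, sTmpPK) returns true. Whether that is enough
depends on whether the index is always built from scratch in a single
run; if not, the set would have to be seeded from the existing index
first.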
Thank you very much,
Mike
_______________________________________________
CLucene-developers mailing list
CLucene-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/clucene-developers