Serkan Mulayim wrote on 11/16/16, 1:17 PM:
> Hi guys,
> I think I need to simplify my question. After reading it one more time, I
> realized I touched on many things, and it seems confusing.
> It seems like if we index the same document twice, a new document is
> created. And as per http://lucy.apache.org/docs/c/Lucy/Docs/DocIDs.html, "If
> you truly need a primary key field, you must define it and populate it
> yourself". How can we do this? Are there any examples of this? Should I
> search for the document with the primary key before indexing and, if it
> exists, not index it?

What I do in all my apps is use delete_by_term to remove any existing
document with the same key before adding the new version:
https://metacpan.org/pod/distribution/Lucy/lib/Lucy/Index/Indexer.pod#delete_by_term
I have my own primary key system that varies by application. Sometimes
it is a URI, sometimes a db PK. I maintain the document integrity myself.
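A minimal sketch of that delete-then-add pattern with the Perl bindings. The field name "id", the helper upsert_doc, and the temporary index directory are assumptions made for illustration, not part of any particular app. The key field is declared as a StringType so the whole key is indexed as a single untokenized term that delete_by_term can match exactly.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use Lucy::Plan::Schema;
use Lucy::Plan::StringType;
use Lucy::Plan::FullTextType;
use Lucy::Analysis::EasyAnalyzer;
use Lucy::Index::Indexer;
use Lucy::Search::IndexSearcher;
use Lucy::Search::TermQuery;

# Schema with a hypothetical primary-key field named "id".
# StringType is indexed but not tokenized, so the key stays one term.
my $schema = Lucy::Plan::Schema->new;
$schema->spec_field( name => 'id', type => Lucy::Plan::StringType->new );
$schema->spec_field(
    name => 'content',
    type => Lucy::Plan::FullTextType->new(
        analyzer => Lucy::Analysis::EasyAnalyzer->new( language => 'en' ),
    ),
);

# Hypothetical helper: drop any doc carrying the same id, then add the new one.
sub upsert_doc {
    my ( $index_path, $doc ) = @_;
    my $indexer = Lucy::Index::Indexer->new(
        index  => $index_path,
        schema => $schema,
        create => 1,    # create the index if it does not exist yet
    );
    $indexer->delete_by_term( field => 'id', term => $doc->{id} );
    $indexer->add_doc($doc);
    $indexer->commit;
}

# Index the same key twice; the second call replaces the first document.
my $index_path = tempdir( CLEANUP => 1 );
upsert_doc( $index_path, { id => 'doc-1', content => 'first version' } );
upsert_doc( $index_path, { id => 'doc-1', content => 'second version' } );

# Count how many live documents carry that key.
my $searcher = Lucy::Search::IndexSearcher->new( index => $index_path );
my $hits = $searcher->hits(
    query => Lucy::Search::TermQuery->new( field => 'id', term => 'doc-1' ),
);
printf "docs with id doc-1: %d\n", $hits->total_hits;
```

There is no need to search for the key first: delete_by_term does nothing when no document matches, so the same call covers both first-time inserts and updates.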
One example from how Dezi solves this more generally:
https://metacpan.org/source/KARMAN/Dezi-App-0.014/lib/Dezi/Lucy/Indexer.pm#L451
Lucy isn't an RDBMS. It just tokenizes the fields you shove into it and
retrieves them very quickly.
--
Peter Karman . http://peknet.com/ . pe...@peknet.com