hgaol commented on PR #1510:
URL: https://github.com/apache/answer/pull/1510#issuecomment-4230888299
There are also some follow-ups I don't have a clear answer for yet. Posting them here
for discussion.
## Follow-up: When to calculate embeddings and sync to vector storage
### Comparison with Search Plugin
| Aspect | Search Plugin | VectorSearch Plugin |
|--------|---------------|---------------------|
| Bulk sync | Yes | Yes |
| Real-time sync | Yes (create/update/delete hooks) | No |
| Trigger | Event-driven + startup/config | Startup/config only |
| Consistency | Near real-time | Eventually consistent |
### Current Gap
`UpdateContent()` / `DeleteContent()` exist in `plugin.VectorSearch`, but
they are not called from question/answer service events. So after the initial sync,
content changes are not reflected until the next full re-sync.
### Options
1. **Manual (current)**
   - Re-sync only on plugin config save/update
   - Simple, but stale results between syncs
2. **Real-time**
   - Add event hooks to call vector search update/delete
   - Can be async (goroutine / queue) to avoid write-path latency
   - Higher embedding API call volume
3. **Scheduled (cron)**
   - Periodic bulk sync via cron expression
   - Good for off-peak syncing
   - Delayed freshness until next run
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]