Thanks again for your feedback!
I will give it a try and get back if I should have more questions :-)
Thanks
Michael
On 13.07.21 at 09:58, Tomoko Uchida wrote:
I think that, besides the query, it would be nice if Luke displayed some
"stats" of the index, for example the various fields besides the actual
vector, and also how many vectors are in the index.
It would be a good starting point, I think.
Can you give me a hint where in the code this check currently happens?
(I guess where the error is happening about the corrupted index)
Actually, I have few clues about where to start (I haven't tried to read
indexes that include vector values with Luke).
The stack traces you might see should include full information to fix
or improve it.
Tomoko
On Tue, Jul 13, 2021 at 14:22, Michael Wechner <[email protected]> wrote:
On 13.07.21 at 04:22, Tomoko Uchida wrote:
There aren't any plans for that, and I'm not sure what is actually
expected of the GUI tool
Yes, I understand; the input for the query would have to be an embedding
(a vector of, for example, 768 dimensions).
I currently see two possibilities to do this:
- Import/open the embedding from a file
- Connect the regular search input to a service that generates the
embedding, for example https://github.com/hanxiao/bert-as-service
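
To make the first option a bit more concrete, here is a minimal sketch of
what reading a query embedding from a file could look like. It is only an
illustration under assumptions: the file format (a JSON array, or plain
numbers separated by whitespace or commas) and the function names
parse_embedding and load_embedding are hypothetical and not part of Luke
or Lucene.

```python
import json
import math
from pathlib import Path


def parse_embedding(text, expected_dim=None):
    """Parse one vector from text (hypothetical helper, not a Luke API).

    Accepts either a JSON array ("[0.1, 0.2]") or plain numbers
    separated by whitespace and/or commas.
    """
    text = text.strip()
    if text.startswith("["):
        values = json.loads(text)
    else:
        values = text.replace(",", " ").split()

    vector = [float(v) for v in values]

    if not vector:
        raise ValueError("embedding is empty")
    if not all(math.isfinite(v) for v in vector):
        raise ValueError("embedding contains NaN or infinite values")
    # The dimension has to match the indexed vectors (e.g. 768),
    # so an optional check is done before the vector is used as a query.
    if expected_dim is not None and len(vector) != expected_dim:
        raise ValueError(
            f"expected {expected_dim} dimensions, got {len(vector)}"
        )
    return vector


def load_embedding(path, expected_dim=None):
    """Read a file and parse its content as a single embedding."""
    content = Path(path).read_text(encoding="utf-8")
    return parse_embedding(content, expected_dim)
```

Checking the dimension up front would let the tool report a clear message
instead of failing later during the search.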
to support the vector search codec (it'd be a
costly operation to decode vectors with several hundreds of
dimensions); though I am open to new ideas which are feasible and
useful.
I think that, besides the query, it would be nice if Luke displayed some
"stats" of the index, for example the various fields besides the actual
vector, and also how many vectors are in the index.
Nonetheless the error you saw is not great; we could improve that by
just ignoring the codec for now.
Maybe I can try to improve this :-)
Can you give me a hint where in the code this check currently happens?
(I guess where the error is happening about the corrupted index)
Thanks
Michael
Tomoko
On Tue, Jul 6, 2021 at 16:23, Michael Wechner <[email protected]> wrote:
Hi
I just created a Lucene vector search index with Lucene-9.0.0-SNAPSHOT
based on train-v2.0.json of SQuAD
(https://rajpurkar.github.io/SQuAD-explorer/), which are 86'831 QnAs
(for the embedding I used SentenceBERT).
It took a couple of hours on my Mac laptop, but it worked in the end and
I can search successfully :-)
I tried to open the index with Luke, but received an error saying that
the index might be corrupt.
Does Luke already support analyzing a vector search index? If not, are
there any plans to support vector search?
Thanks
Michael
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]