wenjin272 commented on PR #842:
URL: https://github.com/apache/flink-agents/pull/842#issuecomment-4693330995
After enabling `ragAsync`, `VectorStoreCrossLanguageTest` consistently hangs
in CI, but I cannot reproduce the hang locally.
After a thorough investigation in my own repository, I found that the test
hangs at the `np.array` call inside ChromaDB's `normalize_embeddings`
function:
```
def normalize_embeddings(
    target: Optional[Union[OneOrMany[Embedding], OneOrMany[PyEmbedding]]]
) -> Optional[Embeddings]:
    if target is None:
        return None
    if len(target) == 0:
        raise ValueError(
            f"Expected Embeddings to be non-empty list or numpy array, got {target}"
        )
    if isinstance(target, np.ndarray):
        if target.ndim == 1:
            return [target]
        elif target.ndim == 2:
            return [row for row in target]
    elif isinstance(target, list):
        # One PyEmbedding
        if isinstance(target[0], (int, float)) and not isinstance(target[0], bool):
            return [np.array(target, dtype=np.float32)]
        elif isinstance(target[0], np.ndarray):
            return cast(Embeddings, target)
        elif isinstance(target[0], list):
            if isinstance(target[0][0], (int, float)) and not isinstance(
                target[0][0], bool
            ):
                return [np.array(row, dtype=np.float32) for row in target]
    raise ValueError(
        f"Expected embeddings to be a list of floats or ints, a list of lists, a numpy array, or a list of numpy arrays, got {target}"
    )
```
I tried several approaches with Claude Code, but none of them resolved the
hang. As a temporary workaround, I decided to move the `list<float>` to
`ndarray` conversion out of the async execution flow. I will update this PR
shortly with the changes.
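A minimal sketch of what this decoupling could look like, not the actual change in this PR. It assumes the async flow can be approximated by `asyncio` with a thread executor, and `to_ndarray`, `query_store`, and `rag_query` are hypothetical names used only for illustration. The idea is to build the `ndarray` on the calling thread, so that `normalize_embeddings` receives a 1-D array, takes its `target.ndim == 1` branch, and never reaches `np.array` inside the async task.

```python
import asyncio
from typing import List, Sequence

import numpy as np


def to_ndarray(embedding: Sequence[float]) -> np.ndarray:
    """Convert a plain list of floats to a float32 array on the calling thread."""
    return np.asarray(embedding, dtype=np.float32)


def query_store(embedding: np.ndarray) -> int:
    """Hypothetical stand-in for the vector store call.

    It only reads an array that already exists and performs no conversion.
    """
    if not isinstance(embedding, np.ndarray):
        raise TypeError("expected a pre-converted ndarray")
    return int(embedding.shape[0])


async def rag_query(raw_embedding: List[float]) -> int:
    # The list -> ndarray conversion runs here, before the work is handed
    # to the executor, so the worker thread never calls np.array itself.
    vector = to_ndarray(raw_embedding)
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, query_store, vector)


if __name__ == "__main__":
    # Prints the dimension of the embedding that reached the worker.
    print(asyncio.run(rag_query([0.1, 0.2, 0.3])))
```

Whether this avoids the hang in CI depends on the real execution model, which the sketch does not reproduce.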