wu-sheng commented on PR #10884: URL: https://github.com/apache/skywalking/pull/10884#issuecomment-1579558155
> Hi! I will write two algorithm choices for future implementation references:
>
> First I will do the tree version as it's easier and more feasible for users, expect to have a working algorithm in **a week** (will provide a simple web interface to test it).
>
> > 1. My own tree based algorithm (priority)
> > 2. LLM based response using [Langchain](https://github.com/hwchase17/langchain)
> >
> > Input is like this, output will be structured according to proto.
> > ```
> > cachedHttpUris: ConcurrentHashMap
> > |
> > |-- Service Name (String)
> > |   |-- URI (String) : Occurrence Count (AtomicInteger)
> > |   |-- URI (String) : Occurrence Count (AtomicInteger)
> > |
> > |-- Service Name (String)
> > |   |-- URI (String) : Occurrence Count (AtomicInteger)
> > |
> >
> > Output:
> >
> > Something like this
> >
> > {
> >   "FormattedEndpoint1": {"regex": "pattern", "uris": [URI, URI, URI...]}, # < Regex starts from the smallest scope possible?
> >   "FormattedEndpoint2": {"regex": "pattern", "uris": [URI, URI, URI...]},
> >   "FormattedEndpoint3": {"regex": "pattern", "uris": [URI, URI, URI...]}
> > }
> > ```
> >
> > Considerations of the algorithm:
> >
> > It's going to be stateful, based on incremental trees. Meaning across different batches (at most 3k per 30 mins) the algorithm will remember previously grouped URIs (the assumption is that they will always end up being a finite number), so results are consistent throughout the service lifecycle. An LLM based response is stateless by nature, so previous URI results will be kept in a local cache and passed back to each GPT question in later batches to simulate the stateful nature.

I feel you are not reading the code correctly. OAP will provide you with all URIs as INPUT, and expects to get back the regex used for formatting, together with the formatted URI.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
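To make the input/output contract discussed in this comment concrete, here is a minimal Python sketch: URIs go in per service, and each group comes back with a formatted endpoint and a regex. It is neither the tree-based algorithm from the quoted proposal nor OAP's actual implementation. The function names (`format_uri`, `to_regex`, `group_uris`), the `{var}` placeholder, and the "digits or UUID-like segment" heuristic are all assumptions made for illustration only.

```python
import re
from collections import defaultdict

# Hypothetical heuristic (an assumption, not from the PR): a path segment is
# treated as a variable if it is all digits or a 36-character UUID-like token.
_VARIABLE_SEGMENT = re.compile(r"^(\d+|[0-9a-fA-F-]{36})$")


def format_uri(uri: str) -> str:
    """Replace variable-looking path segments with the placeholder {var}."""
    segments = uri.split("/")
    return "/".join("{var}" if _VARIABLE_SEGMENT.match(s) else s for s in segments)


def to_regex(formatted: str) -> str:
    """Build a regex matching every URI that shares the formatted endpoint."""
    parts = ["[^/]+" if s == "{var}" else re.escape(s) for s in formatted.split("/")]
    return "^" + "/".join(parts) + "$"


def group_uris(uris_by_service):
    """Map each service to {formatted endpoint: {"regex": ..., "uris": [...]}}.

    `uris_by_service` mirrors the cachedHttpUris shape from the comment:
    service name -> {URI: occurrence count}. Counts are ignored in this sketch.
    """
    result = {}
    for service, uri_counts in uris_by_service.items():
        groups = defaultdict(list)
        for uri in uri_counts:
            groups[format_uri(uri)].append(uri)
        result[service] = {
            endpoint: {"regex": to_regex(endpoint), "uris": sorted(uris)}
            for endpoint, uris in groups.items()
        }
    return result
```

A call such as `group_uris({"svc": {"/api/users/123": 4, "/api/users/456": 1}})` would place both URIs under one formatted endpoint with a single regex, which is the shape of result the reply says OAP expects back.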
