wu-sheng commented on PR #10884:
URL: https://github.com/apache/skywalking/pull/10884#issuecomment-1579558155

   > Hi! I will write up two algorithm choices as references for future 
implementation:
   > 
   > First I will do the tree version, as it's easier and more feasible for 
users. I expect to have a working algorithm in **a week** (and will provide a 
simple web interface to test it).
   > 
   > 
   > 
   > 1. My own tree-based algorithm (priority)
   > 
   > 2. LLM-based response using 
[LangChain](https://github.com/hwchase17/langchain) 
   > 
   > 
   > 
   > The input looks like this; the output will be structured according to the proto.
   > 
   > ```
   > cachedHttpUris: ConcurrentHashMap
   > |
   > |-- Service Name (String)
   > |   |-- URI (String) : Occurrence Count (AtomicInteger)
   > |   |-- URI (String) : Occurrence Count (AtomicInteger)
   > |
   > |-- Service Name (String)
   > |   |-- URI (String) : Occurrence Count (AtomicInteger)
   > |
   > 
   > Output:
   > 
   > Something like this
   > 
   > {
   >    "FormattedEndpoint1": {"regex": "pattern", "uris": [URI, URI, URI...]},  # < Regex starts from the smallest scope possible?
   >    "FormattedEndpoint2": {"regex": "pattern", "uris": [URI, URI, URI...]},
   >    "FormattedEndpoint3": {"regex": "pattern", "uris": [URI, URI, URI...]}
   > }
   > ```
   > 
   > 
   > 
   > Considerations of the algorithm:
   > 
   > 
   > 
   > It's going to be stateful, based on incremental trees. This means that 
across different batches (at most 3k per 30 minutes) the algorithm will 
remember previously grouped URIs (the assumption is that they will always end 
up being a finite number), so results are consistent throughout the service 
lifecycle. An LLM-based response is stateless by nature, so previous URI 
results will be kept in a local cache and passed back with each GPT question in 
later batches to simulate the stateful behavior.
   > 
   > 
   
   I feel you are not reading the code correctly. OAP will provide you with 
all URIs as INPUT, and expects to get back the regex for formatting, along with 
the formatted URI.
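
To make that contract concrete, here is a minimal sketch, not the actual OAP code. It assumes a map shaped like the quoted `cachedHttpUris` layout as the input side and a hypothetical `FormatRule` (one regex plus one formatted name) as the output side. The class name, the method names, the example URIs, and the `{var}` placeholder are all illustrative assumptions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.regex.Pattern;

public class UriFormatSketch {
    // Mirrors the quoted layout: service name -> (raw URI -> occurrence count).
    // This is an illustration only, not the real OAP field.
    static final Map<String, Map<String, AtomicInteger>> CACHED_HTTP_URIS =
        new ConcurrentHashMap<>();

    /** Hypothetical rule shape: one regex plus the formatted endpoint name it maps to. */
    static final class FormatRule {
        final Pattern regex;
        final String formattedName;

        FormatRule(String regex, String formattedName) {
            this.regex = Pattern.compile(regex);
            this.formattedName = formattedName;
        }
    }

    /** Count one observation of a raw URI for a service and return the new count. */
    static int record(String service, String uri) {
        return CACHED_HTTP_URIS
            .computeIfAbsent(service, s -> new ConcurrentHashMap<>())
            .computeIfAbsent(uri, u -> new AtomicInteger())
            .incrementAndGet();
    }

    /** Return the formatted name of the first rule that fully matches, else the raw URI. */
    static String format(String rawUri, List<FormatRule> rules) {
        for (FormatRule rule : rules) {
            if (rule.regex.matcher(rawUri).matches()) {
                return rule.formattedName;
            }
        }
        return rawUri;
    }

    public static void main(String[] args) {
        // INPUT side: raw URIs collected per service.
        record("order-service", "/api/users/42");
        record("order-service", "/api/users/43");
        record("order-service", "/health");

        // OUTPUT side: a stand-in for what the pattern-discovery algorithm hands back.
        List<FormatRule> rules =
            Arrays.asList(new FormatRule("/api/users/[^/]+", "/api/users/{var}"));

        for (String uri : CACHED_HTTP_URIS.get("order-service").keySet()) {
            System.out.println(uri + " -> " + format(uri, rules));
        }
    }
}
```

In this sketch the caller only needs the regex and the formatted name; the grouped raw URIs do not have to be echoed back, since any URI can be re-checked against the regex.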


