Superskyyy commented on PR #10884:
URL: https://github.com/apache/skywalking/pull/10884#issuecomment-1579334587

   Hi! I will offer two algorithm choices as references for future implementation. I will do the tree version first, as it's easier and quicker; expect a working algorithm in a week (I will provide a simple web interface to test it).
   
   1. My own tree-based algorithm
   2. LLM-based response using [Langchain](https://github.com/hwchase17/langchain)
   
   The input looks like this; the output will be structured according to the proto.
   ```
   cachedHttpUris: ConcurrentHashMap
   |
   |-- Service Name (String)
   |   |-- URI (String) : Occurrence Count (AtomicInteger)
   |   |-- URI (String) : Occurrence Count (AtomicInteger)
   |
   |-- Service Name (String)
   |   |-- URI (String) : Occurrence Count (AtomicInteger)
   |
   
   Output:
   
   Something like this
   
   {
        # A FormattedPattern is a logical endpoint, not a regex.
        # Algorithm-generated regexes are unreliable and introduce ambiguity,
        # so I don't recommend updating rules from the algorithm. Any new URI
        # that is not handled by OpenAPI or user regexes should always be sent
        # to the algorithm.
        "FormattedPattern1": [URI, URI, URI...],
        "FormattedPattern2": [URI, URI, URI...],
        "FormattedPattern3": [URI, URI, URI...],
   }
   ```
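
To make the two shapes above concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the sample services and URIs, the `{var}` placeholder, the per-service nesting of the output, and the `naive_group` helper, which only replaces purely numeric segments and is a stand-in, not the proposed tree algorithm.

```python
import re
from collections import defaultdict

# Hypothetical sample mirroring cachedHttpUris:
# service name -> {URI: occurrence count}
cached_http_uris = {
    "order-service": {
        "/orders/1001": 12,
        "/orders/1002": 7,
        "/orders/1001/items": 3,
    },
    "user-service": {"/users/42": 5},
}

NUMERIC = re.compile(r"^\d+$")


def naive_group(uris):
    """Group URIs under a logical endpoint by replacing purely numeric
    segments with {var}. A placeholder for the real algorithm."""
    groups = defaultdict(list)
    for uri in uris:
        segments = ["{var}" if NUMERIC.match(s) else s for s in uri.split("/")]
        groups["/".join(segments)].append(uri)
    return dict(groups)


# Output shape: logical endpoint -> list of raw URIs, kept per service here.
result = {
    service: naive_group(counts)
    for service, counts in cached_http_uris.items()
}

for service, groups in result.items():
    print(service, groups)
```

The keys of each inner dict are logical endpoints rather than regexes, which matches the intent of `FormattedPattern` above.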
   
   Considerations of the algorithm:
   
   It's going to be stateful, based on incremental trees. This means that across different batches (at most 3k per 30 minutes) the algorithm will remember previously grouped URIs (the assumption is that these always end up being a finite number), so results stay consistent throughout the service lifecycle. An LLM-based response is stateless by nature, so previous URI results will be kept in a local cache and passed back with each GPT prompt in later batches to simulate statefulness.
   
   
   [I have a comment on the regex pattern group being added back to the ruleset; please see the PR code comment.]

