This is an automated email from the ASF dual-hosted git repository.

alexstocks pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/dubbo-go-samples.git


The following commit(s) were added to refs/heads/main by this push:
     new 2bf11259 [llm] use nacos for service discovery (#847)
2bf11259 is described below

commit 2bf11259e01f4898190cafaf7abc983adcde90f3
Author: Gucvii <[email protected]>
AuthorDate: Tue May 27 23:24:58 2025 +0800

    [llm] use nacos for service discovery (#847)
    
    * fix: [llm] no more log info
    
    * feat:
    1. Each server is aligned with one LLM.
    2. Use a script to run the server cluster and use Nacos for service discovery.
    3. Load balancing is implemented using round-robin, but it currently does not support selecting models.
    
    * fix: new README.md and README_zh.md
    
    * fix: import
    
    * Update llm/go-client/cmd/client.go
    
    standard TODO
    
    Co-authored-by: Copilot <[email protected]>
    
    * fix: log -> logger, all config -> config.go
    
    ---------
    
    Co-authored-by: Copilot <[email protected]>
---
 llm/.env.example                        |  4 +-
 llm/README.md                           | 69 +++++++++++++++++---------
 llm/README_zh.md                        | 62 ++++++++++++++++-------
 llm/config/config.go                    | 40 +++++++++++++--
 llm/go-client/cmd/client.go             | 11 ++++-
 llm/go-client/frontend/handlers/chat.go | 14 +++---
 llm/go-client/frontend/main.go          |  5 +-
 llm/go-server/cmd/server.go             | 86 +++++++++++---------------------
 llm/start_servers.bat                   | 68 ++++++++++++++++++++++++++
 llm/start_servers.sh                    | 87 +++++++++++++++++++++++++++++++++
 10 files changed, 334 insertions(+), 112 deletions(-)

diff --git a/llm/.env.example b/llm/.env.example
index e77245c0..f3686ffa 100644
--- a/llm/.env.example
+++ b/llm/.env.example
@@ -16,9 +16,11 @@
 #
 
 
-OLLAMA_MODELS = llava:7b, qwen2.5:7b
+OLLAMA_MODELS= qwen2.5:7b, llava:7b
 OLLAMA_URL = http://localhost:11434
 TIME_OUT_SECOND = 300
 
 NACOS_URL = localhost:8848
 MAX_CONTEXT_COUNT = 3
+MODEL_NAME = qwen2.5:7b
+SERVER_PORT = 20000
diff --git a/llm/README.md b/llm/README.md
index ec0f8d35..09951a15 100644
--- a/llm/README.md
+++ b/llm/README.md
@@ -2,7 +2,7 @@
 
 ## 1. **Introduction**
 
-This sample demonstrates how to integrate **large language models (LLM)** in **Dubbo-go**, allowing the server to invoke the Ollama model for inference and return the results to the client via Dubbo RPC.
+This sample demonstrates how to integrate **large language models (LLM)** in **Dubbo-go**, allowing the server to invoke the Ollama model for inference and return the results to the client via Dubbo RPC. It supports deploying multiple models, with multiple instances per model.
 
 ## 2. **Preparation**
 
@@ -28,15 +28,14 @@ $ source ~/.bashrc
 $ ollama serve
 ```
 
-### **Download Model**
+### **Download Models**
 
 ```shell
 $ ollama pull llava:7b
+$ ollama pull qwen2.5:7b  # Optional: download additional models
 ```
 
-Default model uses ```llava:7b```, a novel end-to-end trained large multimodal model.
-
-You can pull your favourite model and specify the demo to use the model in ```.env``` file
+You can pull your preferred models and configure them in the `.env` file.
 
 ### **Install Nacos**
 
@@ -44,51 +43,77 @@ Follow this instruction to [install and start Nacos server](https://dubbo-next.s
 
 ## 3. **Run the Example**
 
-You need to run all the commands in ```llm``` directory.
+You need to run all the commands in the `llm` directory.
 
 ```shell
 $ cd llm
 ```
 
 Create your local environment configuration by copying the template file. 
-After creating the ```.env``` file, edit it to set up your specific configurations.
+After creating the `.env` file, edit it to set up your specific configurations.
 
 ```shell
 # Copy environment template (Use `copy` for Windows)
 $ cp .env.example .env
 ```
 
+### **Configuration**
+
+The `.env` file supports configuring multiple models, for example:
+
+```text
+# Configure multiple models, comma-separated, spaces allowed
+OLLAMA_MODELS = llava:7b, qwen2.5:7b
+OLLAMA_URL = http://localhost:11434
+NACOS_URL = nacos://localhost:8848
+TIME_OUT_SECOND = 300
+MAX_CONTEXT_COUNT = 3
+```
+
 ### **Run the Server**
 
-The server integrates the Ollama model and uses Dubbo-go's RPC service for invocation.
+The server supports multi-instance deployment, with multiple instances per model to enhance service capacity. We provide convenient startup scripts:
 
-Run the server by executing:
+**Linux/macOS**:
+```shell
+# Default: 2 instances per model, starting from port 20020
+$ ./start_servers.sh
 
+# Custom configuration: specify instance count and start port
+$ ./start_servers.sh --instances 3 --start-port 20030
+```
+
+**Windows**:
 ```shell
-$ go run go-server/cmd/server.go
+# Default: 2 instances per model, starting from port 20020
+$ start_servers.bat
+
+# Custom configuration: specify instance count and start port
+$ start_servers.bat --instances 3 --start-port 20030
 ```
 
 ### **Run the Client**
 
-The client invokes the server's RPC interface to retrieve the inference results from the Ollama model.
-
-Run the cli client by executing:
+The client invokes the server's RPC interface to retrieve inference results from the Ollama models.
 
+CLI Client:
 ```shell
 $ go run go-client/cmd/client.go
 ```
+Supports multi-turn conversations, command interaction, and context management.
 
-Cli client supports multi-turn conversations, command interact, context management.
-
-We also support a frontend using Gin framework for users to interact. If you want run the frontend client you can executing the following command and open it in ```localhost:8080``` by default:
-
+Web Client:
 ```shell
 $ go run go-client/frontend/main.go
 ```
+Access at `localhost:8080` with features:
+- Multi-turn conversations
+- Image upload support (png, jpeg, gif)
+- Multiple model selection
 
-Frontend client supports multi-turn conversations, binary file (image) support for LLM interactions.
-Currently the supported uploaded image types are limited to png, jpeg and gif, with plans to support more binary file types in the future.
-
-### **Notice**
+### **Important Notes**
 
-The default timeout is set to two minutes, please make sure that your computer's performance can generate the corresponding response within two minutes, otherwise it will report an error timeout, you can set your own timeout time in the ```.env``` file
\ No newline at end of file
+1. Default timeout is 5 minutes (adjustable via `TIME_OUT_SECOND` in `.env`)
+2. Each model runs 2 instances by default, adjustable via startup script parameters
+3. Servers automatically register with Nacos, no manual port specification needed
+4. Ensure all configured models are downloaded through Ollama before starting
\ No newline at end of file
diff --git a/llm/README_zh.md b/llm/README_zh.md
index 25fed96b..8d768d8b 100644
--- a/llm/README_zh.md
+++ b/llm/README_zh.md
@@ -2,7 +2,7 @@
 
 ## 1. **介绍**
 
-本案例展示了如何在 **Dubbo-go** 中集成 **大语言模型(LLM)**,以便在服务端调用 Ollama 模型进行推理,并将结果通过 Dubbo 的 RPC 接口返回给客户端。
+本案例展示了如何在 **Dubbo-go** 中集成 **大语言模型(LLM)**,以便在服务端调用 Ollama 模型进行推理,并将结果通过 Dubbo 的 RPC 接口返回给客户端。支持多模型部署和每个模型运行多个实例。
 
 ## 2. **准备工作**
 
@@ -32,11 +32,10 @@ $ ollama serve
 
 ```shell
 $ ollama pull llava:7b
+$ ollama pull qwen2.5:7b  # 可选:下载其他模型
 ```
 
-默认模型使用```llava:7b```,这是一个新颖的端到端多模态的大模型。
-
-您可以自行pull自己喜欢的模型,并在 ```.env``` 文件中指定该demo使用模型。
+您可以自行 pull 需要的模型,并在 `.env` 文件中配置要使用的模型列表。
 
 ### **安装 Nacos**
 
@@ -44,51 +43,76 @@ $ ollama pull llava:7b
 
 ## **3. 运行示例**
 
-以下所有的命令都需要在 ```llm``` 目录下运行.
+以下所有的命令都需要在 `llm` 目录下运行。
 
 ```shell
 $ cd llm
 ```
 
-生成你的本地配置 ```.env``` 文件。完成后,请根据实际情况编辑 ```.env``` 文件并设置相关参数。
+生成你的本地配置 `.env` 文件。完成后,请根据实际情况编辑 `.env` 文件并设置相关参数。
 
 ```shell
 # 复制环境模板文件(Windows用户可使用copy命令)
 $ cp .env.example .env
 ```
 
+### **配置说明**
+
+`.env` 文件支持配置多个模型,示例:
+
+```text
+# 支持配置多个模型,使用逗号分隔,支持带空格
+OLLAMA_MODELS = llava:7b, qwen2.5:7b
+OLLAMA_URL = http://localhost:11434
+NACOS_URL = nacos://localhost:8848
+TIME_OUT_SECOND = 300
+MAX_CONTEXT_COUNT = 3
+```
 
 ### **服务端运行**
 
-在服务端中集成 Ollama 模型,并使用 Dubbo-go 提供的 RPC 服务进行调用。
+服务端支持多实例部署,每个模型可以运行多个实例以提高服务能力。我们提供了便捷的启动脚本:
+
+**Linux/macOS**:
+```shell
+# 默认配置:每个模型运行2个实例,端口从20020开始
+$ ./start_servers.sh
 
-在服务端目录下运行:
+# 自定义配置:指定实例数量和起始端口
+$ ./start_servers.sh --instances 3 --start-port 20030
+```
 
+**Windows**:
 ```shell
-$ go run go-server/cmd/server.go
+# 默认配置:每个模型运行2个实例,端口从20020开始
+$ start_servers.bat
+
+# 自定义配置:指定实例数量和起始端口
+$ start_servers.bat --instances 3 --start-port 20030
 ```
 
 ### **客户端运行**
 
 客户端通过 Dubbo RPC 调用服务端的接口,获取 Ollama 模型的推理结果。
 
-在客户端目录下运行:
-
+命令行客户端:
 ```shell
 $ go run go-client/cmd/client.go
 ```
+支持多轮对话、命令交互、上下文管理功能。
 
-命令行客户端支持多轮对话、命令交互、上下文管理功能。
-
-我们也提供了包含前端页面的基于Gin框架的客户端进行交互,运行以下命令然后访问 ```localhost:8080``` 即可使用:
-
+Web 客户端:
 ```shell
 $ go run go-client/frontend/main.go
 ```
-
-包含前端页面的客户端支持多轮对话,支持进行二进制文件(图片)传输并与大模型进行交互。
-目前所支持上传的图片类型被限制为 png,jpeg 和 gif 类型,计划在将来支持更多的二进制文件类型。
+访问 `localhost:8080` 使用 Web 界面,支持:
+- 多轮对话
+- 图片上传(支持 png、jpeg、gif)
+- 多模型选择
 
 ### **注意事项**
 
-默认超时时间为两分钟,请确保您的电脑性能能在两分钟内生成相应的响应,否则会超时报错,您也可以在 ```.env``` 文件中自行设置超时时间。
\ No newline at end of file
+1. 默认超时时间为5分钟(可在 `.env` 中通过 `TIME_OUT_SECOND` 调整)
+2. 每个模型默认运行2个实例,可通过启动脚本参数调整
+3. 服务端会自动注册到 Nacos,无需手动指定端口
+4. 确保所有配置的模型都已通过 Ollama 下载完成
\ No newline at end of file
diff --git a/llm/config/config.go b/llm/config/config.go
index 475b6c01..5a0b1e22 100644
--- a/llm/config/config.go
+++ b/llm/config/config.go
@@ -30,12 +30,13 @@ import (
 )
 
 type Config struct {
-       OllamaModels []string
-       OllamaURL    string
-
+       OllamaModels    []string
+       OllamaURL       string
        TimeoutSeconds  int
        NacosURL        string
        MaxContextCount int
+       ModelName       string
+       ServerPort      int
 }
 
 var (
@@ -73,6 +74,36 @@ func Load(envFile string) (*Config, error) {
 
                config.OllamaModels = modelsList
 
+               modelName := os.Getenv("MODEL_NAME")
+               if modelName == "" {
+                       configErr = fmt.Errorf("MODEL_NAME environment variable is not set")
+                       return
+               }
+               modelName = strings.TrimSpace(modelName)
+               modelValid := false
+               for _, m := range modelsList {
+                       if m == modelName {
+                               modelValid = true
+                               break
+                       }
+               }
+               if !modelValid {
+                       configErr = fmt.Errorf("specified model %s is not in the configured models list", modelName)
+                       return
+               }
+               config.ModelName = modelName
+
+               portStr := os.Getenv("SERVER_PORT")
+               if portStr == "" {
+                       configErr = fmt.Errorf("Error: SERVER_PORT environment variable is not set\n")
+                       return
+               }
+               config.ServerPort, err = strconv.Atoi(portStr)
+               if err != nil {
+                       configErr = fmt.Errorf("Error converting SERVER_PORT to int: %v\n", err)
+                       return
+               }
+
                ollamaURL := os.Getenv("OLLAMA_URL")
                if ollamaURL == "" {
                        configErr = fmt.Errorf("OLLAMA_URL is not set")
@@ -94,10 +125,11 @@ func Load(envFile string) (*Config, error) {
 
                nacosURL := os.Getenv("NACOS_URL")
                if nacosURL == "" {
-                       configErr = fmt.Errorf("OLLAMA_URL is not set")
+                       configErr = fmt.Errorf("NACOS_URL is not set")
                        return
                }
                config.NacosURL = nacosURL
+
                maxContextStr := os.Getenv("MAX_CONTEXT_COUNT")
                if maxContextStr == "" {
                        config.MaxContextCount = defaultMaxContextCount
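
To make the per-instance contract above concrete: the startup scripts added in this commit export MODEL_NAME and SERVER_PORT for every server process, and config.Load rejects a MODEL_NAME that is not listed in OLLAMA_MODELS. Below is a minimal, hypothetical standalone sketch of that same check (standard library only; it is not the sample's API, and the printed messages are illustrative):

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

func main() {
	// The startup scripts export these variables for each instance.
	models := strings.Split(os.Getenv("OLLAMA_MODELS"), ",")
	name := strings.TrimSpace(os.Getenv("MODEL_NAME"))

	// MODEL_NAME must match one entry of the comma-separated OLLAMA_MODELS
	// list; spaces around the commas are tolerated, as in config.Load above.
	valid := false
	for _, m := range models {
		if strings.TrimSpace(m) == name {
			valid = true
			break
		}
	}
	if !valid {
		fmt.Printf("model %q is not in OLLAMA_MODELS\n", name)
		return
	}

	// SERVER_PORT must be an integer; each instance gets its own port.
	port, err := strconv.Atoi(os.Getenv("SERVER_PORT"))
	if err != nil {
		fmt.Printf("invalid SERVER_PORT: %v\n", err)
		return
	}
	fmt.Printf("this instance would serve %s on port %d\n", name, port)
}
```

Running it with, say, OLLAMA_MODELS="llava:7b, qwen2.5:7b", MODEL_NAME=qwen2.5:7b and SERVER_PORT=20020 mirrors what one instance launched by start_servers.sh sees.
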
diff --git a/llm/go-client/cmd/client.go b/llm/go-client/cmd/client.go
index 31b13c38..575855c9 100644
--- a/llm/go-client/cmd/client.go
+++ b/llm/go-client/cmd/client.go
@@ -27,7 +27,9 @@ import (
 
 import (
        "dubbo.apache.org/dubbo-go/v3"
+       "dubbo.apache.org/dubbo-go/v3/client"
        _ "dubbo.apache.org/dubbo-go/v3/imports"
+       "dubbo.apache.org/dubbo-go/v3/logger"
        "dubbo.apache.org/dubbo-go/v3/registry"
 )
 
@@ -146,17 +148,24 @@ func main() {
 
        currentCtxID = createContext()
 
+       // TODO: support selecting model
        ins, err := dubbo.NewInstance(
                dubbo.WithRegistry(
                        registry.WithNacos(),
                        registry.WithAddress(cfg.NacosURL),
                ),
+               dubbo.WithLogger(
+                       logger.WithLevel("warn"),
+                       logger.WithZap(),
+               ),
        )
        if err != nil {
                panic(err)
        }
        // configure the params that only client layer cares
-       cli, err := ins.NewClient()
+       cli, err := ins.NewClient(
+               client.WithClientLoadBalanceRoundRobin(),
+       )
        if err != nil {
                panic(err)
        }
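
For context, the client-side options added above compose into roughly the following minimal sketch, using only dubbo-go v3 calls that appear in this diff; the hard-coded Nacos address stands in for cfg.NacosURL and is an assumption, and the real client.go additionally wires the generated ChatService stub:

```go
package main

import (
	"dubbo.apache.org/dubbo-go/v3"
	"dubbo.apache.org/dubbo-go/v3/client"
	_ "dubbo.apache.org/dubbo-go/v3/imports"
	"dubbo.apache.org/dubbo-go/v3/logger"
	"dubbo.apache.org/dubbo-go/v3/registry"
)

func main() {
	// Discover chat servers through Nacos and keep client-side logging quiet.
	ins, err := dubbo.NewInstance(
		dubbo.WithRegistry(
			registry.WithNacos(),
			registry.WithAddress("localhost:8848"), // assumed NACOS_URL
		),
		dubbo.WithLogger(
			logger.WithLevel("warn"),
			logger.WithZap(),
		),
	)
	if err != nil {
		panic(err)
	}
	// Round-robin spreads calls across every registered server instance;
	// since each instance serves exactly one model, which model answers is
	// not yet selectable (hence the TODO in client.go above).
	cli, err := ins.NewClient(
		client.WithClientLoadBalanceRoundRobin(),
	)
	if err != nil {
		panic(err)
	}
	_ = cli // hand cli to the generated ChatService stub, as client.go does
}
```
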
diff --git a/llm/go-client/frontend/handlers/chat.go b/llm/go-client/frontend/handlers/chat.go
index 45c5dfa4..fb8603e1 100644
--- a/llm/go-client/frontend/handlers/chat.go
+++ b/llm/go-client/frontend/handlers/chat.go
@@ -21,7 +21,6 @@ import (
        "context"
        "fmt"
        "io"
-       "log"
        "net/http"
        "regexp"
        "runtime/debug"
@@ -29,6 +28,7 @@ import (
 )
 
 import (
+       "github.com/dubbogo/gost/log/logger"
        "github.com/gin-contrib/sessions"
        "github.com/gin-gonic/gin"
 )
@@ -122,7 +122,7 @@ func (h *ChatHandler) Chat(c *gin.Context) {
        }
        defer func() {
                if err := stream.Close(); err != nil {
-                       log.Println("Error closing stream:", err)
+                       logger.Errorf("Error closing stream: %v", err)
                }
        }()
 
@@ -135,7 +135,7 @@ func (h *ChatHandler) Chat(c *gin.Context) {
        go func() {
                defer func() {
                        if r := recover(); r != nil {
-                               log.Printf("Recovered in stream processing: %v\n%s", r, debug.Stack())
+                               logger.Errorf("Recovered in stream processing: %v\n%s", r, debug.Stack())
                        }
                        close(responseCh)
                }()
@@ -144,13 +144,13 @@ func (h *ChatHandler) Chat(c *gin.Context) {
                for {
                        select {
                        case <-c.Request.Context().Done(): // client disconnect
-                               log.Println("Client disconnected, stopping stream processing")
+                               logger.Info("Client disconnected, stopping stream processing")
                                return
                        default:
                                if !stream.Recv() {
                                        if err := stream.Err(); err != nil {
                                                c.JSON(http.StatusInternalServerError, gin.H{"error": err.Error()})
-                                               log.Printf("Stream receive error: %v", err)
+                                               logger.Errorf("Stream receive error: %v", err)
                                        }
                                        h.ctxManager.AppendMessage(ctxID, &chat.ChatMessage{
                                                Role:    "ai",
@@ -183,10 +183,10 @@ func (h *ChatHandler) Chat(c *gin.Context) {
                        c.SSEvent("message", gin.H{"content": chunk})
                        return true
                case <-time.After(time.Duration(timeout) * time.Second):
-                       log.Println("Stream time out")
+                       logger.Error("Stream time out")
                        return false
                case <-c.Request.Context().Done():
-                       log.Println("Client disconnected")
+                       logger.Error("Client disconnected")
                        return false
                }
        })
diff --git a/llm/go-client/frontend/main.go b/llm/go-client/frontend/main.go
index 31477d58..983b01f7 100644
--- a/llm/go-client/frontend/main.go
+++ b/llm/go-client/frontend/main.go
@@ -24,6 +24,7 @@ import (
 
 import (
        "dubbo.apache.org/dubbo-go/v3"
+       "dubbo.apache.org/dubbo-go/v3/client"
        _ "dubbo.apache.org/dubbo-go/v3/imports"
        "dubbo.apache.org/dubbo-go/v3/registry"
 
@@ -57,7 +58,9 @@ func main() {
                panic(err)
        }
        // configure the params that only client layer cares
-       cli, err := ins.NewClient()
+       cli, err := ins.NewClient(
+               client.WithClientLoadBalanceRoundRobin(),
+       )
 
        if err != nil {
                panic(fmt.Sprintf("Error creating Dubbo client: %v", err))
diff --git a/llm/go-server/cmd/server.go b/llm/go-server/cmd/server.go
index 73b87139..cd591083 100644
--- a/llm/go-server/cmd/server.go
+++ b/llm/go-server/cmd/server.go
@@ -21,16 +21,16 @@ import (
        "context"
        "encoding/base64"
        "fmt"
-       "log"
        "net/http"
        "runtime/debug"
 )
 
 import (
-       "dubbo.apache.org/dubbo-go/v3"
        _ "dubbo.apache.org/dubbo-go/v3/imports"
        "dubbo.apache.org/dubbo-go/v3/protocol"
        "dubbo.apache.org/dubbo-go/v3/registry"
+       "dubbo.apache.org/dubbo-go/v3/server"
+       "github.com/dubbogo/gost/log/logger"
        "github.com/tmc/langchaingo/llms"
        "github.com/tmc/langchaingo/llms/ollama"
 )
@@ -43,67 +43,43 @@ import (
 var cfg *config.Config
 
 type ChatServer struct {
-       llms map[string]*ollama.LLM
+       llm *ollama.LLM
 }
 
 func NewChatServer() (*ChatServer, error) {
+       if cfg.ModelName == "" {
+               return nil, fmt.Errorf("MODEL_NAME environment variable is not set")
+       }
 
-       llmMap := make(map[string]*ollama.LLM)
-
-       for _, model := range cfg.OllamaModels {
-               if model == "" {
-                       continue
-               }
-
-               llm, err := ollama.New(
-                       ollama.WithModel(model),
-                       ollama.WithServerURL(cfg.OllamaURL),
-               )
-               if err != nil {
-                       return nil, fmt.Errorf("failed to initialize model %s: %v", model, err)
-               }
-               llmMap[model] = llm
-               log.Printf("Initialized model: %s", model)
+       llm, err := ollama.New(
+               ollama.WithModel(cfg.ModelName),
+               ollama.WithServerURL(cfg.OllamaURL),
+       )
+       if err != nil {
+               return nil, fmt.Errorf("failed to initialize model %s: %v", cfg.ModelName, err)
        }
+       logger.Infof("Initialized model: %s", cfg.ModelName)
 
-       return &ChatServer{llms: llmMap}, nil
+       return &ChatServer{llm: llm}, nil
 }
 
 func (s *ChatServer) Chat(ctx context.Context, req *chat.ChatRequest, stream chat.ChatService_ChatServer) (err error) {
        defer func() {
                if r := recover(); r != nil {
-                       log.Printf("panic in Chat: %v\n%s", r, debug.Stack())
+                       logger.Errorf("panic in Chat: %v\n%s", r, debug.Stack())
                        err = fmt.Errorf("internal server error")
                }
        }()
 
-       if len(s.llms) == 0 {
-               return fmt.Errorf("no LLM models are initialized")
+       if s.llm == nil {
+               return fmt.Errorf("LLM model is not initialized")
        }
 
        if len(req.Messages) == 0 {
-               log.Println("Request contains no messages")
+               logger.Info("Request contains no messages")
                return fmt.Errorf("empty messages in request")
        }
 
-       modelName := req.Model
-       var llm *ollama.LLM
-
-       if modelName != "" {
-               var ok bool
-               llm, ok = s.llms[modelName]
-               if !ok {
-                       return fmt.Errorf("requested model '%s' is not available", modelName)
-               }
-       } else {
-               for name, l := range s.llms {
-                       modelName = name
-                       llm = l
-                       break
-               }
-               log.Printf("No model specified, using default model: %s", modelName)
-       }
-
        var messages []llms.MessageContent
        for _, msg := range req.Messages {
                var msgType llms.ChatMessageType
@@ -126,7 +102,7 @@ func (s *ChatServer) Chat(ctx context.Context, req *chat.ChatRequest, stream cha
                if msg.Bin != nil && len(msg.Bin) != 0 {
                        decodeByte, err := base64.StdEncoding.DecodeString(string(msg.Bin))
                        if err != nil {
-                               log.Printf("GenerateContent failed: %v\n", err)
+                               logger.Errorf("GenerateContent failed: %v\n", err)
                                return fmt.Errorf("GenerateContent failed: %v", err)
                        }
                        imgType := http.DetectContentType(decodeByte)
@@ -136,7 +112,7 @@ func (s *ChatServer) Chat(ctx context.Context, req *chat.ChatRequest, stream cha
                messages = append(messages, messageContent)
        }
 
-       _, err = llm.GenerateContent(
+       _, err = s.llm.GenerateContent(
                ctx,
                messages,
                llms.WithStreamingFunc(func(ctx context.Context, chunk []byte) error {
@@ -145,20 +121,21 @@ func (s *ChatServer) Chat(ctx context.Context, req *chat.ChatRequest, stream cha
                        }
                        return stream.Send(&chat.ChatResponse{
                                Content: string(chunk),
-                               Model:   modelName,
+                               Model:   cfg.ModelName,
                        })
                }),
        )
        if err != nil {
-               log.Printf("GenerateContent failed with model %s: %v\n", modelName, err)
-               return fmt.Errorf("GenerateContent failed with model %s: %v", modelName, err)
+               logger.Errorf("GenerateContent failed with model %s: %v\n", cfg.ModelName, err)
+               return fmt.Errorf("GenerateContent failed with model %s: %v", cfg.ModelName, err)
        }
 
+       logger.Infof("GenerateContent successfully with model: %s", cfg.ModelName)
+
        return nil
 }
 
 func main() {
-
        var err error
        cfg, err = config.GetConfig()
        if err != nil {
@@ -166,22 +143,17 @@ func main() {
                return
        }
 
-       ins, err := dubbo.NewInstance(
-               dubbo.WithRegistry(
+       srv, err := server.NewServer(
+               server.WithServerRegistry(
                        registry.WithNacos(),
                        registry.WithAddress(cfg.NacosURL),
                ),
-               dubbo.WithProtocol(
+               server.WithServerProtocol(
                        protocol.WithTriple(),
-                       protocol.WithPort(20000),
+                       protocol.WithPort(cfg.ServerPort),
                ),
        )
 
-       if err != nil {
-               panic(err)
-       }
-       srv, err := ins.NewServer()
-
        if err != nil {
                fmt.Printf("Error creating server: %v\n", err)
                return
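
The server side is symmetrical: each process binds exactly one model and one port, then registers itself with Nacos so the client's round-robin can find it. A minimal sketch of that wiring, restricted to options visible in the diff above (the port literal and Nacos address are placeholders for cfg.ServerPort and cfg.NacosURL; handler registration is elided):

```go
package main

import (
	"fmt"
)

import (
	_ "dubbo.apache.org/dubbo-go/v3/imports"
	"dubbo.apache.org/dubbo-go/v3/protocol"
	"dubbo.apache.org/dubbo-go/v3/registry"
	"dubbo.apache.org/dubbo-go/v3/server"
)

func main() {
	serverPort := 20020 // placeholder for cfg.ServerPort (set via SERVER_PORT)

	srv, err := server.NewServer(
		server.WithServerRegistry(
			registry.WithNacos(),
			registry.WithAddress("localhost:8848"), // placeholder for cfg.NacosURL
		),
		server.WithServerProtocol(
			protocol.WithTriple(),
			protocol.WithPort(serverPort), // one unique port per instance
		),
	)
	if err != nil {
		fmt.Printf("Error creating server: %v\n", err)
		return
	}
	_ = srv // register the ChatService handler and start serving, as server.go does
}
```
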
diff --git a/llm/start_servers.bat b/llm/start_servers.bat
new file mode 100644
index 00000000..9c5ca773
--- /dev/null
+++ b/llm/start_servers.bat
@@ -0,0 +1,68 @@
+@echo off
+setlocal enabledelayedexpansion
+
+REM Default values
+set START_PORT=20020
+set INSTANCES_PER_MODEL=2
+
+REM Parse command line arguments
+:parse_args
+if "%~1"=="" goto :end_parse
+if "%~1"=="--instances" (
+    set INSTANCES_PER_MODEL=%~2
+    shift
+    shift
+    goto :parse_args
+)
+if "%~1"=="--start-port" (
+    set START_PORT=%~2
+    shift
+    shift
+    goto :parse_args
+)
+:end_parse
+
+REM Check if .env file exists
+if not exist .env (
+    echo Error: .env file not found
+    exit /b 1
+)
+
+REM Read and clean OLLAMA_MODELS from .env file
+for /f "usebackq tokens=*" %%a in (`powershell -Command "Get-Content .env | 
Select-String '^[[:space:]]*OLLAMA_MODELS[[:space:]]*=' | ForEach-Object { 
$_.Line -replace '^[^=]*=[[:space:]]*', '' -replace '^[\"'']+|[\"'']+$', '' 
}"`) do (
+    set MODELS_LINE=%%a
+)
+
+if "!MODELS_LINE!"=="" (
+    echo Error: OLLAMA_MODELS not found in .env file
+    echo Please make sure .env file contains a line like: OLLAMA_MODELS = llava:7b, qwen2.5:7b
+    exit /b 1
+)
+
+echo Found models: !MODELS_LINE!
+
+REM Split models and start servers
+set current_port=%START_PORT%
+
+for %%m in (!MODELS_LINE:,=^,!) do (
+    set "model=%%m"
+    set "model=!model: =!"
+    echo Processing model: !model!
+    
+    for /l %%i in (1,1,%INSTANCES_PER_MODEL%) do (
+        echo Starting server for model: !model! (Instance %%i, Port: !current_port!)
+        set MODEL_NAME=!model!
+        set SERVER_PORT=!current_port!
+        start /B cmd /c "go run go-server/cmd/server.go"
+        timeout /t 2 /nobreak >nul
+        set /a current_port+=1
+    )
+)
+
+set /a total_instances=0
+for %%a in (!MODELS_LINE:,=^,!) do set /a total_instances+=1
+set /a total_instances*=%INSTANCES_PER_MODEL%
+
+echo All servers started. Total instances: !total_instances!
+echo Use Ctrl+C to stop all servers.
+pause 
\ No newline at end of file
diff --git a/llm/start_servers.sh b/llm/start_servers.sh
new file mode 100755
index 00000000..882aa2f1
--- /dev/null
+++ b/llm/start_servers.sh
@@ -0,0 +1,87 @@
+#!/bin/bash
+
+# Check if .env file exists
+if [ ! -f .env ]; then
+    echo "Error: .env file not found"
+    exit 1
+fi
+
+# Default values
+START_PORT=20020
+INSTANCES_PER_MODEL=2
+
+# Parse command line arguments
+while [[ $# -gt 0 ]]; do
+    case $1 in
+        --instances)
+            INSTANCES_PER_MODEL=$2
+            shift 2
+            ;;
+        --start-port)
+            START_PORT=$2
+            shift 2
+            ;;
+        *)
+            echo "Unknown parameter: $1"
+            exit 1
+            ;;
+    esac
+done
+
+# Function to read value from .env file and clean it
+get_env_value() {
+    local key=$1
+    # Read the line containing the key, extract everything after the first =
+    local value=$(grep "^[[:space:]]*$key[[:space:]]*=" .env | sed 's/^[^=]*=[[:space:]]*//')
+    # Remove leading/trailing whitespace and quotes
+    value=$(echo "$value" | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e 's/^["\x27]//' -e 's/["\x27]$//')
+    echo "$value"
+}
+
+# Get models from .env file
+OLLAMA_MODELS=$(get_env_value "OLLAMA_MODELS")
+
+# Check if OLLAMA_MODELS is empty
+if [ -z "$OLLAMA_MODELS" ]; then
+    echo "Error: OLLAMA_MODELS not found in .env file"
+    echo "Please make sure .env file contains a line like: OLLAMA_MODELS = llava:7b, qwen2.5:7b"
+    exit 1
+fi
+
+echo "Found models: $OLLAMA_MODELS"
+
+# Convert comma-separated string to array, handling spaces
+IFS=',' read -ra MODELS <<< "$OLLAMA_MODELS"
+
+current_port=$START_PORT
+
+# Function to start a model instance
+start_model_instance() {
+    local model=$1
+    local port=$2
+    local instance_num=$3
+    
+    # Clean the model name (remove leading/trailing spaces)
+    model=$(echo "$model" | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//')
+    
+    echo "Starting server for model: $model (Instance $instance_num, Port: $port)"
+    export MODEL_NAME=$model
+    export SERVER_PORT=$port
+    go run go-server/cmd/server.go &
+    sleep 2
+}
+
+# Start instances for each model
+for model in "${MODELS[@]}"; do
+    echo "Processing model: $model"
+    for ((i=1; i<=INSTANCES_PER_MODEL; i++)); do
+        start_model_instance "$model" $current_port $i
+        ((current_port++))
+    done
+done
+
+echo "All servers started. Total instances: $((${#MODELS[@]} * 
INSTANCES_PER_MODEL))"
+echo "Use Ctrl+C to stop all servers."
+
+# Wait for all background processes
+wait 
\ No newline at end of file
