Yullin opened a new issue, #9804: URL: https://github.com/apache/apisix/issues/9804
### Current Behavior

**Scenario:** The company is working on a large language model service, similar to a chatbot such as ChatGPT. The development team has set up the service and accesses it with the following code:

```python
import json

import requests

prompt = "xxxxxxxxx"
url = "http://10.0.0.3:5001/api/stream/chat"
data = {
    "prompt": prompt,
    "max_new_tokens": 1024,
    "max_context_tokens": 5120,
    "temperature": 0,
    "history": [],
    "new_session": False,
}

chunks = requests.post(url=url, json=data, stream=True)
answer = ""
for chunk in chunks.iter_content(chunk_size=None):
    if chunk:
        try:
            chunk = json.loads(chunk.decode("utf-8"))
            print(chunk)
        except Exception as e:
            print("----------------error---------------")
            print(chunk)
            print(e)
            print("----------------error---------------")
            continue
```

**Problem:** When the development service is accessed directly, the JSON objects are printed one by one. After going through the OpenResty reverse proxy, the first few JSON objects are still printed one by one, but the last two are returned together as a single string (`{"text":"...."}{"text":"...."}`), which causes `json.loads` to fail (see the parsing sketch at the end of this report).

**Configuration:**

```
"value": {
    "host": "t.example.com",
    "methods": [
        "GET",
        "POST",
        "PUT",
        "DELETE",
        "HEAD",
        "OPTIONS",
        "PATCH"
    ],
    "plugins": {},
    "plugins_check": "other",
    "priority": 1,
    "status": 1,
    "timeout": {
        "connect": 60,
        "read": 60,
        "send": 60
    },
    "upstream": {
        "hash_on": "vars",
        "name": "nodes",
        "nodes": {
            "10.0.0.3:5001": 1
        },
        "pass_host": "pass",
        "scheme": "http",
        "type": "roundrobin"
    },
    "uri": "/*"
}
```

I have tested with HAProxy and it does not have this issue.

### Expected Behavior

The JSON objects are printed one by one.

I have also posted this issue to OpenResty (https://github.com/openresty/openresty/issues/914); maybe people here can fix it.

### Error Logs

_No response_

### Steps to Reproduce

1. Set up a large model as a service, like ChatGPT, or any other Server-Sent Events (event-stream) service.
2. Run APISIX via the Docker image.
3. Create a route with the Admin API, setting the large model service as the backend.
4. Access it with the following code:

```python
import json

import requests

prompt = "xxxxxxxxx"
url = "http://{apisix_ip_addr}/api/stream/chat"
data = {
    "prompt": prompt,
    "max_new_tokens": 1024,
    "max_context_tokens": 5120,
    "temperature": 0,
    "history": [],
    "new_session": False,
}

chunks = requests.post(url=url, json=data, stream=True)
answer = ""
for chunk in chunks.iter_content(chunk_size=None):
    if chunk:
        try:
            chunk = json.loads(chunk.decode("utf-8"))
            print(chunk)
        except Exception as e:
            print("----------------error---------------")
            print(chunk)
            print(e)
            print("----------------error---------------")
            continue
```

### Environment

- APISIX version (run `apisix version`): 2.15.3
- Operating system (run `uname -a`):
- OpenResty / Nginx version (run `openresty -V` or `nginx -V`):
- etcd version, if relevant (run `curl http://127.0.0.1:9090/v1/server_info`):
- APISIX Dashboard version, if relevant:
- Plugin runner version, for issues related to plugin runners:
- LuaRocks version, for installation issues (run `luarocks --version`):
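For reference, the failure occurs because `json.loads` expects exactly one JSON document per call, while the coalesced chunk contains two concatenated objects. Below is a rough client-side sketch (my own workaround, not a fix for the proxy behaviour) that buffers the stream and splits concatenated objects with `json.JSONDecoder.raw_decode`; the `{apisix_ip_addr}` placeholder and request fields are the same as in the code above.

```python
import json

import requests

decoder = json.JSONDecoder()


def iter_json_objects(response):
    """Yield each JSON object from a streamed response, even when the proxy
    merges several objects into a single chunk."""
    buffer = ""
    for chunk in response.iter_content(chunk_size=None):
        if not chunk:
            continue
        buffer += chunk.decode("utf-8")
        while buffer:
            buffer = buffer.lstrip()
            if not buffer:
                break
            try:
                # raw_decode returns the parsed object and the index where it
                # ends, so '{"text":"a"}{"text":"b"}' yields two objects.
                obj, end = decoder.raw_decode(buffer)
            except json.JSONDecodeError:
                # Incomplete object: keep the remainder and wait for more data.
                break
            yield obj
            buffer = buffer[end:]


# Same request as in the report; {apisix_ip_addr} is the placeholder from above.
url = "http://{apisix_ip_addr}/api/stream/chat"
data = {
    "prompt": "xxxxxxxxx",
    "max_new_tokens": 1024,
    "max_context_tokens": 5120,
    "temperature": 0,
    "history": [],
    "new_session": False,
}

with requests.post(url=url, json=data, stream=True) as resp:
    for obj in iter_json_objects(resp):
        print(obj)
```

This only hides the symptom on the client side; the question remains why the last chunks are coalesced behind OpenResty/APISIX but not behind HAProxy (presumably related to buffering of the chunked/event-stream response in the proxy).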
