From a source I receive streamed data whose size is not known until the final processing is done, but it is at least 10 GB. I have to send this large amount of data using `gRPC`.
Note that this large amount of data will be passed through `gRPC` when the processing of the stream is done. Until then, I plan to store all the values in a `vector`.

##### Regarding sending a large amount of data, I searched for ideas and found:

- [This](https://stackoverflow.com/questions/68644134/how-to-split-long-messages-into-short-messages-in-grpc-c), where it is advised not to pass large data using `gRPC` and to use another messaging protocol instead. I am limited to `gRPC`, though (at least for now).
- [This post](https://stackoverflow.com/questions/60538700/how-could-i-send-message-of-many-fields-with-c-grpc-stream), from which I tried to learn how a `chunk message` can be sent, but I am not sure whether it is related to my problem.

##### My desired data format and what I have done so far

`big_data.proto`

```proto
syntax = "proto3";

package demo_grpc;

message Streaming {
  repeated int32 data_collection = 1 [packed=true];
  int32 index = 2;
}
```

`main.proto`

```proto
syntax = "proto3";

package demo_grpc;

import "myproto/big_data.proto";

message S_Response {
  Streaming server_data_collection = 7;
}

message C_Request {
  Streaming client_data_collection = 4;
}

service AddressBook {
  rpc GetAddress(C_Request) returns (S_Response) {}
}
```

Filling the repeated field `data_collection`:

```cpp
int64_t vector_size = 2ULL*1024*1024*1024;
for(int64_t i = 0; i < vector_size; i++)
{
    request_.mutable_client_data_collection()->add_data_collection(5); // 5 is dummy data
}
```

In short, this is what I have thought of:

- Since a `gRPC` client by default cannot send a message larger than 4 MB ([info found here](https://docs.microsoft.com/en-us/aspnet/core/grpc/security?view=aspnetcore-6.0#message-size-limits)), I have to use chunks. Unfortunately, I could not find a starting point for this.
- I also found that the maximum message size can be raised. A solution is described [here](https://github.com/tensorflow/serving/issues/1382#issuecomment-730428996).
  I tried to apply it on the client side.

`client.cpp`

```cpp
auto cargs = grpc::ChannelArguments();
cargs.SetMaxReceiveMessageSize(3ULL * 1024 * 1024 * 1024); // 3 GB
cargs.SetMaxSendMessageSize(3ULL * 1024 * 1024 * 1024);

auto channel = grpc::CreateCustomChannel(client_address, grpc::InsecureChannelCredentials(), cargs);
```

And `server.cpp`

```cpp
grpc::ServerBuilder builder;
builder.SetMaxReceiveMessageSize(3ULL * 1024 * 1024 * 1024);
builder.SetMaxSendMessageSize(3ULL * 1024 * 1024 * 1024);
builder.AddListeningPort(server_address, grpc::InsecureServerCredentials());
```

- After executing the binary I found two problems in the server response:
  - The size of the variable passed from client to server is zero (see [here](https://github.com/atifkarim/gRPC_CPP_CMake/blob/main/server/stream_response.h#L11))
  - Error message

```sh
Client make a stream data of size -2147483648
E0320 13:31:44.818390120 7560 channel_args.cc:258] grpc.max_send_message_length ignored: it must be >= -1
E0320 13:31:44.818512271 7560 channel_args.cc:258] grpc.max_receive_message_length ignored: it must be >= -1
E0320 13:31:44.818557739 7560 channel_args.cc:258] grpc.max_receive_message_length ignored: it must be >= -1
```

#### Another [approach](https://nanxiao.me/en/message-length-setting-in-grpc/) I found to send the whole data at once, which also failed

`client.cpp`

```cpp
auto cargs = grpc::ChannelArguments();
cargs.SetMaxReceiveMessageSize(-1); // unlimited
cargs.SetMaxSendMessageSize(-1);

auto channel = grpc::CreateCustomChannel(client_address, grpc::InsecureChannelCredentials(), cargs);
```

`server.cpp`

```cpp
grpc::ServerBuilder builder;
builder.SetMaxReceiveMessageSize(LONG_MAX);
builder.SetMaxSendMessageSize(LONG_MAX);
builder.AddListeningPort(server_address, grpc::InsecureServerCredentials());
```

Trial file to generate big data:

```cpp
// Assumed to hold 2 GB worth of integers: (2*1024*1024*1024*8)/32.
int64_t vector_size = 536870912;
for(int64_t i = 0; i < vector_size; i++)
{
    request_.mutable_client_data_collection()->add_data_collection(5);
}
```

During the build I got these warnings:

```sh
/server/server.cpp:117:35: warning: implicit conversion from 'long' to 'int' changes value from 9223372036854775807 to -1 [-Wconstant-conversion]
builder.SetMaxReceiveMessageSize(LONG_MAX);
~~~~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~
/usr/lib/llvm-10/lib/clang/10.0.0/include/limits.h:47:19: note: expanded from macro 'LONG_MAX'
#define LONG_MAX __LONG_MAX__
^~~~~~~~~~~~
<built-in>:94:22: note: expanded from here
#define __LONG_MAX__ 9223372036854775807L
^~~~~~~~~~~~~~~~~~~~
/server/server.cpp:118:32: warning: implicit conversion from 'long' to 'int' changes value from 9223372036854775807 to -1 [-Wconstant-conversion]
builder.SetMaxSendMessageSize(LONG_MAX);
~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~
/usr/lib/llvm-10/lib/clang/10.0.0/include/limits.h:47:19: note: expanded from macro 'LONG_MAX'
#define LONG_MAX __LONG_MAX__
^~~~~~~~~~~~
<built-in>:94:22: note: expanded from here
#define __LONG_MAX__ 9223372036854775807L
```

And got this error response:

```sh
Client make a stream data of size 536870912
[libprotobuf ERROR /user/grpc/third_party/protobuf/src/google/protobuf/message_lite.cc:410] demo_grpc.C_Request exceeded maximum protobuf size of 2GB: 2825726993
E0320 17:39:46.862406307 1551158 call_op_set.h:317] assertion failed: serializer_(msg_).ok()
Aborted (core dumped)
```

##### My questions are now:

- If I need to send this amount of data, is `gRPC` suitable for it? If yes, do I need to `chunk` the data, or can I send it all at once?
- If I use `chunk`, will it be a `bidirectional stream` or a `Unary rpc`? (To repeat, it will not be a `ping-pong`: the whole data will pass from client to server, be processed on the server, and then be returned from server to client.) As I have no experience with `chunk`, I cannot visualize it.
- If I have to pass this large data using a stream, I assume I have to define the `rpc service` as follows

```proto
service AddressBook {
  rpc GetAddress(stream C_Request) returns (stream S_Response) {}
}
```

  But in that case, can I still use other `Unary rpc` methods? I have another rpc running that is unary in nature and is defined in the same `proto` file.

A solution with an example would be really helpful. The [GitHub repo is available here](https://github.com/atifkarim/gRPC_CPP_CMake).
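
To make the `chunk` idea easier to visualize, here is a minimal sketch of only the splitting step, written without any gRPC calls so it compiles on its own. The names `ForEachChunk`, `kValuesPerChunk` and the `sink` callback are hypothetical and not part of gRPC or of the repo. The chunk size is an assumption: a packed `int32` takes at most 10 bytes on the wire, so 256 Ki values per chunk should stay below the 4 MB default mentioned above. In a real client the sink would presumably copy each slice into the `data_collection` field of one `Streaming` message, set `index`, and write that message to a streaming call.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative chunk size (an assumption, not a gRPC constant): 256 Ki values.
// At up to 10 bytes per packed int32, one chunk is at most about 2.5 MB.
constexpr std::size_t kValuesPerChunk = 256 * 1024;

// Walks `values` in consecutive slices of at most `max_values` elements and
// calls `sink(data, count, index)` once per slice. Returns the slice count.
template <typename Sink>
std::size_t ForEachChunk(const std::vector<std::int32_t>& values, Sink&& sink,
                         std::size_t max_values = kValuesPerChunk) {
    assert(max_values > 0);
    std::size_t chunk_count = 0;
    for (std::size_t begin = 0; begin < values.size(); begin += max_values) {
        // The last slice may be shorter than max_values.
        const std::size_t count = std::min(max_values, values.size() - begin);
        sink(values.data() + begin, count, static_cast<std::int32_t>(chunk_count));
        ++chunk_count;
    }
    return chunk_count;
}
```

For example, calling `ForEachChunk` on ten values with `max_values` set to 4 invokes the sink three times, with 4, 4 and 2 values and indexes 0, 1 and 2. Because each slice is handed over as soon as it is cut, no second copy of the whole data set has to be held in memory.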

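
The negative size and the `it must be >= -1` log lines look consistent with values that do not fit in a 32-bit `int`. The clang warning shows that the `ServerBuilder` setters receive an `int`, and the log suggests the same for the channel arguments. The arithmetic can be checked with the sketch below; `FitsInInt`, `WrapToInt32` and `kGiB` are hypothetical helper names, and the code assumes a platform where `int` is 32 bits wide.

```cpp
#include <cstdint>
#include <limits>

constexpr std::uint64_t kGiB = 1024ULL * 1024 * 1024;

// True when `bytes` can be passed to an `int` parameter without changing value.
constexpr bool FitsInInt(std::uint64_t bytes) {
    return bytes <= static_cast<std::uint64_t>(std::numeric_limits<int>::max());
}

// Value obtained when only the low 32 bits are kept and read as a signed
// two's-complement number, which is what a narrowing to a 32-bit int
// typically produces.
constexpr std::int64_t WrapToInt32(std::uint64_t value) {
    return (value & 0xFFFFFFFFULL) >= 0x80000000ULL
               ? static_cast<std::int64_t>(value & 0xFFFFFFFFULL) - 0x100000000LL
               : static_cast<std::int64_t>(value & 0xFFFFFFFFULL);
}

// 3 GiB does not fit and wraps to a value below -1.
static_assert(!FitsInInt(3 * kGiB), "3 GiB exceeds a 32-bit int");
static_assert(WrapToInt32(3 * kGiB) == -1073741824LL, "3 GiB wraps negative");

// 2 GiB elements wrap to the size printed in the first log.
static_assert(WrapToInt32(2 * kGiB) == -2147483648LL, "2^31 wraps to INT_MIN");

// LONG_MAX (64-bit) wraps to -1, matching the clang warning.
static_assert(WrapToInt32(0x7FFFFFFFFFFFFFFFULL) == -1, "LONG_MAX wraps to -1");
```

Under these assumptions, the largest size that can be passed unchanged is `2 * kGiB - 1` bytes, and protobuf itself reports a 2 GB limit per message in the error above, so a single message of 10 GB does not seem reachable by raising limits alone.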