Thanks Deanna,
I actually figured out what the problem was.
The function factored_joint_mvn in
tensorflow_probability/python/sts/internal/util.py uses
LinearOperatorBlockDiag without passing a name argument. When no name is
provided, LinearOperatorBlockDiag simply concatenates the names of all of
its input operators:
```
if name is None:
  # Using ds to mean direct sum.
  name = "_ds_".join(operator.name for operator in operators)
```
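For illustration only (a hypothetical sketch, not our actual model), nesting
block-diagonal operators without a name shows how quickly that default name
grows:

```python
import tensorflow as tf

# Hypothetical sketch: repeatedly wrap an operator in LinearOperatorBlockDiag
# without passing a name, and watch the derived name roughly double per level.
op = tf.linalg.LinearOperatorIdentity(num_rows=2)
for level in range(1, 6):
    op = tf.linalg.LinearOperatorBlockDiag([op, op])
    print(level, len(op.name))  # name length grows geometrically
```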
Because we have a rather large state space model, this was producing graph
node names that were enormous (literally tens of thousands of characters).
Repeated across a large graph, those names produced ridiculously large
protobuf messages; in fact they were the only thing sending the message
lengths sky high. Using our own internal equivalent of factored_joint_mvn
and assigning an explicit name reduced the protobuf message sizes to normal
levels and cut the running process's memory usage by several GB.
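As a rough sketch of the idea (the operators and the name string here are
just placeholders, not our actual code), the key change is to pass a short
explicit name rather than letting the operator derive one from its children:

```python
import tensorflow as tf

# Placeholder operators standing in for the per-component operators that
# factored_joint_mvn would normally combine.
ops = [tf.linalg.LinearOperatorIdentity(num_rows=2) for _ in range(100)]

# Without a name, the operator's name is all 100 child names joined with
# "_ds_" (thousands of characters); with an explicit name it stays short.
block = tf.linalg.LinearOperatorBlockDiag(ops, name="joint_scale")
print(block.name)  # -> "joint_scale"
```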
Hope that's helpful to someone else who hits the same problem one day!
Gareth
On Monday, 2 May 2022 at 19:16:12 UTC+1 [email protected] wrote:
> In general, the best ways to get around the message limit are either to
> split the messages and then recombine them, or to use streaming. See this
> documentation
> <https://developers.google.com/protocol-buffers/docs/techniques#large-data>
> for more information. Since this is through TensorFlow, some of these
> techniques might not work, so reaching out to them directly is probably a
> good idea.
>
> On Saturday, April 9, 2022 at 6:59:07 AM UTC-7 [email protected] wrote:
>
>> Hi there,
>>
>> I'm creating a model in TensorFlow, which I think is erroring because
>> it's hitting the protobuf message hard limit when compiling with
>> `@tf.function`.
>>
>> I printed the message length just before the error and it's 2217676732,
>> so it makes sense that it's hitting the 2GB limit.
>>
>> I'm just wondering, is there any way to get around this that anyone knows
>> of? I read about Cap'n Proto here https://stackoverflow.com/a/34186672 but
>> am not sure how I'd be able to connect this to TensorFlow.
>>
>> Is there any way to split/recombine the messages? Perhaps this is more of
>> a TensorFlow question, so I'll cross-post on the TensorFlow group in case
>> anyone there has ideas.
>>
>> Many thanks,
>> Gareth
>>
>