prateekm commented on a change in pull request #1008: SAMZA-2174: Throw a
record too large exception for oversized records in changelog
URL: https://github.com/apache/samza/pull/1008#discussion_r297859339
##########
File path: docs/learn/documentation/versioned/jobs/configuration-table.html
##########
@@ -1687,6 +1687,50 @@ <h1>Samza Configuration Reference</h1>
</td>
</tr>
+ <tr>
+ <td class="property"
id="stores-changelog-max-message-size-bytes">stores.<span
class="store">store-name</span>.changelog.max.message.size.bytes</td>
+ <td class="default">1000000</td>
+ <td class="description">
+ This property sets the maximum size of the messages
allowed in the changelog.
+ The default value is 1 MB.
+ </td>
+ </tr>
+
+ <tr>
+ <td class="property"
id="stores-expect-large-messages">stores.<span
class="store">store-name</span>.expect.large.messages</td>
+ <td class="default">false</td>
+ <td class="description">
+ This property, when turned on, tells the system to
expect large messages to be put in the stores.
+ It will then look out for any large messages greater
than
+ <a href="#stores-changelog-max-message-size-bytes"
class="property">stores.*.changelog.max.message.size.bytes</a>
+ and throw a SamzaException when it finds one, stating
that the record is too large.
+ In the case of using CachedStore, it will serialize
the message first, validate
+ its size and then cache it if the size is of
permissible limit.
+ This particular case of using CachedStore causes a
performance degradation since
+ we end up serializing every time before putting the
values in the cache.
+ When this property is turned on, we ignore the value of
+ <a href="#stores-drop-large-messages"
class="property">stores.*.drop.large.messages</a>.
+ The default value for this config is false. When this
property is not set,
+ <a href="#stores-drop-large-messages"
class="property">stores.*.drop.large.messages</a>
+ defines the behaviour to be executed.
+ </td>
+ </tr>
+
+ <tr>
+ <td class="property"
id="stores-drop-large-messages">stores.<span
class="store">store-name</span>.drop.large.messages</td>
+ <td class="default">false</td>
+ <td class="description">
+ This property, when turned on, tells the system to
drop any large messages instead of
Review comment:
s/"tells the system to drop"/"causes messages larger than ... to be dropped
from the underlying store and the changelog"/. The next line seems redundant;
let's remove it or consolidate it into this one.
CachedStore is an implementation detail, not public API, so let's not refer to
it in the documentation. Maybe say "When the store object cache is enabled"
(or refer to the object cache size config).
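For readers of this thread, the precedence the three properties describe could be sketched roughly as below. This is only an illustration of the documented semantics: the class, enum, and method names are hypothetical, `IllegalStateException` stands in for SamzaException, and the fall-through case (both flags off) is an assumption, since the quoted hunk does not say what happens then.

```java
// Hypothetical sketch of the documented semantics; not Samza's actual code.
final class LargeMessagePolicy {
    enum Outcome { ACCEPT, DROP }

    private final int maxMessageSizeBytes;     // stores.*.changelog.max.message.size.bytes
    private final boolean expectLargeMessages; // stores.*.expect.large.messages
    private final boolean dropLargeMessages;   // stores.*.drop.large.messages

    LargeMessagePolicy(int maxMessageSizeBytes,
                       boolean expectLargeMessages,
                       boolean dropLargeMessages) {
        this.maxMessageSizeBytes = maxMessageSizeBytes;
        this.expectLargeMessages = expectLargeMessages;
        this.dropLargeMessages = dropLargeMessages;
    }

    // Decide what to do with an already-serialized value.
    Outcome check(byte[] serializedValue) {
        if (serializedValue.length <= maxMessageSizeBytes) {
            return Outcome.ACCEPT;
        }
        // expect.large.messages takes precedence: drop.large.messages is ignored.
        if (expectLargeMessages) {
            // Stand-in for the SamzaException mentioned in the docs.
            throw new IllegalStateException(
                "Record too large: " + serializedValue.length
                    + " bytes exceeds limit of " + maxMessageSizeBytes);
        }
        if (dropLargeMessages) {
            return Outcome.DROP;
        }
        // Assumption: with both flags off the record is passed through unchanged.
        return Outcome.ACCEPT;
    }
}
```

If the docs end up describing a different behaviour for the both-flags-off case, the last branch is the part to revisit.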