sthetland commented on a change in pull request #12143:
URL: https://github.com/apache/druid/pull/12143#discussion_r784389835



##########
File path: docs/ingestion/ingestion-spec.md
##########
@@ -463,7 +463,7 @@ is:
 |-----|-----------|-------|
 |type|Each ingestion method has its own tuning type code. You must specify the 
type code that matches your ingestion method. Common options are `index`, 
`hadoop`, `kafka`, and `kinesis`.||
 |maxRowsInMemory|The maximum number of records to store in memory before 
persisting to disk. Note that this is the number of rows post-rollup, and so it 
may not be equal to the number of input records. Ingested records will be 
persisted to disk when either `maxRowsInMemory` or `maxBytesInMemory` are 
reached (whichever happens first).|`1000000`|
-|maxBytesInMemory|The maximum aggregate size of records, in bytes, to store in 
the JVM heap before persisting. This is based on a rough estimate of memory 
usage. Ingested records will be persisted to disk when either `maxRowsInMemory` 
or `maxBytesInMemory` are reached (whichever happens first). `maxBytesInMemory` 
also includes heap usage of artifacts created from intermediary persists. This 
means that after every persist, the amount of `maxBytesInMemory` until next 
persist will decreases, and task will fail when the sum of bytes of all 
intermediary persisted artifacts exceeds `maxBytesInMemory`.<br /><br />Setting 
maxBytesInMemory to -1 disables this check, meaning Druid will rely entirely on 
maxRowsInMemory to control memory usage. Setting it to zero means the default 
value will be used (one-sixth of JVM heap size).<br /><br />Note that the 
estimate of memory usage is designed to be an overestimate, and can be 
especially high when using complex ingest-time aggregators, including sketches. 
If this causes your indexing workloads to persist to disk too often, 
you can set maxBytesInMemory to -1 and rely on maxRowsInMemory 
instead.|One-sixth of max JVM heap size|
+|maxBytesInMemory|The maximum aggregate size of records, in bytes, to store in 
the JVM heap before persisting. This is based on a rough estimate of memory 
usage. Ingested records will be persisted to disk when either `maxRowsInMemory` 
or `maxBytesInMemory` are reached (whichever happens first). `maxBytesInMemory` 
also includes heap usage of artifacts created from intermediary persists. This 
means that after every persist, the amount of `maxBytesInMemory` until the next 
persist will decrease. If the sum of bytes of all intermediary persisted 
artifacts exceeds `maxBytesInMemory` the task fails<br /><br />Setting 
maxBytesInMemory to -1 disables this check, meaning Druid will rely entirely on 
maxRowsInMemory to control memory usage. Setting it to zero means the default 
value will be used (one-sixth of JVM heap size).<br /><br />Note that the 
estimate of memory usage is designed to be an overestimate, and can be 
especially high when using complex ingest-time aggregators, including sketches. 
If this causes your indexing workloads to persist to disk too often, you 
can set maxBytesInMemory to -1 and rely on maxRowsInMemory instead.|One-sixth 
of max JVM heap size|

Review comment:
       Looks better. Just a suggestion for a missing period and format 
consistency. 
   
   ```suggestion
   |maxBytesInMemory|The maximum aggregate size of records, in bytes, to store 
in the JVM heap before persisting. This is based on a rough estimate of memory 
usage. Ingested records will be persisted to disk when either `maxRowsInMemory` 
or `maxBytesInMemory` are reached (whichever happens first). `maxBytesInMemory` 
also includes heap usage of artifacts created from intermediary persists. This 
means that after every persist, the amount of `maxBytesInMemory` until the next 
persist will decrease. If the sum of bytes of all intermediary persisted 
artifacts exceeds `maxBytesInMemory`, the task fails.<br /><br />Setting 
`maxBytesInMemory` to -1 disables this check, meaning Druid will rely entirely 
on `maxRowsInMemory` to control memory usage. Setting it to zero means the 
default value will be used (one-sixth of JVM heap size).<br /><br />Note that 
the estimate of memory usage is designed to be an overestimate, and can be 
especially high when using complex ingest-time aggregators, including 
sketches. If this causes your indexing workloads to persist to disk too 
often, you can set `maxBytesInMemory` to -1 and rely on `maxRowsInMemory` 
instead.|One-sixth of max JVM heap size|
   ```
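
For readers checking the wording above, here is a minimal Python sketch of the behavior the table row describes. It is a toy model, not Druid's implementation: the function names `effective_max_bytes` and `should_persist` are hypothetical, and treating "reached" as greater-than-or-equal is an assumption.

```python
from typing import Optional


def effective_max_bytes(max_bytes_in_memory: int, max_heap_bytes: int) -> Optional[int]:
    """Resolve the configured value as the docs describe it.

    -1 disables the byte check; 0 selects the default (one-sixth of JVM heap).
    Hypothetical helper for illustration only.
    """
    if max_bytes_in_memory == -1:
        return None  # byte check disabled; only maxRowsInMemory applies
    if max_bytes_in_memory == 0:
        return max_heap_bytes // 6  # documented default
    return max_bytes_in_memory


def should_persist(rows: int, row_bytes: int, max_rows: int,
                   max_bytes: Optional[int], persisted_artifact_bytes: int = 0) -> bool:
    """Persist when either limit is reached, whichever happens first.

    Heap held by artifacts from earlier persists counts against the byte
    limit, so the room left for new rows shrinks after every persist.
    """
    if rows >= max_rows:
        return True
    if max_bytes is None:
        return False
    return row_bytes + persisted_artifact_bytes >= max_bytes
```

With a 6,000,000-byte heap, a setting of 0 resolves to a 1,000,000-byte limit, while -1 leaves only the row count in control.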




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



