[ https://issues.apache.org/jira/browse/ORC-228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16134314#comment-16134314 ]
Lefty Leverenz commented on ORC-228:
------------------------------------

Does the new config (orc.rows.between.memory.checks) need to be documented if it's mainly for testing?

> Make MemoryManagerImpl.ROWS_BETWEEN_CHECKS configurable
> -------------------------------------------------------
>
>                 Key: ORC-228
>                 URL: https://issues.apache.org/jira/browse/ORC-228
>             Project: ORC
>          Issue Type: Improvement
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman
>             Fix For: 1.5.0
>
>
> Currently addedRow() looks like
> {noformat}
> public void addedRow(int rows) throws IOException {
>   rowsAddedSinceCheck += rows;
>   if (rowsAddedSinceCheck >= ROWS_BETWEEN_CHECKS) {
>     notifyWriters();
>   }
> }
> {noformat}
> It would be convenient for testing to set ROWS_BETWEEN_CHECKS to a low value
> so that we can generate multiple stripes with very little data.
> Currently the only way to do this is to create a new MemoryManager that
> overrides this method and install it via OrcFile.WriterOptions, but this only
> works when you have control over creating the Writer.
> For example
> _org.apache.hadoop.hive.ql.io.orc.TestOrcRawRecordMerger.testRecordReaderNewBaseAndDelta()_
> There is no way to do this via some set of config params, for example to
> make a Hive query create multiple stripes with little data.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
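
For readers weighing whether the setting deserves documentation, here is a minimal sketch of what a configurable check interval amounts to. It is a standalone model, not ORC's MemoryManagerImpl: the class RowCheckSketch, the default of 5000, and the counter reset inside notifyWriters() are all assumptions made for illustration. Only the property name and the shape of addedRow() come from the issue text.

```java
import java.util.Properties;

/** Hypothetical stand-in for the row-count check quoted in the issue; not ORC code. */
class RowCheckSketch {
    // Property name as given in the comment on ORC-228.
    static final String ROWS_BETWEEN_CHECKS_KEY = "orc.rows.between.memory.checks";
    // Assumed default, used only when the property is absent.
    static final long ASSUMED_DEFAULT = 5000L;

    private final long rowsBetweenChecks;
    private long rowsAddedSinceCheck = 0;
    private int notifications = 0;

    RowCheckSketch(Properties conf) {
        long value = Long.parseLong(
            conf.getProperty(ROWS_BETWEEN_CHECKS_KEY, Long.toString(ASSUMED_DEFAULT)));
        if (value < 1) {
            throw new IllegalArgumentException(ROWS_BETWEEN_CHECKS_KEY + " must be >= 1");
        }
        this.rowsBetweenChecks = value;
    }

    // Same shape as the quoted addedRow(), with the constant replaced by a field.
    void addedRow(int rows) {
        rowsAddedSinceCheck += rows;
        if (rowsAddedSinceCheck >= rowsBetweenChecks) {
            notifyWriters();
        }
    }

    // Assumption: a check resets the counter. Here it only counts how often it ran.
    private void notifyWriters() {
        notifications++;
        rowsAddedSinceCheck = 0;
    }

    int notifications() {
        return notifications;
    }

    public static void main(String[] args) {
        Properties lowConf = new Properties();
        lowConf.setProperty(ROWS_BETWEEN_CHECKS_KEY, "2");
        RowCheckSketch low = new RowCheckSketch(lowConf);
        RowCheckSketch dflt = new RowCheckSketch(new Properties());
        for (int i = 0; i < 10; i++) {
            low.addedRow(1);
            dflt.addedRow(1);
        }
        // Ten rows trigger five checks at an interval of 2, and none at the assumed default.
        System.out.println(low.notifications() + " vs " + dflt.notifications());
    }
}
```

With a low interval, a handful of rows is enough to trigger the check repeatedly, which is what the issue wants for producing several stripes from little data. With the default interval, small test inputs never reach the check at all.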