Thanks, this helps.  I'm looking into patching the BlurReducer so that
when a Row hits maxRecordsPerRow, it indexes what it can of the row
instead of dropping it completely.  Is there a better approach? :)
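
To make the idea concrete, here's a rough sketch of the behavior I have
in mind.  This is not the actual BlurReducer code: the class, method and
field names below are made up for illustration, and it assumes the
reducer sees a row's records as a plain iterator.  It keeps the first
maxRecordsPerRow records and only counts the rest, so memory stays
bounded by the cap.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch, not the real BlurReducer: keeps the first
// maxRecordsPerRow records of a row and counts the rest as dropped.
class RowTruncationSketch {

  // Holds the records that were kept plus how many were skipped.
  static final class Result<T> {
    final List<T> kept;
    final long dropped;

    Result(List<T> kept, long dropped) {
      this.kept = kept;
      this.dropped = dropped;
    }
  }

  static <T> Result<T> takeUpTo(Iterator<T> records, int maxRecordsPerRow) {
    List<T> kept = new ArrayList<T>();
    long dropped = 0;
    while (records.hasNext()) {
      T record = records.next();
      if (kept.size() < maxRecordsPerRow) {
        kept.add(record);   // still under the cap, so index this one
      } else {
        dropped++;          // over the cap: skip it, but keep draining
      }
    }
    return new Result<T>(kept, dropped);
  }
}
```

The dropped count could feed a counter so truncated rows are at least
visible after the job finishes.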

--tim

On Fri, May 3, 2013 at 10:44 AM, Aaron McCurry <[email protected]> wrote:
> BlurTask._maxRecordCount
>
> This is used for testing, so that you can exit a mapper after N number of
> records.
>
> BlurTask._maxRecordsPerRow
>
> This sets the maximum number of records allowed in a single row.  Be
> careful raising it, because that may run the reducer out of memory.  I have
> a patch that I can apply that removes this limit, but for now it's still
> risky to increase this too much.
>
> BlurTask._ramBufferSizeMB
>
> This is the Lucene writer buffer.  Larger values normally increase indexing
> throughput.
>
> Aaron
>
>
> On Fri, May 3, 2013 at 10:30 AM, Tim Williams <[email protected]> wrote:
>
>> I have an instance where I need to increase max records per row, but
>> before I do I want to understand the relationship (if there is one)
>> between:
>>
>> BlurTask._maxRecordCount
>> BlurTask._maxRecordsPerRow
>> BlurTask._ramBufferSizeMB
>>
>> I understand maxRecordsPerRow, but in looking into this found I don't
>> understand the _maxRecordCount and/or what interplay might exist with
>> buffer size.
>>
>> Thanks,
>> --tim
>>
