Thanks, Doug. We'll take a look at modifying this parameter.

And yes, we'd like to contribute this once we have the performance at an
acceptable level.

Thanks.

Jonathan

On Wed, May 6, 2009 at 12:55 PM, Doug Cutting <cutt...@apache.org> wrote:

> Jonathan Seidman wrote:
>
>> We've created an implementation of FileSystem which allows us to use Sector
>> (http://sector.sourceforge.net/) as the backing store for Hadoop. This
>> implementation is functionally complete, and we can now run Hadoop MapReduce
>> jobs against data stored in Sector.
>>
>
> Please consider contributing this to Hadoop.
>
>> We're now looking at how to optimize this interface, since the
>> performance suffers considerably compared to MR processing run
>> against HDFS.
>>
>
> Have you tried setting mapred.min.split.size to a large value, so that
> files are not generally split?  Alternately, you might override
> FileInputFormat#computeSplitSize.
>
> Doug
>
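
For reference, here is a minimal sketch of why a large mapred.min.split.size
keeps files from being split. It assumes the split-size rule used by the
classic org.apache.hadoop.mapred.FileInputFormat, max(minSize, min(goalSize,
blockSize)), and reimplements it without any Hadoop dependency. The class
name SplitSizeSketch and the sizes are hypothetical and chosen only for
illustration; the behavior of a Sector-backed FileSystem may differ.

```java
// Sketch only: mirrors the split-size rule assumed above, with no Hadoop
// classes involved. Not the actual Hadoop or Sector implementation.
class SplitSizeSketch {

    // goalSize:  total input bytes divided by the requested number of maps
    // minSize:   the value of mapred.min.split.size
    // blockSize: the block size reported by the FileSystem for the file
    static long computeSplitSize(long goalSize, long minSize, long blockSize) {
        return Math.max(minSize, Math.min(goalSize, blockSize));
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long blockSize = 64 * mb;   // hypothetical block size
        long goalSize = 512 * mb;   // hypothetical goal size

        // With a small minimum, the split size is capped at the block size,
        // so a large file is cut into many block-sized splits.
        System.out.println(computeSplitSize(goalSize, 1L, blockSize));

        // With a very large minimum, the split size exceeds any file length,
        // so each file ends up in a single split.
        System.out.println(computeSplitSize(goalSize, Long.MAX_VALUE, blockSize));
    }
}
```

In a real job the same effect would come from setting mapred.min.split.size
in the job configuration, or from overriding computeSplitSize in a
FileInputFormat subclass, as Doug suggests above.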



-- 
Jonathan Seidman
Open Data Group
