>> Are we sure JNI is a real problem?  It really seems like the right tool 
>> for the job.  Greg seems to remember them asking who would maintain the 
>> (non-java) JNI bits, but even if that's us and not them (which is probably 
>> the way to go anyway), I don't see that that's a problem.
> 
> Yeah, it's sort of a wash. A nice goal would be to have a patch that allowed 
> Hadoop to not require any additional components (i.e. JNI packages) from the 
> Ceph repository. Given that the Ceph infrastructure will be installed anyway 
> in the case of Hadoop, it's a bit of a toss-up.

JNI isn't much _fun_ to develop against, but it does the job just fine and 
follows the expected pattern of a stable interface, with nothing extravagant 
needed on either the Hadoop or the Ceph side.  Hadoop already has JNI pieces, 
so adding more shouldn't be a problem (though I do wish the automake part 
weren't so awkward to approach).
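
As a rough sketch of what a small, stable JNI surface could look like on the 
Java side: the class name, library name, and native method below are all 
hypothetical and are not taken from Hadoop or Ceph.  The only real API used 
is System.loadLibrary, which throws UnsatisfiedLinkError when the shared 
library cannot be found.

```java
/**
 * Illustrative only: CephNativeSketch, "cephfs_jni_hypothetical" and
 * nativeMount are made-up names, not an existing Hadoop or Ceph interface.
 */
final class CephNativeSketch {
    // Try to load the (hypothetical) native library once, at class-init time.
    private static final boolean AVAILABLE = tryLoad("cephfs_jni_hypothetical");

    /** Returns true if the named native library could be loaded. */
    static boolean tryLoad(String libName) {
        try {
            System.loadLibrary(libName);
            return true;
        } catch (UnsatisfiedLinkError | SecurityException e) {
            // Library missing or loading not permitted: report, don't crash.
            return false;
        }
    }

    static boolean isAvailable() {
        return AVAILABLE;
    }

    // The native entry point; it would be implemented in the non-Java JNI bits.
    private static native int nativeMount(String root);

    /** Thin Java wrapper that fails clearly when the library is absent. */
    static int mount(String root) {
        if (!AVAILABLE) {
            throw new IllegalStateException("native library not loaded");
        }
        return nativeMount(root);
    }

    public static void main(String[] args) {
        System.out.println("native library available: " + isAvailable());
    }
}
```

Keeping the native declarations behind one wrapper like this means the Java 
callers see a clear error, rather than an UnsatisfiedLinkError at first use, 
when the native pieces are not installed.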

I suppose there will need to be some automated check that Ceph is available 
as part of the ant build process.

> 
> -n
> 
>> Let's start with just providing the primary replica, at least until we 
>> find out whether hadoop takes advantage of additional ones (does HDFS read 
>> from the local non-primary replica?).
> 
> I believe that Hadoop will schedule a map job at a local replica for load 
> balancing, or to duplicate the work when a map is running slowly. Joe, can 
> you confirm this?
> 
When I ran my basic evaluation, Hadoop reported that about 75% of jobs ran on 
the same node as their data.  This seemed to be a result of overloading nodes.  
Someone will need to run a proper evaluation, as my experiment was small and 
blew up when I expanded my test cluster.  That was probably a misconfigured 
kernel upgrade or something else uninteresting that's irrelevant here.
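
For anyone repeating the measurement, a minimal sketch of how a locality 
fraction could be computed from per-task records.  The TaskRun type and the 
host names are hypothetical and the data is made up for illustration; it is 
not output from the evaluation above.  With only the primary replica exposed, 
each task's host set would hold a single entry.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class LocalitySketch {
    /** Hypothetical record: where a task ran and which hosts held its input. */
    static final class TaskRun {
        final String ranOn;
        final Set<String> dataHosts;

        TaskRun(String ranOn, Set<String> dataHosts) {
            this.ranOn = ranOn;
            this.dataHosts = dataHosts;
        }
    }

    /** Fraction of tasks that ran on a host holding a replica of their input. */
    static double localFraction(List<TaskRun> runs) {
        if (runs.isEmpty()) {
            return 0.0;
        }
        int local = 0;
        for (TaskRun r : runs) {
            if (r.dataHosts.contains(r.ranOn)) {
                local++;
            }
        }
        return (double) local / runs.size();
    }

    public static void main(String[] args) {
        // Made-up sample: three of the four tasks ran where their data lives.
        List<TaskRun> runs = Arrays.asList(
            new TaskRun("node1", Collections.singleton("node1")),
            new TaskRun("node2", Collections.singleton("node2")),
            new TaskRun("node3", new HashSet<>(Arrays.asList("node3", "node4"))),
            new TaskRun("node1", Collections.singleton("node4")));
        System.out.println("local fraction: " + localFraction(runs));
    }
}
```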

--Alex--