Sent from the wrong account, so it bounced. Sorry for the resend.
On Jun 13, 2012, at 3:54 PM, Ralph Castain wrote:
> Here is my initial cut at the API.
>
> typedef enum {
>     GRIDENGINE_RESOURCE_FILE,
>     GRIDENGINE_RESOURCE_PPN,
>     GRIDENGINE_RESOURCE_NUM_NODES
> } gridengine_resource_type_t;
>
> typedef struct {
>     gridengine_resource_type_t type;
>     char *constraint;
> } gridengine_resources_t;
>
> typedef void (*gridengine_allocation_cbfunc_t)(char *nodes);
>
> int gridengine_allocate(gridengine_resources_t *array, int num_elements,
>                         gridengine_allocation_cbfunc_t cbfunc);
>
> The function call is non-blocking, returning 0 if the request was
> successfully registered and < 0 if there was an error. The input params
> include an array of structs that contain the specification, an integer
> declaring how many elements are in the array, and the callback function to be
> called once the allocation is available.
>
> The callback function delivers a comma-delimited string of node names (the
> receiver is responsible for releasing the string). The string is NULL if the
> allocation could not be done - e.g., if one of the files cannot be located,
> or we lack permission to access it.
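>
> As a concrete sketch of the intended usage - the callback name, constraint
> values, and file path below are made up, and error handling is minimal:
>
> #include <stdio.h>
> #include <stdlib.h>
>
> /* Invoked once the allocation is available; the receiver owns the
>  * comma-delimited node string and must free it. */
> static void my_allocation_cb(char *nodes)
> {
>     if (NULL == nodes) {
>         fprintf(stderr, "allocation could not be done\n");
>         return;
>     }
>     printf("allocated nodes: %s\n", nodes);
>     free(nodes);
> }
>
> void request_allocation(void)
> {
>     gridengine_resources_t resources[2];
>
>     resources[0].type = GRIDENGINE_RESOURCE_PPN;
>     resources[0].constraint = "2";                     /* 2 slots per node */
>     resources[1].type = GRIDENGINE_RESOURCE_FILE;
>     resources[1].constraint = "hdfs://data/input.txt"; /* hypothetical URI */
>
>     /* Non-blocking: 0 = request registered, < 0 = error. */
>     if (gridengine_allocate(resources, 2, my_allocation_cb) < 0) {
>         fprintf(stderr, "failed to register allocation request\n");
>     }
> }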
>
> We can extend supported constraints over time, but I could see three for now:
>
> 1. ppn, indicating how many "slots" are required on each node. Default is 1.
>
> 2. file, specifying a filename to be accessible by the application. The
> filename will be given in URI form - I can see a few variants we need to
> support for now:
>
> (a) "hdfs://path-to-file" - need to get file shard locations from hdfs. There
> are two ways to do this today - I've given Rayson copies of both methods. One
> uses Java and works with all branches of HDFS. The other uses C and only
> works with Hadoop 2.0 (branch-2 of the Apache Hadoop code base).
>
> (b) "nfs://path" - network file system, should be available anywhere.
>
> (c) "file://path" - a file on the local file system, ORTE will take
> responsibility for copying it to the remote nodes.
>
> I'm sure we'll think of others - note that (b) and (c) are equivalent to "no
> constraint" on the allocator in terms of files. In such cases, you can
> essentially ignore that constraint.
>
> 3. num_nodes - specifying the total number of nodes to be allocated. If ppn
> is given but num_nodes is not, then we want slots allocated on every node
> that meets the rest of the constraints.
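>
> And for (2a), a rough sketch of the C route - this uses the libhdfs
> interface that ships with Hadoop 2.0, which may or may not be the exact
> method I sent Rayson; the connection parameters and path are placeholders:
>
> #include <stdio.h>
> #include "hdfs.h"   /* libhdfs, from Hadoop 2.0 (branch-2) */
>
> /* Print the hosts holding each block of an HDFS file. */
> void print_shard_locations(const char *path)
> {
>     hdfsFS fs = hdfsConnect("default", 0);   /* fs.defaultFS from config */
>     if (NULL == fs) {
>         fprintf(stderr, "cannot connect to HDFS\n");
>         return;
>     }
>     hdfsFileInfo *info = hdfsGetPathInfo(fs, path);
>     if (NULL != info) {
>         /* One entry per block; each entry is a NULL-terminated host list. */
>         char ***hosts = hdfsGetHosts(fs, path, 0, info->mSize);
>         if (NULL != hosts) {
>             for (int b = 0; NULL != hosts[b]; b++) {
>                 for (int h = 0; NULL != hosts[b][h]; h++) {
>                     printf("block %d: %s\n", b, hosts[b][h]);
>                 }
>             }
>             hdfsFreeHosts(hosts);
>         }
>         hdfsFreeFileInfo(info, 1);
>     }
>     hdfsDisconnect(fs);
> }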
>
> Comments welcome - this is just a first cut!
> Ralph
>
> On Jun 12, 2012, at 12:29 PM, Ralph Castain wrote:
>
>> I'll be happy to do so - I'm running behind as I've been hit with a high
>> priority issue. I hope to get back to this in the not-too-distant future
>> (probably August).
>>
>> Version 2.0 is in alpha now - no announced schedule for release, but I would
>> guess the August time frame again.
>>
>> MR+ doesn't replace HDFS, but replaces the "Job" class underneath Hadoop MR
>> so that it uses the OMPI infrastructure instead of YARN - this allows it to
>> run on any HPC system, and to use MPI. So people can write their mappers and
>> reducers with MPI calls in them (remember, OMPI now includes Java bindings!).
>>
>> The current version of MR+ (in the OMPI trunk) lets you write and execute
>> your own mappers and reducers, but does not support the Hadoop MR classes.
>> So if you want to write something that uses your own algorithms, you can do
>> so. There is a "word-count" example in the "examples" directory of the OMPI
>> code base.
>>
>> Ralph
>>
>>
>> On Jun 5, 2012, at 12:10 AM, Ron Chen wrote:
>>
>>> Ralph, when you have your C HDFS API ready, can you please let us know?
>>> And do you know when Apache Hadoop is going to release version 2.0?
>>>
>>>
>>> -Ron
>>>
>>> ----- Original Message -----
>>> From: Rayson Ho <[email protected]>
>>> To: Prakashan Korambath <[email protected]>
>>> Cc: Ralph Castain <[email protected]>; Ron Chen <[email protected]>;
>>> "[email protected]" <[email protected]>
>>> Sent: Monday, June 4, 2012 1:53 PM
>>> Subject: Re: [gridengine users] Hadoop Integration HOWTO (was: Hadoop
>>> Integration - how's it going)
>>>
>>> Prakashan,
>>>
>>> Ralph mentioned to me before that the C API bindings will be available
>>> in Apache Hadoop 2.0, which adds Google protocol buffers as one of the
>>> new features and thus supports non-Java HDFS bindings.
>>>
>>> AFAIK, EMC MapR replaces HDFS with something that has more HA features
>>> & performance. I don't know all the specific details but I do believe
>>> that most of the API interfaces are going to be the same as or very
>>> similar to the existing HDFS APIs.
>>>
>>> Rayson
>>>
>>> On Mon, Jun 4, 2012 at 1:24 PM, Prakashan Korambath <[email protected]>
>>> wrote:
>>>> Hi Rayson,
>>>>
>>>> Let me know when you have the C API bindings from Ralph ready. I can
>>>> help you guys with testing them out.
>>>>
>>>> Prakashan
>>>>
>>>>
>>>> On 06/04/2012 10:17 AM, Rayson Ho wrote:
>>>>>
>>>>> Hi Prakashan & Ron,
>>>>>
>>>>> I thought about this issue while I was writing & testing the HOWTO...
>>>>>
>>>>> but I didn't spend much more time on it as I needed to work on
>>>>> something else, and it requires an upcoming C API binding for HDFS
>>>>> from Ralph. Plus... I didn't want to pre-announce too many upcoming
>>>>> new features. :-)
>>>>>
>>>>> With the architecture of Prakashan's On-demand Hadoop Cluster, we can
>>>>> take advantage of Ralph's C HDFS API, and we can then easily write a
>>>>> scheduler plugin that queries HDFS block information. This scheduler
>>>>> plugin then affects scheduling decisions so that Open Grid
>>>>> Scheduler/Grid Engine can send jobs to the data, which IMO is the core
>>>>> idea behind Hadoop - scheduling jobs & tasks to the data.
>>>>>
>>>>>
>>>>> Note that we will also need to productionize the "Parallel Environment
>>>>> Queue Sort (PQS) Scheduler API", which was under technology preview in
>>>>> GE 2011.11:
>>>>>
>>>>> http://gridscheduler.sourceforge.net/Releases/ReleaseNotesGE2011.11.pdf
>>>>>
>>>>> Rayson
>>>>>
>>>>> On Mon, Jun 4, 2012 at 12:55 PM, Prakashan Korambath <[email protected]>
>>>>> wrote:
>>>>>>
>>>>>> Hi Ron,
>>>>>>
>>>>>> I don't have anything planned beyond what I released right now. The
>>>>>> idea is to leave what Hadoop does best to Hadoop, and what SGE or any
>>>>>> other scheduler does best to the scheduler. I believe somebody from
>>>>>> SDSC also released a similar strategy for PBS/Torque. I worked only on
>>>>>> the SGE integration because I mostly use SGE.
>>>>>>
>>>>>> Prakashan
>>>>>>
>>>>>> On 06/04/2012 09:45 AM, Ron Chen wrote:
>>>>>>>
>>>>>>>
>>>>>>> Hi Prakashan,
>>>>>>>
>>>>>>>
>>>>>>> I am trying to understand your integration, and it looks like Ravi
>>>>>>> Chandra Nallan's Hadoop Integration.
>>>>>>>
>>>>>>> One of the improvements in Daniel Templeton's Hadoop Integration is
>>>>>>> that he models HDFS data as resources, and thus can schedule jobs to
>>>>>>> data. Is scheduling jobs to data a planned feature of your "On-Demand
>>>>>>> Hadoop Cluster" integration?
>>>>>>>
>>>>>>> For those who don't know Ravi Chandra Nallan: he was with Sun
>>>>>>> Microsystems when he developed the integration. Last I checked, he
>>>>>>> was with Oracle.
>>>>>>>
>>>>>>> -Ron
>>>>>>>
>>>>>>> ----- Original Message -----
>>>>>>> From: Rayson Ho <[email protected]>
>>>>>>> To: Prakashan Korambath <[email protected]>
>>>>>>> Cc: "[email protected]" <[email protected]>
>>>>>>> Sent: Friday, June 1, 2012 3:04 PM
>>>>>>> Subject: Re: [gridengine users] Hadoop Integration HOWTO (was: Hadoop
>>>>>>> Integration - how's it going)
>>>>>>>
>>>>>>> Thanks again Prakashan for the contribution!
>>>>>>>
>>>>>>> Rayson
>>>>>>>
>>>>>>> On Fri, Jun 1, 2012 at 1:25 PM, Prakashan Korambath <[email protected]>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> Thank you, Rayson! I appreciate you taking the time to upload the
>>>>>>>> tar files and write the HOWTO.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>>
>>>>>>>> Prakashan
>>>>>>>>
>>>>>>>> On 06/01/2012 10:19 AM, Rayson Ho wrote:
>>>>>>>>>
>>>>>>>>> I've reviewed the integration, and wrote a short Grid Engine Hadoop
>>>>>>>>> HOWTO:
>>>>>>>>>
>>>>>>>>> http://gridscheduler.sourceforge.net/howto/GridEngineHadoop.html
>>>>>>>>>
>>>>>>>>> The difference between the 2 methods (original SGE 6.2u5 vs
>>>>>>>>> Prakashan's) is that with Prakashan's approach, Grid Engine is used
>>>>>>>>> for resource allocation, and the Hadoop job scheduler/Job Tracker is
>>>>>>>>> used to handle all the MapReduce operations. A Hadoop cluster is
>>>>>>>>> created on demand with Prakashan's approach, but in the original SGE
>>>>>>>>> 6.2u5 method Grid Engine replaces the Hadoop job scheduler.
>>>>>>>>>
>>>>>>>>> As standard Grid Engine PEs are used in this new approach, one can
>>>>>>>>> call "qrsh -inherit" and use Grid Engine's method to start Hadoop
>>>>>>>>> services on remote nodes, and thus get full job control, job
>>>>>>>>> accounting, and cleanup-at-termination benefits like any other
>>>>>>>>> tight PE job!
>>>>>>>>>
>>>>>>>>> Rayson
>>>>>>>>>
>>>>>>>>> On Tue, May 29, 2012 at 10:36 AM, Prakashan
>>>>>>>>> Korambath <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> I put my scripts in a tar file and sent it to Rayson yesterday so
>>>>>>>>>> that he can put it in a common place for download.
>>>>>>>>>>
>>>>>>>>>> Prakashan
>>>>>>>>>>
>>>>>>>>>> On 05/29/2012 07:18 AM, Jesse Becker wrote:
>>>>>>>>>>>
>>>>>>>>>>> On Mon, May 28, 2012 at 12:00:24PM -0400, Prakashan
>>>>>>>>>>> Korambath wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> This is how we run Hadoop using Grid Engine (or, for that
>>>>>>>>>>>> matter, any scheduler with appropriate alterations):
>>>>>>>>>>>>
>>>>>>>>>>>> http://www.ats.ucla.edu/clusters/hoffman2/hadoop/default.htm
>>>>>>>>>>>>
>>>>>>>>>>>> Basically, run either a prolog or a script called from inside
>>>>>>>>>>>> the submission command file itself to parse the contents of
>>>>>>>>>>>> PE_HOSTFILE and create the Hadoop *.site.xml, masters, and
>>>>>>>>>>>> slaves files at run time. This methodology is suitable for any
>>>>>>>>>>>> scheduler, as it is not dependent on any particular one. If
>>>>>>>>>>>> there is interest, I can post the prologue script. Thanks.
>>>>>>>>>>>
>>>>>>>>>>> Please do.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>>
>>
>
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users