I agree with your good points. (I am arguing this for the sake of our team. ^^) However, while cloud computing environments are moving toward immutable, shared-nothing architectures, the two points you mentioned (i, ii) are also given significant consideration in Hadoop MR; for example, its mechanism for detecting stragglers in heterogeneous systems. I think we need more specific reasons for HAMA BSP. I hope the BSP-based framework we are introducing gains real advantages by exploiting its differences from legacy BSP and its intended convergence with Hadoop.
Thanks.

Best Regards,
Sangwon Seo.

-----Original Message-----
From: Hyunsik Choi [mailto:[email protected]]
Sent: Saturday, December 12, 2009 11:40 AM
To: hama-dev
Subject: Re: Please review a document about the BSP package

Hi,

HAMA aims to be a scientific package for cloud computing environments. As you know, cloud computing is not just a business or trend keyword. It is an inevitable movement, because cloud computing provides business agility, which is very important to some kinds of companies. Many companies are already using such environments.

Actually, our BSP is not yet different from traditional BSP. However, BSP on Hadoop is an attempt to develop another distributed computing model for cloud computing environments. In addition, we know which direction our BSP has to go.

In a cloud computing environment, computing resources are readily allocated without interference. Thus, a cloud computing environment needs to be based on a shared-nothing architecture, so that independent computing resources connected over a network can be assigned easily. In a shared-nothing architecture, transactional operations that perform frequent reads and writes are not appropriate, since transactional requirements are hard to satisfy. Consequently, data processing in this environment should be analytical processing (mostly read processing).

To develop a design for this environment, we have to consider some important characteristics of cloud computing and analytical processing.

i) Analytical processing usually takes several hours or days. Therefore, we have to consider fault tolerance if we don't want to restart a long-running job.
ii) The capabilities of cluster nodes in the cloud may not be uniform. In other words, their hardware specifications may differ from one another. In this situation, processing continues until the jobs on the slowest instance have finished. MR has a mechanism to handle this.

These characteristics have to be considered in a cloud computing environment.
Currently, our BSP does not yet handle these characteristics well and is still too simple a framework. However, we have some ideas for addressing them. Later, we will try to implement these features and make further improvements.

Best regards,
--
Hyunsik Choi
Database & Information Systems Group, Korea Univ.
http://diveintodata.org

On Fri, Dec 11, 2009 at 4:55 PM, Sangwon Seo <[email protected]> wrote:
> Cool~!
> Are there any differences from traditional BSP?
> For example (as others would expect), MapReduce also has good advantages,
> so if an application does not fit MR, it is better to use another method such as MPI/OpenMP or one of the other BSP packages.
> That is, is there a meaningful comparison between BSP and MR?
> (Prof. Kim also asked this question before.)
> I mean, why don't you add the answer to the slides?
>
> Thanks.
>
> Best Regards,
> Sangwon Seo.
>
> Sangwon Seo
> Computer Architecture Lab,
> Computer Science Department, School of EECS,
> KAIST (Korea Advanced Institute of Science and Technology), Republic of Korea.
> http://smiler-note.blogspot.com
>
> -----Original Message-----
> From: Edward J. Yoon [mailto:[email protected]]
> Sent: Friday, December 11, 2009 11:31 PM
> To: [email protected]
> Subject: Please review a document about the BSP package
>
> Hello all,
>
> I've worked on documenting the BSP package with Hyunsik Choi --
> http://docs.google.com/present/view?id=dc8qtchr_1f44rjccd
>
> Any advice is welcome.
> --
> Best Regards, Edward J. Yoon @ NHN, corp.
> [email protected]
> http://blog.udanax.org
