Re: [ANNOUNCEMENT] A query system for BSP processing

Thomas Jungblut Fri, 07 Sep 2012 08:40:56 -0700

Although I think this is a great project, I don't think it will meet
the requirements yet.
You need a community and a charter to get it into incubation.


What about hosting it on Github?

2012/9/7 Leonidas Fegaras <[email protected]>

> Yes, this is a great idea. I have used Git on my own server but I don't
> know how to do this for the ASF. Could you please send me a link for
> setting up an open-source Apache project?
>
>
> On 09/05/2012 10:51 AM, Edward J. Yoon wrote:
>
>> If you can open source this then I'm sure the ASF community can help
>> you and make this software better.
>>
>> Pls feel free to ask us if you need any assistance donating source
>> code to the ASF or contributing to the Hama project in the future.
>>
>> On Thu, Aug 30, 2012 at 11:40 PM, Leonidas Fegaras <[email protected]> wrote:
>>
>>> Yes sure. I have fixed the bug with the repeat stopping condition but I
>>> have only tested pagerank on my small cluster. I still need to fix the
>>> k-means clustering (it's a special case because you improve a fixed
>>> number of points).
>>> Leonidas
>>>
>>>
>>> On Aug 30, 2012, at 9:02 AM, Edward J. Yoon wrote:
>>>
>>>> Shall we work together?
>>>>
>>>> On Fri, Aug 24, 2012 at 9:01 PM, Leonidas Fegaras <[email protected]> wrote:
>>>>
>>>>> Thank you very much for your interest and for testing my system.
>>>>> It seems that my release was premature: It worked for some random data
>>>>> but didn't for some others. It's a minor logical error that I will try
>>>>> to fix in the next few days. The problem is with the stopping condition
>>>>> of the repeat expression that calculates the new pagerank from the old.
>>>>> It must stop if ALL peers reach the specified precision. This is done by
>>>>> having those peers that need to continue send a message to others to
>>>>> continue. It seems that now when all peers agree at the same time, the
>>>>> program works fine. But if one finishes sooner, instead of continuing
>>>>> the repeat loop, it runs away to the next BSP step that follows the
>>>>> repeat, then exits prematurely and the system hangs. The casting errors
>>>>> are due to the run-away peers executing the wrong BSP steps reading
>>>>> wrong messages. Queries without repeat though are OK.
>>>>> By the way, I had a problem exchanging large amounts of data during
>>>>> sync (I discussed this with Thomas). My solution was to break a BSP
>>>>> superstep into multiple substeps so that each substep can handle a max
>>>>> number of messages. Of course my program has to collect all messages in
>>>>> a vector in memory. When the vector is too big, it is spilled to a local
>>>>> file. This moved the problem from the Hama side to my side and allowed
>>>>> me to handle larger data, especially in joins. I think this problem of
>>>>> exchanging large amounts of data during a superstep is currently a
>>>>> weakness of Hama.
>>>>> Leonidas
>>>>>
>>>>>
>>>>>
>>>>> On 08/24/2012 04:15 AM, Thomas Jungblut wrote:
>>>>>
>>>>>>
>>>>>> BTW, should we feature this on our website?
>>>>>>
>>>>>> 2012/8/24 Thomas Jungblut <[email protected]>
>>>>>>
>>>>>>> Hi Leonidas!
>>>>>>>
>>>>>>> I have to admit that I knew what was going on (and had to keep
>>>>>>> silent), but I have to say: Thank you very much!
>>>>>>> This will help many people write BSPs in an easier way.
>>>>>>>
>>>>>>> Of course this is not as fast as native BSP code; Hive and Pig
>>>>>>> suffer from the same problem in MR.
>>>>>>> But it gives people the opportunity to develop faster and get their
>>>>>>> code into production with just a minor time expense.
>>>>>>>
>>>>>>> And I think that we will gladly help you improve the BSP part of
>>>>>>> your framework. At least I would ;)
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>> 2012/8/24 Edward J. Yoon<[email protected]>
>>>>>>>
>>>>>>>> Here are a few of my test results on Oracle BDA (40G/s InfiniBand network).
>>>>>>>>
>>>>>>>> It seems slower than our PageRank example.
>>>>>>>>
>>>>>>>> P.S. There were some errors, so I couldn't test at large scale.
>>>>>>>> (java.lang.ClassCastException: hadoop.mrql.MR_int cannot be cast to
>>>>>>>> hadoop.mrql.Inv and java.lang.Error: Cannot clear a non-materialized
>>>>>>>> sequence ..., etc.)
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> == 100K nodes and 1M edges ==
>>>>>>>>
>>>>>>>> *** Using 10 BSP tasks (out of a max 10). Each task will handle about 2383611 bytes of input data.
>>>>>>>>
>>>>>>>> Run time: 30.384 secs
>>>>>>>>
>>>>>>>> *** Using 20 BSP tasks (out of a max 20). Each task will handle about 1191805 bytes of input data.
>>>>>>>>
>>>>>>>> Run time: 24.412 secs
>>>>>>>>
>>>>>>>> On Fri, Aug 24, 2012 at 9:36 AM, Edward J. Yoon <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Wow, very interesting. I'm going to install and test on my large
>>>>>>>>> cluster.
>>>>>>>>>
>>>>>>>>> On Fri, Aug 24, 2012 at 4:41 AM, Leonidas Fegaras <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Dear Hama users,
>>>>>>>>>> I am pleased to announce that the MRQL query processing system can
>>>>>>>>>> now evaluate SQL-like queries on a Hama cluster. MRQL is available at:
>>>>>>>>>>
>>>>>>>>>> http://lambda.uta.edu/mrql/
>>>>>>>>>>
>>>>>>>>>> MRQL (the Map-Reduce Query Language) is an SQL-like query language
>>>>>>>>>> for large-scale, distributed data analysis. MRQL is powerful enough
>>>>>>>>>> to express most common data analysis tasks over many different kinds
>>>>>>>>>> of raw data, including hierarchical data and nested collections, such
>>>>>>>>>> as XML data. MRQL can run in two modes: in MR (Map-Reduce) mode using
>>>>>>>>>> Apache Hadoop and in BSP (Bulk Synchronous Parallel) mode using
>>>>>>>>>> Apache Hama. Both modes use Apache's HDFS to read and write their
>>>>>>>>>> data.
>>>>>>>>>>
>>>>>>>>>> Note that the BSP mode is currently experimental (not fine-tuned
>>>>>>>>>> yet) and lacks any fault tolerance (if an error occurs, the entire
>>>>>>>>>> job must be restarted). Due to our limited resources, MRQL has only
>>>>>>>>>> been tested on a small cluster (7 nodes/28 cores). We compared the
>>>>>>>>>> BSP mode with the MR mode by evaluating a pagerank query over a small
>>>>>>>>>> graph (100K nodes, 1M edges) and found that BSP mode is about 4.5
>>>>>>>>>> times faster than the MR mode. Please let me know if you'd like to
>>>>>>>>>> contribute to this project by testing MRQL on a larger cluster.
>>>>>>>>>> Best regards,
>>>>>>>>>> Leonidas Fegaras
>>>>>>>>>> University of Texas at Arlington
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Best Regards, Edward J. Yoon
>>>>>>>>> @eddieyoon
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Best Regards, Edward J. Yoon
>>>>>>>> @eddieyoon
>>>>>>>>
>>>>>>
>>>>>>
>>>>
>>>> --
>>>> Best Regards, Edward J. Yoon
>>>> @eddieyoon
>>>>
>>>
>>>
>>
>>
>
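
The stopping rule Leonidas describes in the quoted thread (the repeat
must keep going until ALL peers reach the precision, with the peers that
need more iterations messaging the others) could look roughly like the
sketch below. It is a single-process simulation in plain Java, not Hama
or MRQL code: the class RepeatVoteSketch, the Peer type, and the
error-halving convergence model are all assumptions made up for
illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical single-process simulation; not Hama or MRQL code. */
public class RepeatVoteSketch {

    /** One simulated peer. Its local error halves on every step (made-up model). */
    static final class Peer {
        double error;
        int stepsRun = 0;

        Peer(double initialError) { this.error = initialError; }

        /** Do one step of local work; return true if this peer needs another step. */
        boolean stepAndVote(double precision) {
            error /= 2.0;
            stepsRun++;
            return error > precision;
        }
    }

    /**
     * Runs the repeat loop until no peer votes to continue (or maxSteps is hit).
     * Returns the number of supersteps that every peer executed.
     */
    static int runRepeat(List<Peer> peers, double precision, int maxSteps) {
        int supersteps = 0;
        boolean anyContinue = true;
        while (anyContinue && supersteps < maxSteps) {
            // Each peer that is not precise enough "sends" a continue message to all peers.
            int continueMessages = 0;
            for (Peer p : peers) {
                if (p.stepAndVote(precision)) {
                    continueMessages++;
                }
            }
            // Simulated barrier: after the sync every peer sees the same messages,
            // so all peers take the same branch and none runs ahead of the loop.
            anyContinue = continueMessages > 0;
            supersteps++;
        }
        return supersteps;
    }

    public static void main(String[] args) {
        List<Peer> peers = new ArrayList<>();
        peers.add(new Peer(1.0));   // reaches the precision after one step
        peers.add(new Peer(64.0));  // needs several more steps
        int steps = runRepeat(peers, 0.5, 100);
        System.out.println("supersteps = " + steps);
        for (Peer p : peers) {
            System.out.println("peer ran " + p.stepsRun + " steps");
        }
    }
}
```

The point of the design is that the decision to leave the loop is taken
from the messages seen after the barrier, never from a peer's own local
state, so a peer that converges early still runs the same number of
supersteps as the slowest one.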
