Re: [Architecture] Annotation scheme for Hive scripts

Anjana Fernando Sun, 04 Aug 2013 23:09:11 -0700

On Mon, Aug 5, 2013 at 11:30 AM, Maninda Edirisooriya <[email protected]> wrote:


> On Mon, Aug 5, 2013 at 7:58 AM, Malith Dhanushka <[email protected]> wrote:
>
>> Hi all,
>>
>> Implementation of the approach suggested above is in its final stage.
>> While clarifying a few implementation details with Anjana, we came
>> across the following drawbacks of the implemented approach:
>>
>> - An annotation should not be coupled with a class analyzer; rather, it
>> should be a run-time property injector for Hive scripts.
>>
> Then won't the annotation feature be bound to the Hive language? If so,
> we will not be able to integrate more languages with annotations in future.
>

It won't be. The current implementation happens to support only Hive at
the moment, since that's what we have now. Not coupling it with the class
analyzer is itself a sign of that, because the class analyzer is a
Hive-specific functionality.

Cheers,
Anjana.


>
>> - The abstract Annotation class adds unnecessary complication for a user
>> writing a custom annotation.
>>
>> So, considering the above, I am going to modify the current implementation
>> to address these points. If there are any other concerns or suggestions,
>> please feel free to add them.
>>
>> Thanks,
>> Malith
>>
>>
>> On Wed, Jul 10, 2013 at 1:26 PM, Malith Dhanushka <[email protected]> wrote:
>>
>>> Hi Maninda,
>>>
>>> On Wed, Jul 10, 2013 at 11:37 AM, Maninda Edirisooriya <[email protected]> wrote:
>>>
>>>> This is nice. Can we use these annotations to unify the languages, Hive
>>>> and Siddhi? If we can use this annotation framework as a platform to
>>>> support different languages, it will be very useful when it comes to
>>>> integrating other Hadoop-related languages like Mahout and Pig. That means
>>>> we can separate the scripts related to each language using annotations.
>>>> This will solve the problem of unifying all the languages into a single
>>>> language.
>>>>
>>>
>>> Interesting suggestion, and yes, this is achievable via annotations. The
>>> only limitation is that the current script implementation is available
>>> only for Hive. So, to achieve language unification via annotations, we
>>> first need a unified script implementation for each underlying engine
>>> (i.e., Siddhi, Mahout, Pig).
>>>
>>>>
>>>> Also, using this annotation framework we can create a generic
>>>> process flow model on the data. For example, we can execute several Hive
>>>> scripts in parallel using an annotation block, and a barrier can be
>>>> introduced if all the parallel scripts must finish before we move
>>>> on to the next script, and so on.
>>>>
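A minimal sketch of the parallel-block-with-barrier idea described above, written in Python for brevity. The names run_parallel_block and fake_run are hypothetical and not part of any existing implementation; fake_run stands in for whatever would actually submit a script to Hive.

```python
from concurrent.futures import ThreadPoolExecutor


def run_parallel_block(scripts, run_script):
    """Run all scripts of one block concurrently; return once every one has finished."""
    if not scripts:
        return []
    with ThreadPoolExecutor(max_workers=len(scripts)) as pool:
        # list() consumes every result, so this line acts as the barrier:
        # nothing after it runs until all scripts in the block are done.
        return list(pool.map(run_script, scripts))


def fake_run(script_name):
    # Stand-in for real script execution (assumption: returns a status string).
    return script_name + ": done"


# The next script would only be started after this call returns.
results = run_parallel_block(["a.hql", "b.hql"], fake_run)
print(results)
```

Results come back in the order the scripts were listed, regardless of which one finished first, so a follow-up step can rely on their positions.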
>>>
>>> Yes, this can be added as a built-in annotation.
>>>
>>>
>>>>
>>>> Other than that we can provide a default set of class analysers as we
>>>> have discussed in a previous mail. The value of annotations is that we can
>>>> provide the available set of class analysers out of the box. Any idea about
>>>> the syntax?
>>>>
>>>
>>> Each annotation is associated with a particular class analyzer, which
>>> processes the parameters given via the annotation. So we can wrap the
>>> default set of class analyzers and expose them as a set of built-in
>>> annotations, sticking to the same syntax as before:
>>>
>>> @script.foo(bar="value", bar1="value1",*)
>>>
>>>
>>>> *Maninda Edirisooriya*
>>>> Software Engineer
>>>> *WSO2, Inc.*
>>>> lean.enterprise.middleware.
>>>>
>>>> *Blog* : http://maninda.blogspot.com/
>>>> *Phone* : +94 777603226
>>>>
>>>>
>>>> On Tue, Jul 9, 2013 at 11:23 AM, Malith Dhanushka <[email protected]> wrote:
>>>>
>>>>>  Hi all,
>>>>>
>>>>> I have started implementing the annotation scheme for Hive scripts.
>>>>> The idea of an annotation facility is to carry out some pre-processing
>>>>> of Hive queries before they are passed to the Hive engine. We already
>>>>> have a "class analyzer", which can be used to execute custom logic as
>>>>> part of a Hive script. But the main use case of annotations is to inject
>>>>> run-time properties into the Hive execution context before the actual
>>>>> queries are carried out by Hive. The annotation facility would build
>>>>> upon this by providing a set of such common analyzers, which can
>>>>> manipulate the Hive queries or the Hive execution context that is
>>>>> passed to the Hive query engine.
>>>>>
>>>>> Annotation Syntax,
>>>>>
>>>>> *@script.foo(bar="value", bar1="value1",*)*
>>>>>
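To make the syntax concrete, here is a rough Python sketch of how such an annotation line might be parsed into a name and a parameter map. The regular expressions and the function name parse_annotation are assumptions for illustration only, and the trailing "*" in the syntax is read here as "any further parameters", which the thread does not confirm.

```python
import re

# Hypothetical patterns for '@script.<name>(key="value", ...)'.
ANNOTATION_RE = re.compile(r'@script\.(\w+)\((.*?)\)')
PARAM_RE = re.compile(r'(\w+)\s*=\s*"([^"]*)"')


def parse_annotation(line):
    """Return (name, params) for an annotation line, or None if there is none."""
    match = ANNOTATION_RE.search(line)
    if match is None:
        return None
    name, raw_params = match.groups()
    # Collect every key="value" pair; anything else between the
    # parentheses (such as a trailing "*") is ignored.
    params = dict(PARAM_RE.findall(raw_params))
    return name, params


parsed = parse_annotation('@script.foo(bar="value", bar1="value1")')
print(parsed)
```

This simple pattern stops at the first closing parenthesis, so a value containing ")" would need a proper tokenizer rather than a regular expression.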
>>>>>
>>>>> The annotation scheme will be externalized by providing an *abstract
>>>>> implementation of annotation* and an *annotation-config.xml* file that
>>>>> holds the annotation configuration, which allows third-party annotations
>>>>> to be included in the system.
>>>>>
>>>>> *annotation-config.xml*
>>>>>
>>>>> <annotation>
>>>>> <name>foo</name>
>>>>> <class>org.wso2.carbon.analytics.hive.extension.annotation.foo</class>
>>>>> <analyzer>org.wso2.carbon.analytics.hive.extension.foo</analyzer>
>>>>> </annotation>
>>>>>
>>>>> <annotation>
>>>>> ................................
>>>>> </annotation>
>>>>>
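As an illustration of how such a configuration could be consumed, the following Python sketch loads the entries into a lookup table keyed by annotation name. The enclosing <annotations> root element and the function name load_annotation_registry are assumptions, since the fragment above does not show a root element or any loader.

```python
import xml.etree.ElementTree as ET

# Sample configuration; the <annotations> wrapper is an assumed root element.
SAMPLE_CONFIG = """
<annotations>
  <annotation>
    <name>foo</name>
    <class>org.wso2.carbon.analytics.hive.extension.annotation.foo</class>
    <analyzer>org.wso2.carbon.analytics.hive.extension.foo</analyzer>
  </annotation>
</annotations>
"""


def load_annotation_registry(xml_text):
    """Map each annotation name to its configured class and analyzer names."""
    root = ET.fromstring(xml_text.strip())
    registry = {}
    for node in root.findall("annotation"):
        registry[node.findtext("name")] = {
            "class": node.findtext("class"),
            "analyzer": node.findtext("analyzer"),
        }
    return registry


registry = load_annotation_registry(SAMPLE_CONFIG)
print(registry["foo"]["analyzer"])
```

A third-party annotation would then only need one more <annotation> entry in the file to become visible through the same lookup.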
>>>>>
>>>>> A potential use case for this is incremental data processing, where any
>>>>> query associated with "*@script.incremental(foo="value1",
>>>>> bar="value2",*)*" would be flagged, and the properties required for that
>>>>> particular query to be executed in an incremental manner would be set
>>>>> up. There can be many other useful additions as well.
>>>>>
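A small Python sketch of the property-injection idea for the incremental use case. Everything here is hypothetical: the handler registry, the function names, and the "example.incremental." property prefix are placeholders and do not correspond to real Hive or WSO2 configuration keys.

```python
def incremental_properties(params):
    """Hypothetical handler: turn annotation parameters into context properties."""
    properties = {"example.incremental.enabled": "true"}
    for key, value in params.items():
        properties["example.incremental." + key] = value
    return properties


# Registry of annotation name -> handler; only one illustrative entry here.
HANDLERS = {"incremental": incremental_properties}


def inject_properties(annotation_name, params, context):
    """Merge the properties produced by the annotation's handler into the context."""
    handler = HANDLERS.get(annotation_name)
    if handler is None:
        raise KeyError("no handler registered for annotation: " + annotation_name)
    context.update(handler(params))
    return context


# The context would be handed to the query engine after injection.
context = inject_properties("incremental", {"foo": "value1", "bar": "value2"}, {})
print(context)
```

Keeping the handler a plain function from parameters to properties means it needs no knowledge of the query language behind the execution context.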
>>>>> Any suggestions or thoughts are welcome.
>>>>>
>>>>> --
>>>>> Malith Dhanushka
>>>>>
>>>>> Engineer - Data Technologies
>>>>> *WSO2, Inc. : wso2.com*
>>>>>
>>>>> *Mobile*          : +94 716 506 693
>>>>>
>>>>> _______________________________________________
>>>>> Architecture mailing list
>>>>> [email protected]
>>>>> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>>>>>
>>>>>
>>>>
>>>
>>>
>>>
>>
>>
>>
>>
>


-- 
*Anjana Fernando*
Technical Lead
WSO2 Inc. | http://wso2.com
lean . enterprise . middleware
