Re: [Architecture] Annotation scheme for Hive scripts

Malith Dhanushka Sun, 04 Aug 2013 19:30:40 -0700

Hi all,

The implementation of the approach suggested above is in its final stage.
However, while clarifying a few points of the implementation with Anjana,
we came across the following drawbacks of the current approach:


- An annotation should not be coupled with a class analyzer; rather, it
should be a run-time property injector for Hive scripts.

- The abstract Annotation class adds unnecessary complication for a user
writing a custom annotation.

Considering the above, I am going to modify the current implementation to
address these two points. If there are any other concerns or suggestions,
please feel free to add them.
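
To illustrate the direction of the two points above, here is a minimal sketch
in Java. It is not the actual implementation: the interface ScriptAnnotation,
the class IncrementalAnnotation, the helper propertiesFor and the
"script.incremental." property prefix are all hypothetical names chosen for
this example. It assumes an annotation only needs to copy its parameters into
a map of properties that is applied to the Hive execution context before the
query runs, so a custom annotation implements a single method instead of
extending an abstract class.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class AnnotationInjectorSketch {

    /** Hypothetical contract: an annotation only contributes run-time properties. */
    interface ScriptAnnotation {
        void inject(Map<String, String> params, Map<String, String> hiveProps);
    }

    /** Hypothetical custom annotation; the "script.incremental." prefix is invented here. */
    static class IncrementalAnnotation implements ScriptAnnotation {
        @Override
        public void inject(Map<String, String> params, Map<String, String> hiveProps) {
            // Copy every annotation parameter into the property map under a prefix.
            for (Map.Entry<String, String> e : params.entrySet()) {
                hiveProps.put("script.incremental." + e.getKey(), e.getValue());
            }
        }
    }

    /** Applies one annotation and returns the properties to set before the query runs. */
    static Map<String, String> propertiesFor(ScriptAnnotation annotation,
                                             Map<String, String> params) {
        Map<String, String> hiveProps = new LinkedHashMap<>();
        annotation.inject(params, hiveProps);
        return hiveProps;
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("foo", "value1");
        params.put("bar", "value2");
        System.out.println(propertiesFor(new IncrementalAnnotation(), params));
    }
}
```

Because the contract has a single method, a third-party annotation could in
principle be as small as a lambda, which is the kind of simplification the
second point is after.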

Thanks,
Malith


On Wed, Jul 10, 2013 at 1:26 PM, Malith Dhanushka <[email protected]> wrote:

> Hi Maninda,
>
> On Wed, Jul 10, 2013 at 11:37 AM, Maninda Edirisooriya 
> <[email protected]> wrote:
>
>> This is nice. Can we use these annotations to unify the languages, Hive
>> and Siddhi? If we can use this annotation framework as a platform to
>> support different languages, it will be very useful when it comes to
>> integrating other Hadoop-related languages like Mahout and Pig. That means
>> we can separate the scripts of each language using annotations. This will
>> solve the problem of unifying all the languages into a single language.
>>
>
> Interesting suggestion, and yes, this is achievable via annotations. The
> only limitation is that the current script implementation is available
> only for Hive. So, to achieve language unification via annotations, we
> first need a unified script implementation for each underlying engine
> (i.e., Siddhi, Mahout, Pig).
>
>>
>> Also, using this annotation framework we can create a generic process
>> flow model on the data. For example, we can execute several Hive scripts
>> in parallel using an annotation block. A barrier can be introduced if all
>> the parallel scripts should finish before we move on to the next script,
>> and so on.
>>
>
> Yes, this can be added as a built-in annotation.
>
>
>>
>> Other than that we can provide a default set of class analysers as we
>> have discussed in a previous mail. The value of annotations is that we can
>> provide the available set of class analysers out of the box. Any idea about
>> the syntax?
>>
>
> Each annotation is associated with a particular class analyzer, which
> processes the parameters given via the annotation. So we can wrap the
> default set of class analyzers, expose them as a set of built-in
> annotations, and stick to the same syntax as before:
>
> @script.foo(bar="value", bar1="value1",*)
>
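
As an illustration of how this syntax could be read, here is a minimal
parsing sketch in Java. It is only a sketch under assumptions: the class
AnnotationParserSketch and its methods are hypothetical, and it assumes the
trailing ",*" in the syntax stands for "any further key="value" pairs" rather
than being literal text. Values containing ')' or '"' are not handled.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AnnotationParserSketch {

    // Matches "@script.<name>( ... )" and captures the name and the raw parameter list.
    private static final Pattern ANNOTATION =
            Pattern.compile("@script\\.(\\w+)\\s*\\(([^)]*)\\)");

    // Matches one key="value" pair inside the parameter list.
    private static final Pattern PARAM =
            Pattern.compile("(\\w+)\\s*=\\s*\"([^\"]*)\"");

    /** Returns the annotation name, e.g. "foo" for @script.foo(...). */
    static String name(String line) {
        return match(line).group(1);
    }

    /** Returns the key/value parameters in the order they were written. */
    static Map<String, String> params(String line) {
        Map<String, String> out = new LinkedHashMap<>();
        Matcher p = PARAM.matcher(match(line).group(2));
        while (p.find()) {
            out.put(p.group(1), p.group(2));
        }
        return out;
    }

    private static Matcher match(String line) {
        Matcher m = ANNOTATION.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("Not a script annotation: " + line);
        }
        return m;
    }

    public static void main(String[] args) {
        String line = "@script.foo(bar=\"value\", bar1=\"value1\")";
        // prints: foo {bar=value, bar1=value1}
        System.out.println(name(line) + " " + params(line));
    }
}
```

The parsed name would then be used to look up the annotation registered under
that name, and the parameter map would be handed to it.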
>
>> *Maninda Edirisooriya*
>> Software Engineer
>> *WSO2, Inc.*
>> lean.enterprise.middleware.
>>
>> *Blog* : http://maninda.blogspot.com/
>> *Phone* : +94 777603226
>>
>>
>> On Tue, Jul 9, 2013 at 11:23 AM, Malith Dhanushka <[email protected]> wrote:
>>
>>>  Hi all,
>>>
>>> I have started implementing the $Subject. The idea of having an
>>> annotation facility is to carry out some pre-processing of Hive queries
>>> before they are passed to the Hive engine. We already have a "class
>>> analyzer" which can be used to execute custom logic as part of a Hive
>>> script. But the main use case of annotations is to inject run-time
>>> properties into the Hive execution context before the actual queries are
>>> carried out by Hive. The annotation facility would build upon this by
>>> having a set of common analyzers which can manipulate the Hive queries or
>>> the Hive execution context that is passed to the Hive query engine.
>>>
>>> Annotation Syntax,
>>>
>>> *@script.foo(bar="value", bar1="value1",*)*
>>>
>>>
>>> The annotation scheme will be externalized by providing an *abstract
>>> implementation of annotation* and an *annotation-config.xml* file that
>>> holds the annotation configuration, which allows third-party annotations
>>> to be plugged into the system.
>>>
>>> *annotation-config.xml*
>>>
>>> <annotation>
>>>   <name>foo</name>
>>>   <class>org.wso2.carbon.analytics.hive.extension.annotation.foo</class>
>>>   <analyzer>org.wso2.carbon.analytics.hive.extension.foo</analyzer>
>>> </annotation>
>>>
>>> <annotation>
>>> ................................
>>> </annotation>
>>>
>>>
>>> A potential use case for this is incremental data processing, where any
>>> query annotated with "*@script.incremental(foo="value1",
>>> bar="value2",*)*" would be flagged, and the properties required for
>>> that particular query to be executed in an incremental manner would be
>>> set up. There can be many other useful additions as well.
>>>
>>> Any suggestions or thoughts are welcome.
>>>
>>> --
>>> Malith Dhanushka
>>>
>>> Engineer - Data Technologies
>>> *WSO2, Inc. : wso2.com*
>>>
>>> *Mobile*          : +94 716 506 693
>>>
>>> _______________________________________________
>>> Architecture mailing list
>>> [email protected]
>>> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>>>
>>>
>>
>>
>>
>
>
> --
> Malith Dhanushka
>
> Engineer - Data Technologies
> *WSO2, Inc. : wso2.com*
>
> *Mobile*          : +94 716 506 693
>



-- 
Malith Dhanushka

Engineer - Data Technologies
*WSO2, Inc. : wso2.com*

*Mobile*          : +94 716 506 693

