Re: [Architecture] Annotation scheme for Hive scripts

Maninda Edirisooriya Sun, 04 Aug 2013 23:02:11 -0700

On Mon, Aug 5, 2013 at 7:58 AM, Malith Dhanushka <[email protected]> wrote:


> Hi all,
>
> Implementation of the approach suggested above is in its final stage. But
> I needed a minor clarification on the implementation from Anjana. In that
> discussion we came across the following drawbacks of the implemented approach:
>
> - The annotation should not be coupled with a class analyzer; rather, it
> should be a run-time property injector for Hive scripts.
>
Then won't the annotation feature be bound to the Hive language? If so,
we will not be able to integrate more languages with annotations in the future.

>
> - The abstract Annotation class adds unnecessary complication for a user
> writing a custom annotation.
>
> So, considering the above, I am going to modify the current
> implementation to address these points. If there are any other concerns
> or suggestions, please feel free to add them.
>
> Thanks,
> Malith
>
>
> On Wed, Jul 10, 2013 at 1:26 PM, Malith Dhanushka <[email protected]> wrote:
>
>> Hi Maninda,
>>
>> On Wed, Jul 10, 2013 at 11:37 AM, Maninda Edirisooriya
>> <[email protected]> wrote:
>>
>>> This is nice. Can we use these annotations to unify the languages, Hive
>>> and Siddhi? If we can use this annotation framework as a platform to
>>> support different languages, it will be very useful when it comes to
>>> integrating other Hadoop-related languages like Mahout and Pig. That means
>>> we can separate each language-related script using annotations. This will
>>> solve the problem of unifying all the languages into a single language.
>>>
>>
>> Interesting suggestion, and yes, this is achievable via annotations. But the
>> only limitation is that the current script implementation is only available
>> for Hive. So in order to achieve language unification via annotations, we
>> first need a unified script implementation for each underlying
>> engine (i.e. Siddhi, Mahout, Pig).
>>
>>>
>>> Also, using this annotation framework we can create a generic process
>>> flow model on the data. For example, we can execute several Hive scripts in
>>> parallel using an annotation block. A barrier can be introduced if all
>>> the parallel scripts should finish before we move on to the next script,
>>> and so on.
>>>
>>
>> Yes, this can be added as a built-in annotation.
>>
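The parallel-block-plus-barrier idea above can be sketched as follows. This is only an illustration under stated assumptions: `run_script` is a hypothetical stand-in for submitting one Hive script, and a "stage" (a list of script names) stands for one parallel annotation block. Nothing here is taken from the actual implementation.

```python
from concurrent.futures import ThreadPoolExecutor, wait


def run_script(name):
    """Hypothetical stand-in for submitting one Hive script; returns its name."""
    return name


def run_flow(stages):
    """Run each stage's scripts in parallel; finish a stage before the next starts."""
    completed = []
    with ThreadPoolExecutor() as pool:
        for stage in stages:
            futures = [pool.submit(run_script, script) for script in stage]
            wait(futures)  # barrier: block until every script in this stage is done
            # result() re-raises any exception thrown inside run_script
            completed.append(sorted(f.result() for f in futures))
    return completed
```

Calling `run_flow([["a.hql", "b.hql"], ["c.hql"]])` would run the first two scripts concurrently and start `c.hql` only after both have finished.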
>>
>>>
>>> Other than that we can provide a default set of class analysers as we
>>> have discussed in a previous mail. The value of annotations is that we can
>>> provide the available set of class analysers out of the box. Any idea about
>>> the syntax?
>>>
>>
>> Each annotation is associated with a particular class analyzer, which
>> processes the parameters given via the annotation. So we can wrap the
>> default set of class analyzers, expose them as a set of built-in
>> annotations, and stick to the same syntax as before:
>>
>> @script.foo(bar="value", bar1="value1",*)
>>
>>
>>> *
>>> Maninda Edirisooriya*
>>> Software Engineer
>>> *WSO2, Inc.
>>> *lean.enterprise.middleware.
>>>
>>> *Blog* : http://maninda.blogspot.com/
>>> *Phone* : +94 777603226
>>>
>>>
>>> On Tue, Jul 9, 2013 at 11:23 AM, Malith Dhanushka <[email protected]> wrote:
>>>
>>>>  Hi all,
>>>>
>>>> I have started implementing the $Subject. The idea of having an
>>>> annotation facility is to carry out some pre-processing of Hive queries
>>>> before they are passed to the Hive engine. We already have
>>>> a "class analyzer" which can be used to execute custom logic as part of
>>>> a Hive script. But the main use case of annotations is to inject run-time
>>>> properties into the Hive execution context before the actual queries are
>>>> carried out by Hive. The annotation facility would build upon this by having
>>>> a set of common analyzers which can manipulate the Hive queries or the Hive
>>>> execution context which is passed to the Hive query engine.
>>>>
>>>> Annotation Syntax,
>>>>
>>>> *@script.foo(bar="value", bar1="value1",*)*
>>>>
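As a rough illustration of the syntax above, an annotation line could be split into a name and a parameter map like this. It is a minimal sketch that assumes the grammar is `@script.<name>(key="value", ...)` with double-quoted values, and it treats the trailing `*` in the example as shorthand for "further parameters". The function and pattern names are made up for this example.

```python
import re

# Assumed grammar: @script.<name>(key="value", key2="value2")
ANNOTATION_RE = re.compile(r'@script\.(\w+)\s*\((.*)\)')
PARAM_RE = re.compile(r'(\w+)\s*=\s*"([^"]*)"')


def parse_annotation(line):
    """Return (name, params) for an annotation line, or None if there is none."""
    match = ANNOTATION_RE.search(line)
    if match is None:
        return None
    name, raw_params = match.groups()
    # Anything that is not a key="value" pair (such as a trailing *) is ignored.
    params = dict(PARAM_RE.findall(raw_params))
    return name, params
```

Lines that carry no annotation return `None`, so a pre-processor could pass them through to Hive unchanged.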
>>>>
>>>> The annotation scheme will be externalized by providing an *abstract
>>>> implementation of annotation* and an *annotation-config.xml* file that
>>>> holds the annotation configuration, which allows third-party annotations
>>>> to be included in the system.
>>>>
>>>> *annotation-config.xml*
>>>>
>>>> <annotation>
>>>> <name>foo</name>
>>>> <class>org.wso2.carbon.analytics.hive.extension.annotation.foo</class>
>>>> <analyzer>org.wso2.carbon.analytics.hive.extension.foo</analyzer>
>>>> </annotation>
>>>>
>>>> <annotation>
>>>> ................................
>>>> </annotation>
>>>>
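To show how such a configuration might be read into a lookup table, here is a small sketch. It assumes the `<annotation>` entries sit under a single root element, called `<annotations>` here purely for illustration, since the thread does not show the root. `load_registry` is a hypothetical helper, not part of the proposed implementation.

```python
import xml.etree.ElementTree as ET

# Assumption: a single <annotations> root wraps the entries shown in the thread.
SAMPLE_CONFIG = """
<annotations>
  <annotation>
    <name>foo</name>
    <class>org.wso2.carbon.analytics.hive.extension.annotation.foo</class>
    <analyzer>org.wso2.carbon.analytics.hive.extension.foo</analyzer>
  </annotation>
</annotations>
"""


def load_registry(xml_text):
    """Map each annotation name to its configured class and analyzer names."""
    root = ET.fromstring(xml_text)
    registry = {}
    for entry in root.findall("annotation"):
        name = entry.findtext("name").strip()
        registry[name] = {
            "class": entry.findtext("class").strip(),
            "analyzer": entry.findtext("analyzer").strip(),
        }
    return registry


registry = load_registry(SAMPLE_CONFIG)
```

A third-party annotation would then be one more `<annotation>` entry in the file, with no change to the loader.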
>>>>
>>>> A potential use case for this is incremental data processing, where any
>>>> query associated with "*@script.incremental(foo="value1",
>>>> bar="value2",*)*" would flag and set up the properties that are
>>>> required to be present in order for that particular query to be executed in
>>>> an incremental manner. There can be many other useful additions as well.
>>>>
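The property-injection idea for the incremental use case could look roughly like the sketch below. Everything in it is hypothetical: the execution context is modelled as a plain dictionary, and the `annotation.incremental.` key prefix is invented for illustration and is not a real Hive setting.

```python
def incremental_handler(params):
    """Hypothetical handler: turn annotation parameters into context properties."""
    return {"annotation.incremental." + key: value for key, value in params.items()}


# Hypothetical table of handlers, keyed by annotation name.
HANDLERS = {"incremental": incremental_handler}


def inject_properties(context, name, params):
    """Apply the handler registered for `name` to a copy of the execution context."""
    handler = HANDLERS.get(name)
    if handler is None:
        raise KeyError("no handler registered for annotation: " + name)
    updated = dict(context)  # leave the caller's context untouched
    updated.update(handler(params))
    return updated
```

The handler only produces properties and never touches the query text, which keeps it independent of any particular script language.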
>>>> Any suggestions, thoughts are welcome.
>>>>
>>>> --
>>>> Malith Dhanushka
>>>>
>>>> Engineer - Data Technologies
>>>> *WSO2, Inc. : wso2.com*
>>>>
>>>> *Mobile*          : +94 716 506 693
>>>>
>>>> _______________________________________________
>>>> Architecture mailing list
>>>> [email protected]
>>>> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>>>>
>>>>
>>>
>>
>>
>> --
>> Malith Dhanushka
>>
>> Engineer - Data Technologies
>> *WSO2, Inc. : wso2.com*
>>
>> *Mobile*          : +94 716 506 693
>>
>
>
>
> --
> Malith Dhanushka
>
> Engineer - Data Technologies
> *WSO2, Inc. : wso2.com*
>
> *Mobile*          : +94 716 506 693
>

