Hi Maninda,

Of course, we can add them as a set of built-in analyzers, which will broaden the scope of Hive scripts.

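To make the alias mapping concrete, here is a rough Java sketch of how an "analyzer foo(...)" call could be resolved against the analyzer-config.xml entries quoted below. It is only an illustration under stated assumptions, not the actual BAM code: the class AnalyzerAliasSketch, its methods loadConfig and resolve, the sample parameter list "bar,bar1", and the rule that undeclared parameter names are rejected are all hypothetical, and the trailing "*" shown in the thread is left out.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical sketch only; names and validation rules are assumptions.
final class AnalyzerAliasSketch {

    // Sample configuration shaped like the analyzer-config.xml in the thread.
    static final String SAMPLE_CONFIG =
        "<analyzerConfig xmlns=\"http://wso2.org/carbon/analytics\">"
      + "<analyzers><analyzer>"
      + "<name>foo</name>"
      + "<class>org.wso2.carbon.analytics.hive.extension.builtin.FooAnalyzer</class>"
      + "<parameters>bar,bar1</parameters>"
      + "</analyzer></analyzers></analyzerConfig>";

    /** One <analyzer> entry: implementing class and accepted parameter names. */
    static final class AnalyzerDef {
        final String className;
        final List<String> parameters;
        AnalyzerDef(String className, List<String> parameters) {
            this.className = className;
            this.parameters = parameters;
        }
    }

    /** A resolved call: which class to run and with which arguments. */
    static final class Invocation {
        final String className;
        final Map<String, String> args;
        Invocation(String className, Map<String, String> args) {
            this.className = className;
            this.args = args;
        }
    }

    // Matches: analyzer <alias>(<argument list>);
    private static final Pattern CALL =
        Pattern.compile("^\\s*analyzer\\s+(\\w+)\\s*\\((.*)\\)\\s*;?\\s*$");
    // Matches one key="value" pair inside the argument list.
    private static final Pattern ARG =
        Pattern.compile("(\\w+)\\s*=\\s*\"([^\"]*)\"");

    /** Reads every <analyzer> element into an alias -> definition map. */
    static Map<String, AnalyzerDef> loadConfig(String xml) {
        try {
            NodeList nodes = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getElementsByTagName("analyzer");
            Map<String, AnalyzerDef> byAlias = new LinkedHashMap<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element e = (Element) nodes.item(i);
                List<String> params =
                    Arrays.asList(text(e, "parameters").split("\\s*,\\s*"));
                byAlias.put(text(e, "name"), new AnalyzerDef(text(e, "class"), params));
            }
            return byAlias;
        } catch (Exception ex) {
            throw new IllegalStateException("Could not read analyzer configuration", ex);
        }
    }

    private static String text(Element parent, String tag) {
        return parent.getElementsByTagName(tag).item(0).getTextContent().trim();
    }

    /** Looks up the alias and rejects parameter names the entry does not declare. */
    static Invocation resolve(String statement, Map<String, AnalyzerDef> config) {
        Matcher call = CALL.matcher(statement);
        if (!call.matches()) {
            throw new IllegalArgumentException("Not an analyzer call: " + statement);
        }
        AnalyzerDef def = config.get(call.group(1));
        if (def == null) {
            throw new IllegalArgumentException("Unknown analyzer alias: " + call.group(1));
        }
        Map<String, String> args = new LinkedHashMap<>();
        Matcher arg = ARG.matcher(call.group(2));
        while (arg.find()) {
            if (!def.parameters.contains(arg.group(1))) {
                throw new IllegalArgumentException("Unexpected parameter: " + arg.group(1));
            }
            args.put(arg.group(1), arg.group(2));
        }
        return new Invocation(def.className, args);
    }

    public static void main(String[] unused) {
        Map<String, AnalyzerDef> config = loadConfig(SAMPLE_CONFIG);
        Invocation inv = resolve("analyzer foo(bar=\"value\", bar1=\"value1\");", config);
        System.out.println(inv.className + " " + inv.args);
    }
}
```

Because the lookup is keyed by alias, a script only ever names "foo", so the class behind an alias can be changed in the configuration without touching the scripts that use it.
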
Thanks,
Malith

On Wed, Aug 14, 2013 at 11:55 AM, Maninda Edirisooriya <[email protected]> wrote:

> Hi Malith,
>
> Nice work. Using this new feature we can implement the requirements I
> mentioned in the Architecture mail titled "Making BAM more useful platform
> with well defined Class Analyzers / UDFs". WDYT?
>
> *Maninda Edirisooriya*
> Software Engineer
> *WSO2, Inc.*
> lean.enterprise.middleware.
>
> *Blog* : http://maninda.blogspot.com/
> *Phone* : +94 777603226
>
> On Tue, Aug 13, 2013 at 8:24 PM, Malith Dhanushka <[email protected]> wrote:
>
>> Hi all,
>>
>> I have modified the implementation according to the above description, and
>> the modified version is as follows.
>>
>> - analyzer-config.xml contains the mapping details.
>>
>> Example:
>>
>> analyzer-config.xml
>>
>> <analyzerConfig xmlns="http://wso2.org/carbon/analytics">
>>   <analyzers>
>>     <analyzer>
>>       <name>foo</name>
>>       <class>org.wso2.carbon.analytics.hive.extension.builtin.FooAnalyzer</class>
>>       <parameters>bar,bat1,*</parameters>
>>     </analyzer>
>>   </analyzers>
>> </analyzerConfig>
>>
>> Parameter description:
>>
>> *name* - alias name which maps to the class analyzer
>> *class* - class analyzer
>> *parameters* - parameters that are accepted by the class analyzer
>>
>> - This can be utilized in a Hive script as follows.
>>
>> Syntax:
>>
>> analyzer foo(bar="value",bar1="value1",*);
>>
>> Currently there is one built-in analyzer, the resolvePath analyzer,
>> and more will be added by considering other common use cases.
>>
>> Thanks,
>> Malith
>>
>> On Mon, Aug 5, 2013 at 11:36 AM, Anjana Fernando <[email protected]> wrote:
>>
>>> On Mon, Aug 5, 2013 at 11:30 AM, Maninda Edirisooriya <[email protected]> wrote:
>>>
>>>> On Mon, Aug 5, 2013 at 7:58 AM, Malith Dhanushka <[email protected]> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> Implementation of the above suggested approach is in the final stage.
>>>>> But I had a minor clarification of the implementation with Anjana. There
>>>>> we came across the following drawbacks of the implemented approach:
>>>>>
>>>>> - Annotation should not be coupled with a class analyzer; rather it
>>>>>   should be a run-time property injector to Hive scripts.
>>>>>
>>>> Then won't the annotation feature be bound to the Hive language? If
>>>> so, we will not be able to integrate more languages with annotations in
>>>> future.
>>>>
>>> It won't be. The current implementation happens to only support Hive
>>> at the moment, since that's what we have now. Not coupling with the class
>>> analyzer itself is a sign of that, because a class analyzer is anyway a
>>> Hive functionality we have.
>>>
>>> Cheers,
>>> Anjana.
>>>
>>>>> - Abstract Annotation class adds unnecessary complications for a user
>>>>>   writing a custom annotation.
>>>>>
>>>>> So, considering the above, I am going to modify the current
>>>>> implementation to address those factors. If there are any other concerns
>>>>> or suggestions, please feel free to add them.
>>>>>
>>>>> Thanks,
>>>>> Malith
>>>>>
>>>>> On Wed, Jul 10, 2013 at 1:26 PM, Malith Dhanushka <[email protected]> wrote:
>>>>>
>>>>>> Hi Maninda,
>>>>>>
>>>>>> On Wed, Jul 10, 2013 at 11:37 AM, Maninda Edirisooriya <[email protected]> wrote:
>>>>>>
>>>>>>> This is nice. Can we use these annotations to unify the languages,
>>>>>>> Hive and Siddhi? If we can use this annotation framework as a platform
>>>>>>> to support different languages, it will be very useful when it comes to
>>>>>>> integrating other Hadoop-related languages like Mahout and Pig. That
>>>>>>> means we can separate each language-related script using annotations.
>>>>>>> This will solve the problem of unifying all the languages into a single
>>>>>>> language.
>>>>>>>
>>>>>> Interesting suggestion, and yes, this is achievable via annotations. But
>>>>>> the only limitation is that the current script implementation is only
>>>>>> available for Hive. So in order to achieve language unification via
>>>>>> annotations, we first need to have a unified script implementation for
>>>>>> each underlying engine (i.e. Siddhi, Mahout, Pig).
>>>>>>
>>>>>>> And also, using this annotation framework we can create a generic
>>>>>>> Process Flow model on the data. For example, we can execute several Hive
>>>>>>> scripts in parallel using an annotation block. And a barrier can be
>>>>>>> introduced if all the parallel scripts should be finished before we move
>>>>>>> on to the next script, and so on.
>>>>>>>
>>>>>> Yes, this can be added as a built-in annotation.
>>>>>>
>>>>>>> Other than that, we can provide a default set of class analysers as
>>>>>>> we have discussed in a previous mail. The value of annotations is that
>>>>>>> we can provide the available set of class analysers out of the box. Any
>>>>>>> idea about the syntax?
>>>>>>>
>>>>>> Each annotation is associated with a particular class analyzer, which
>>>>>> processes the parameters given via the annotation. So we can wrap that
>>>>>> default set of class analyzers and expose them as a set of built-in
>>>>>> annotations, and can stick to the same syntax as before:
>>>>>>
>>>>>> @script.foo(bar="value", bar1="value1",*)
>>>>>>
>>>>>>> On Tue, Jul 9, 2013 at 11:23 AM, Malith Dhanushka <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> I have started implementing the $Subject. The idea of having an
>>>>>>>> annotation facility is to carry out some pre-processing of Hive queries
>>>>>>>> before they are passed to the Hive engine. Currently we already have
>>>>>>>> a "class analyzer" which can be used to execute some custom logic as
>>>>>>>> part of a Hive script. But the main use case of annotations is to inject
>>>>>>>> run-time properties into the Hive execution context before the actual
>>>>>>>> queries are carried out by Hive. The annotation facility would build
>>>>>>>> upon this by having a set of such common analyzers which can manipulate
>>>>>>>> the Hive queries or the Hive execution context that is passed to the
>>>>>>>> Hive query engine.
>>>>>>>>
>>>>>>>> Annotation syntax:
>>>>>>>>
>>>>>>>> @script.foo(bar="value", bar1="value1",*)
>>>>>>>>
>>>>>>>> The annotation scheme will be externalized by giving an *abstract
>>>>>>>> implementation of annotation* and an *annotation-config.xml* file to
>>>>>>>> provide the annotation configuration, which allows third-party
>>>>>>>> annotations to be included in the system.
>>>>>>>>
>>>>>>>> *annotation-config.xml*
>>>>>>>>
>>>>>>>> <annotation>
>>>>>>>>   <name>foo</name>
>>>>>>>>   <class>org.wso2.carbon.analytics.hive.extension.annotation.foo</class>
>>>>>>>>   <analyzer>org.wso2.carbon.analytics.hive.extension.foo</analyzer>
>>>>>>>> </annotation>
>>>>>>>>
>>>>>>>> <annotation>
>>>>>>>>   ................................
>>>>>>>> </annotation>
>>>>>>>>
>>>>>>>> A potential use case for this is incremental data processing, where
>>>>>>>> any query associated with "@script.incremental(foo="value1",
>>>>>>>> bar="value2",*)" would flag and set up the properties that are
>>>>>>>> required to be present in order for that particular query to be
>>>>>>>> executed in an incremental manner. There can be many other useful
>>>>>>>> additions as well.
>>>>>>>>
>>>>>>>> Any suggestions, thoughts are welcome.
>>>>>>>>
>>>
>>> --
>>> *Anjana Fernando*
>>> Technical Lead
>>> WSO2 Inc. | http://wso2.com
>>> lean . enterprise . middleware
>>>

--
Malith Dhanushka

Engineer - Data Technologies
*WSO2, Inc. : wso2.com*

*Mobile* : +94 716 506 693
_______________________________________________ Architecture mailing list [email protected] https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
