+1 I think it is useful.

We should define UDFs (user-defined functions) that conform to the Hive model
as closely as possible.
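
To make "conform to the Hive model" concrete, here is a minimal sketch of a UDF-shaped class for the namespace-aware XML case raised in the quoted mail. It is only an illustration under stated assumptions: the class name NamespaceXPathString and its four-argument signature are hypothetical, and the class does not extend org.apache.hadoop.hive.ql.exec.UDF so that it compiles with the JDK alone. A real UDF would be public, extend that base class, and be packaged in a jar.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical sketch: a real Hive UDF would be public and extend
// org.apache.hadoop.hive.ql.exec.UDF; that dependency is left out here.
class NamespaceXPathString {

    // Simple Hive UDFs expose their logic through a method named evaluate.
    // Assumption: one prefix/URI pair is enough for the expression.
    public String evaluate(String xml, String xpathExpr, String prefix, String namespaceUri) {
        if (xml == null || xpathExpr == null || prefix == null || namespaceUri == null) {
            return null; // NULL in, NULL out
        }
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // required for prefixed XPath steps to match
            dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            Document doc = dbf.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new SinglePrefixContext(prefix, namespaceUri));
            return xpath.evaluate(xpathExpr, doc);
        } catch (ParserConfigurationException | SAXException
                | IOException | XPathExpressionException e) {
            return null; // malformed XML or expression yields NULL instead of failing the query
        }
    }

    // Maps exactly one prefix to one namespace URI.
    private static final class SinglePrefixContext implements NamespaceContext {
        private final String prefix;
        private final String uri;

        SinglePrefixContext(String prefix, String uri) {
            this.prefix = prefix;
            this.uri = uri;
        }

        @Override
        public String getNamespaceURI(String p) {
            if (p == null) {
                throw new IllegalArgumentException("prefix must not be null");
            }
            return prefix.equals(p) ? uri : XMLConstants.NULL_NS_URI;
        }

        @Override
        public String getPrefix(String ns) {
            return uri.equals(ns) ? prefix : null;
        }

        @Override
        public Iterator<String> getPrefixes(String ns) {
            return uri.equals(ns)
                    ? Collections.singletonList(prefix).iterator()
                    : Collections.<String>emptyIterator();
        }
    }
}
```

Once packaged, a class like this would be made available to scripts with Hive's ADD JAR and CREATE TEMPORARY FUNCTION statements, and then called like any built-in function.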

--Srinath


On Wed, Jun 12, 2013 at 1:13 PM, Maninda Edirisooriya <[email protected]> wrote:

> The main configuration point of analytics in BAM is its Hive scripts. At
> the moment we only support the Hive language, but Hive has many drawbacks.
> For example, there is no generic way to execute a SQL statement on an RDBMS
> table from a Hive command. We had that requirement when we needed to
> execute a "DROP TABLE" query from Hive. Though there was a workaround, it
> was not a proper solution.
>
> On the other hand, AFAIK there is no proper mechanism for listening for
> the completion of a Hive script. Such a mechanism is important to notify
> and/or pass data to another part of the WSO2 stack or to call another part
> of the system (e.g., invoke a proxy service in ESB after a certain line of
> Hive code is executed). This kind of push model is required when BAM is
> used for alerting scenarios. Yes, CEP is used for alerting, but BAM can be
> used for non-urgent alerting. For example, send an e-mail to the marketing
> team after sales have dropped to a certain level during one month. This
> can be done with BAM. A data push model is required to reduce the latency
> as much as possible.
>
> I have come across another requirement for a Class Analyzer / UDF: XML
> processing. Though simple XML can be processed with the existing UDFs
> provided with Hive, XML strings with namespaces are not supported. We
> should add our own UDF for Hive-level SOAP message processing.
>
> Also, we already pack a set of Class Analyzers for IP to geo-location
> mapping and for deleting Cassandra rows for archiving purposes. Though they
> are packaged with BAM and are very useful for users, we do not explicitly
> mention them in the docs.
>
> I think it is time to upgrade the Hive script API we provide with the new
> Class Analyzers / UDFs we have introduced. We should also implement
> Class Analyzers / UDFs for commonly used scenarios such as the
> requirements I have mentioned earlier. This will improve the user
> experience by improving the usability of Hive for a broad range of
> solutions. Finally, it will make the BAM analyzer similar to the
> Mediation Sequence in ESB, which will improve the integration capability.
>
> Here are some of the Class Analyzers / UDFs that I think would be useful
> to pack with BAM.
>
> 1. Executing a generic SQL statement (as described in the above example)
> 2. Web service / REST API calling (for invoking AS or ESB)
> 3. Java class execution - Similar to the Jaggery Package host object. This
> should make it possible to call any Java class in the OSGi environment.
> 4. Thrift call execution - will help to pump data back to CEP and then
> take actions inside CEP
> 5. Registry value getting and setting - This will help a lot to share
> global values between Hive scripts during execution.
> 6. Hive script invoking - This will make it possible to asynchronously
> invoke another Hive script during execution of the original script.
> 7. Conditioning and iterating - will make it possible to stop/continue the
> script sequence based on a condition and to execute a script as a
> for loop or while loop.
>
> There may be other useful features as well for defining the API. WDYT?
> *Maninda Edirisooriya*
> Software Engineer
> *WSO2, Inc.*
> lean.enterprise.middleware.
>
> *Blog* : http://maninda.blogspot.com/
> *Phone* : +94 777603226
>
> _______________________________________________
> Architecture mailing list
> [email protected]
> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>
>
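
The completion-listening requirement in the quoted mail can be sketched as a plain callback. This is not an existing BAM or Hive API: ScriptCompletionListener and NotifyingScriptRunner are hypothetical names, and the Runnable stands in for whatever component actually submits the Hive script.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical callback, invoked once a script has finished.
interface ScriptCompletionListener {
    void onFinished(String scriptName, boolean succeeded);
}

// Hypothetical runner that pushes a notification to listeners when the
// wrapped script body returns or fails.
class NotifyingScriptRunner {
    private final List<ScriptCompletionListener> listeners = new CopyOnWriteArrayList<>();

    void addListener(ScriptCompletionListener listener) {
        listeners.add(listener);
    }

    // scriptBody is a placeholder for the real Hive script submission.
    void run(String scriptName, Runnable scriptBody) {
        boolean succeeded = true;
        try {
            scriptBody.run();
        } catch (RuntimeException e) {
            succeeded = false; // report the failure to listeners instead of rethrowing
        }
        for (ScriptCompletionListener listener : listeners) {
            listener.onFinished(scriptName, succeeded);
        }
    }
}
```

A listener registered this way could, for example, send the e-mail to the marketing team or invoke an ESB proxy service as soon as the monthly sales script finishes, without anything having to poll for the result.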


-- 
============================
Srinath Perera, Ph.D.
   http://people.apache.org/~hemapani/
   http://srinathsview.blogspot.com/
