[
https://issues.apache.org/jira/browse/EAGLE-66?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15044517#comment-15044517
]
ASF GitHub Bot commented on EAGLE-66:
-------------------------------------
GitHub user haoch opened a pull request:
https://github.com/apache/incubator-eagle/pull/17
[EAGLE-66] Typesafe Streaming DSL and KeyValue based Grouping
https://issues.apache.org/jira/browse/EAGLE-66
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/haoch/incubator-eagle EAGLE-66-TYPESAFE
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-eagle/pull/17.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #17
----
commit 2d7003dd1200a500b329a518d63484600e1c2cdf
Author: Hao Chen <[email protected]>
Date: 2015-12-02T16:28:58Z
[EAGLE-66] Keep type information in StreamProcessor
commit e3389cdd52f7f747db0968d9e233e4184b99f5ff
Author: Hao Chen <[email protected]>
Date: 2015-12-04T09:25:39Z
[EAGLE-66] Decouple StreamProtocol and StreamProducer
commit e419ff886316bd536aa3c23aa337e51fa27a8c01
Author: Hao Chen <[email protected]>
Date: 2015-12-04T13:33:21Z
[EAGLE-66] Decouple execution environment and config
commit 8b2d7a64f3f600cf49204cd5abff762e70e8791e
Author: Hao Chen <[email protected]>
Date: 2015-12-04T17:10:32Z
[EAGLE-66] Simplify the stream processing API and configuration
commit 66abc2283092e04cb866b1280d1a286a14b3e117
Author: Hao Chen <[email protected]>
Date: 2015-12-04T17:25:39Z
[EAGLE-66] Refactor AbstractStormSpoutProvider from abstract class to
interface StormSpoutProvider
commit 5b50c3fdb457f7f70f1654c8b19dcb1dae08b1cf
Author: Hao Chen <[email protected]>
Date: 2015-12-04T18:18:14Z
[EAGLE-66] Decouple ExecutionEnvironments factory functions
commit 5ca63dfa0fd755f008970cfa46dafdf5ac89dc29
Author: Hao Chen <[email protected]>
Date: 2015-12-04T18:20:54Z
[EAGLE-66] Rename ConfigWrapper to Configurator
commit f61fde9513de0e65dab7c22cd7f1552022764f58
Author: Hao Chen <[email protected]>
Date: 2015-12-04T19:15:45Z
[EAGLE-66] Refactor StreamProtocol `withName` to `as` interface
commit fbc234473d67d75aeece3c8f2d8963b93f11b3a0
Author: Hao Chen <[email protected]>
Date: 2015-12-05T22:19:17Z
[EAGLE-66] Basically workable key-value stream processing API
commit c20bfa141d5db4fd64790db87e7980c05984e671
Author: Hao Chen <[email protected]>
Date: 2015-12-06T07:24:26Z
[EAGLE-66] Simplify DAG structure log
commit a0ca7fa15088aded4f06eec8ab44bc2cacbb699c
Author: Hao Chen <[email protected]>
Date: 2015-12-06T08:27:53Z
[EAGLE-66] Implement from(Iterable) to replace fromSeq(Seq)
commit 69e68746520ca4648b3987f46d30acc6d6ced79a
Author: Hao Chen <[email protected]>
Date: 2015-12-06T10:04:05Z
[EAGLE-66] Clean DAG graph log
commit 7a6a13636ace590c148fc5793893d88b580e1f45
Author: Hao Chen <[email protected]>
Date: 2015-12-06T17:54:33Z
[EAGLE-66] Keep entity type info through dataflow and resolve scala/java
compatibility issues
commit 9d7b8ceaa6b0b75b132e319da9b6efa5d0f164e1
Author: Hao Chen <[email protected]>
Date: 2015-12-07T07:18:24Z
[EAGLE-66] Refactored the package of datastream API
----
> Eagle TypeSafe Stream Processing DSL
> ------------------------------------
>
> Key: EAGLE-66
> URL: https://issues.apache.org/jira/browse/EAGLE-66
> Project: Eagle
> Issue Type: Improvement
> Affects Versions: 0.3.0
> Reporter: Hao Chen
> Assignee: Hao Chen
> Fix For: 0.3.0
>
>
> Firstly, as to application developer space, developer should not need to care
> about exchanging internal data structure between processing elements, so that
> it should allow user to process the stream just like processing a collection
> of typed object, for example in following case, we should allow to:
> In type-safe way: groupBy(_.user)
> In field-base way: groupBy("user”)
> @StreamDef("hdfsAuditLogStream")
> case class AuditLog(timestamp:Long,user:String,path:String,cmd:String)
> object TypedSafeApp extends App{
> val ec = ExecutionContext.get[StormExecutionContext]
> ec.fromKafka(ec.getConfig){ bytes =>
> AuditLog(1000,"someone","/tmp/private","open")}.parallism(123)//
> Stream[AuditLog]
> .groupBy(_.user).parallism(12)
> // Stream[AuditLog]
> .map { obj => (obj.user,obj)}// Stream[(String, AuditLog)]
> .groupBy(_._1)// Stream[(String, AuditLog)]
> .map(_._2)// Stream[AuditLog]
> .alert.persistAndEmail// Stream[AlertAPIEntity]
> }
> As to framework space, especially for groupBy semantics, it should look like
> nothing different as to developer, but in internal space, the exchange
> structure is (AnyRef, AuditLog)
> As to groupBy(_.user) and object: obj in type of AuditLog, it’s always
> (obj.user,obj) + fieldsGrouping by 0 for storm
> As to groupBy(“user") and object: obj in type of AuditLog, it’s
> (readValueByReflect(“user”,obj),obj) + fieldsGrouping by 0 for storm
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)