[ https://issues.apache.org/jira/browse/CAMEL-9385 ]
Henryk Konsek resolved CAMEL-9385.
----------------------------------
    Resolution: Fixed

Done.

> Create Apache Spark component
> -----------------------------
>
>                 Key: CAMEL-9385
>                 URL: https://issues.apache.org/jira/browse/CAMEL-9385
>             Project: Camel
>          Issue Type: New Feature
>            Reporter: Henryk Konsek
>            Assignee: Henryk Konsek
>             Fix For: 2.17.0
>
>
> As part of the IoT project I'm working on, I have created a Spark component (1) to make it easier to handle analytics requests from devices. I would like to donate this code to Apache Camel and extend it here, as I guess there would be many people interested in using Spark from Camel.
> The URI looks like {{spark:rdd/rddName/rddCallback}} or {{spark:dataframe/frameName/frameCallback}}, depending on whether you would like to work with RDDs or DataFrames.
> The idea here is that the Camel route acts as a driver application. You specify RDD/DataFrame definitions (and callbacks to act against those) in a registry (for example as Spring beans or OSGi services). Then you send the parameters for the computation as the body of a message.
> For example, in Spring Boot you specify the RDD and callback as:
> {code}
> @Bean
> JavaRDD<String> myRdd(JavaSparkContext sparkContext) {
>     // RDD definition registered in the registry under the name "myRdd"
>     return sparkContext.textFile("foo.txt");
> }
>
> @Bean
> MyAnalytics myAnalytics() {
>     // callback bean registered in the registry under the name "myAnalytics"
>     return new MyAnalytics();
> }
>
> class MyAnalytics {
>     @RddCallback
>     long countLines(JavaRDD<String> textFile, long argument) {
>         return textFile.count() * argument;
>     }
> }
> {code}
> Then you ask for the results of the computation:
> {code}
> long results = producerTemplate.requestBody("spark:rdd/myRdd/myAnalytics", 10, long.class);
> {code}
> Such a setup is extremely useful for bridging Spark computations via different transports.
> (1) https://github.com/rhiot/rhiot/tree/master/components/camel-spark

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
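
A minimal sketch of what such a transport bridge could look like, assuming the path-style spark URI proposed in this issue; the {{jms:queue:analyticsRequests}} endpoint and the {{AnalyticsBridgeRoute}} class are illustrative names, not part of the issue:

{code}
import org.apache.camel.builder.RouteBuilder;

// Illustrative bridge route (assumption, not from the issue): each incoming JMS
// message body is passed as the argument to the registered RDD callback, and for
// request-reply exchanges the callback's return value becomes the reply.
public class AnalyticsBridgeRoute extends RouteBuilder {
    @Override
    public void configure() throws Exception {
        from("jms:queue:analyticsRequests")
            .to("spark:rdd/myRdd/myAnalytics");
    }
}
{code}

Swapping the JMS endpoint for any other Camel component would expose the same Spark computation over a different transport without touching the callback itself.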