Responses inline.
On October 18, 2016 at 9:19:50 AM, Benjamin Young (byo...@bigbluehat.com) wrote:

Hi all,  

Sorry I’ve not written here sooner. I’d reached out to the Incubator list while 
at the W3C’s TPAC event about keeping Apache Streams in the incubator in hopes 
of also seeing it support the nearly finalized ActivityStreams 2.0 
specification:  
https://www.w3.org/TR/activitystreams-core/  

Thanks!

Since then, I’ve noticed Steve’s efforts to make Streams much simpler for new 
users, which is fabulous! I (sadly) don’t code in Java…since college, but I do 
have a desire to run code that aggregates my social streams into a standard 
format, store it in a database I prefer (in my case Apache CouchDB), and do 
cool stuff with it for my own reasons. ;) That desire is what drew me into the 
Streams talk at ApacheCon.  

A lot of businesses, techies, and non-techies are interested in producing and 
consuming content outside of standard single-channel generic web and mobile 
apps - but there seems to be a dearth of quality low-cost commercial offerings 
to do so. 

While digging around the project documents, I’ve found two overview 
descriptions of the project.  

This one’s from the web site:  
http://streams.incubator.apache.org/site/0.4-incubating-SNAPSHOT/streams-master/
  
“Apache Streams (incubating) unifies a diverse world of digital profiles and 
online activities into common formats and vocabularies, and makes these 
datasets accessible across a variety of databases, devices, and platforms for 
streaming, browsing, search, sharing, and analytics use-cases.”  

This is our primary focus right now - expanding interoperability to more 
sources, and enabling interesting use cases that grow the community.

And this one from the repo’s readme file:  
https://svn.apache.org/repos/asf/incubator/streams/trunk/README.txt  
“Apache Streams is a lightweight (yet scalable) server for ActivityStreams. The 
role of Apache Streams is to provide a central point of aggregation, filtering 
and querying for Activities that have been submitted by disparate systems. 
Apache Streams also intends to include a mechanism for intelligent filtering 
and recommendation to reduce the noise to end users.”  

This copy is older (the project moved from SVN to Git in 2013).  It’s still an 
interesting goal, but data interoperability is a more pressing problem in need 
of a robust open-source solution, IMO.  There are plenty of mature databases, 
data science tools, and data vis libraries around -  I think if it were dead 
simple for anyone to collect and normalize social streams we’d see 
experimentation and adjacent tooling flourish.

In either case, the story that I get—and the thing I want—is minimal setup to 
get my Twitter, etc, piped into a database +/- an API +/- a UI.  

I think we are closing in on this, minus official API and UI.  The group of 
active contributors will need to grow and diversify to tackle those but there’s 
nothing impeding their development (integration and deployment will require 
making some choices). 

Am I on the right track here? Or is Streams really meant for Java-developers to 
mix into their projects?  

We’re looking into distribution with docker which will be a good way for 
power-users with zero interest in Java or Apache technologies to run streams.  
The core project libraries, connectors, and converters may be Java, but there’s 
plenty of room to innovate and improve the project outside that world.  We have 
a ton of work ahead answering questions about what normalized data types to 
support, which systems to prioritize, how we want the normalized data to look, 
and how to map in data from upstream systems.  Design and product work, not 
code work.
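
To make the mapping question concrete, here is a minimal sketch of what 
normalizing one upstream post into an ActivityStreams 2.0 "Create" activity 
might look like. It is not how Streams does it today: the input field names 
(url, author, text, posted_at) and the post_to_activity helper are 
hypothetical, and Python is used only to keep the illustration short. The 
output keys (@context, type, actor, object, published, content) come from the 
ActivityStreams 2.0 vocabulary.

```python
import json

# Context URI defined by the ActivityStreams 2.0 specification.
AS2_CONTEXT = "https://www.w3.org/ns/activitystreams"


def post_to_activity(post):
    """Map a hypothetical upstream post record to an AS2 Create activity.

    The input shape (url, author, text, posted_at) is an assumption made
    for this sketch, not the format of any particular provider.
    """
    return {
        "@context": AS2_CONTEXT,
        "type": "Create",
        "published": post["posted_at"],  # assumed to already be ISO 8601
        "actor": {
            "type": "Person",
            "name": post["author"],
        },
        "object": {
            "type": "Note",
            "id": post["url"],
            "content": post["text"],
        },
    }


if __name__ == "__main__":
    sample = {
        "url": "https://example.com/posts/1",
        "author": "alice",
        "text": "Hello, streams!",
        "posted_at": "2016-10-18T09:19:50Z",
    }
    print(json.dumps(post_to_activity(sample), indent=2))
```

Most of the open questions listed above show up even in a toy like this: 
which activity and object types to emit, which upstream fields to keep, and 
what to do when a source has no equivalent for a normalized field.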

Once I know that, I’ll know best how to help. :) 

If I can make a suggestion for how to get started, try to run any/all of our 
providers and examples while refusing to look at any source code. Let us know if 
that’s not working out so we can change things up until it does.  Also let us 
know how well the existing providers and examples meet your needs as a social 
data power-user and what opportunities for improvement you see, to help us 
build out the JIRA backlog.  

Cheers! 
Benjamin 
-- 
http://bigbluehat.com/ 
http://linkedin.com/in/benjaminyoung 
