Re: Granary & SocketHub
Then it sounds like we're decidedly on the same page...and now to the work! ;)

Also (for later): http://www.hydra-cg.com/

It's more or less what you're after, I think: descriptions of endpoints and HTTP-based actions to be taken on them. There are certainly others like this out there also, but most are more limited and more focused on documenting or describing one's own API vs. doing that same thing for any API anywhere + Linked Data. ^_^

Thanks, Steve. Excited!
Benjamin

From: sblackmon
Sent: Thursday, October 20, 2016 3:06:21 PM
To: dev@streams.incubator.apache.org; Benjamin Young
Cc: Matt Franklin
Subject: Re: Granary & SocketHub
Re: Granary & SocketHub
On October 20, 2016 at 1:33:33 PM, Benjamin Young (byo...@bigbluehat.com) wrote:

> Great points, Steven. What's always attracted me to Apache Streams is its
> descriptiveness (via JSON Schema documents) vs. prescriptive-ness.
> Granary's approach is (currently? ;) ) more prescriptive:
> https://github.com/snarfed/granary/blob/master/granary/twitter.py
> vs.
> https://github.com/apache/streams/tree/STREAMS-26/streams-contrib/streams-provider-twitter
> ...which is mostly (though not all) a collection of .json and .conf files
> with a handful of .java files needed (afaict) for last-mile integration
> with one's tool.
>
> The future I dream about is one where I can pick my tool for my
> idiosyncratic language, operating system, license reasons, but they'll all
> work off shared, descriptive "knowledge" documents.

I think we (the active streams developers) agree that data descriptor and translation rules should ideally be:

a) human-readable
b) machine-readable
c) programming language-agnostic
d) internationalized
e) compliant with web standards, or community standards where suitable web standards don’t exist

With regard to operating system flexibility, Java byte-code and Docker containers are as universal as any other run-time platform, with the possible exception of JavaScript.

Apache License 2.0 is about as permissive as they come, so hopefully no concerns there?

> Otherwise, we're all pulling separately, and end up with snowflake systems
> to process snowflake APIs. However, I also know it's unlikely everyone will
> come "under one roof" to work on things. My hope, though, is that the
> output of this group (and Granary and Sockethub and...) will be re-usable
> by as wide an audience as possible--hence the value of description over
> prescription (at least in my book ;) ).

A major driver behind using JSON schemas and HOCON snippets so widely throughout the project, and the reason we push all of them onto the website with each snapshot and each release, is to facilitate re-use and re-mixing of those artifacts in other projects using URIs. Ideally it should be possible for some other project to build a system with black-box behavior identical to Apache Streams in another language with much less code by piggy-backing on these public resources. This could even extend to the mechanics of collection, if we specified how to submit and parse HTTP requests with text resources rather than using Java SDKs.

We’ve got a long way to go to reach this objective, but philosophically I’m all for maximizing the re-usability of this codebase and supporting complementary efforts.

> Granted, if I'm barking up the wrong tree (again), I'm happy to wander
> off...

Please don’t! Stay and help!

> Is anything in the above sane? ;)

I think so!

P.S. I’m hopeful that bringing this code-base into compliance with AS 2.0 and JSON-LD will open up the world of RDF tools and specs and allow more of what’s currently expressible only with source code to be expressed with W3C standards-compliant resource files published on the web going forward.

> Cheers!
> Benjamin
>
> --
> http://bigbluehat.com/
> http://linkedin.com/in/benjaminyoung
Re: Granary & SocketHub
Great points, Steven. What's always attracted me to Apache Streams is its descriptiveness (via JSON Schema documents) vs. prescriptive-ness. Granary's approach is (currently? ;) ) more prescriptive:
https://github.com/snarfed/granary/blob/master/granary/twitter.py
vs.
https://github.com/apache/streams/tree/STREAMS-26/streams-contrib/streams-provider-twitter
...which is mostly (though not all) a collection of .json and .conf files with a handful of .java files needed (afaict) for last-mile integration with one's tool.

The future I dream about is one where I can pick my tool for my idiosyncratic language, operating system, license reasons, but they'll all work off shared, descriptive "knowledge" documents. Otherwise, we're all pulling separately, and end up with snowflake systems to process snowflake APIs.

However, I also know it's unlikely everyone will come "under one roof" to work on things. My hope, though, is that the output of this group (and Granary and Sockethub and...) will be re-usable by as wide an audience as possible--hence the value of description over prescription (at least in my book ;) ).

Granted, if I'm barking up the wrong tree (again), I'm happy to wander off...

Is anything in the above sane? ;)

Cheers!
Benjamin

--
http://bigbluehat.com/
http://linkedin.com/in/benjaminyoung

From: sblackmon
Sent: Thursday, October 20, 2016 1:26:38 PM
To: dev@streams.incubator.apache.org
Cc: Matt Franklin; Benjamin Young
Subject: Re: Granary & SocketHub
Re: Granary & SocketHub
On October 18, 2016 at 6:09:49 PM, Matt Franklin (m.ben.frank...@gmail.com) wrote:

> On Tue, Oct 18, 2016 at 10:39 AM Benjamin Young wrote:
>
> > (resending from the correct account…likely the other got spammed…)
> >
> > Granary is a project with similar ideas and intents as Apache Streams
> > (which also needs AS2 support ;) ):
> > https://github.com/snarfed/granary

Ryan from Granary is on the list I think. Hey Ryan!

Cool stuff, too bad it’s Python :)

> > In fact Apache Streams gets a mention in their “Related Work” section:
> > https://github.com/snarfed/granary#related-work
> >
> > Also mentioned in the Granary related work section is SocketHub:
> > https://github.com/sockethub/sockethub

Cool stuff, too bad it’s LGPL :)

> > Its aims are similar, but it’s reaching way beyond Web-based social APIs
> > and “back” to including things like IRC, Email, etc.

Non-SNS data sources are important for sure. I’ve posted some work on my personal GitHub using the streams framework to parse MBOX files - https://github.com/steveblackmon/streams-apache - and to collect quantified self data - https://github.com/steveblackmon/humanapi-streams

IRC is interesting as well.

> > What’s significant about both these projects (and others they link to) are
> > the stories they’re telling developers--which we can crib from as we think
> > about the Streams “pitch.” They also have relatively minimal setup
> > docs--which Streams is also heading toward (go Steve!).
>
> Agreed, this is key.

The existence of other open-source projects with similar themes suggests we’re onto an important problem. We should pay attention to these projects and what is working for them WRT user growth, community growth, tech media coverage, etc...

> > Again, my key objective is to understand the Apache Streams vision
> > alongside projects like these and within the wider space of consolidating
> > social data. What market does it serve? Is it “personal” (as these
> > projects seem to be)? Or commercial? Or developer-only (library/framework
> > for wiring up your own idiosyncratic stuff)?
>
> I think the overall objective of streams remains very similar to what it
> started as: A way to easily and flexibly ingest multiple different sources
> of 'activity' data in a normalized ActivityStreams format. For me
> personally, my interest is in ingesting this data at scale and with as
> little internally-maintained code as possible.

While most of the development so far has been geared toward enabling back-end / commercial-scale data collection and management, I think the future should be more about enabling individuals and businesses to transcend data silos using computing resources and code entirely under their own control. This might mean supporting regular users with a full-featured SaaS application in addition to continued work on data interoperability.

> > Thanks for reading, pondering, and helping me help. :)
> >
> > Cheers!
> > Benjamin
> >
> > --
> > http://bigbluehat.com/
> > http://linkedin.com/in/benjaminyoung
Re: Granary & SocketHub
On Tue, Oct 18, 2016 at 10:39 AM Benjamin Young wrote:

> (resending from the correct account…likely the other got spammed…)
>
> Granary is a project with similar ideas and intents as Apache Streams
> (which also needs AS2 support ;) ):
> https://github.com/snarfed/granary
>
> In fact Apache Streams gets a mention in their “Related Work” section:
> https://github.com/snarfed/granary#related-work
>
> Also mentioned in the Granary related work section is SocketHub:
> https://github.com/sockethub/sockethub
>
> Its aims are similar, but it’s reaching way beyond Web-based social APIs
> and “back” to including things like IRC, Email, etc.
>
> What’s significant about both these projects (and others they link to) are
> the stories they’re telling developers--which we can crib from as we think
> about the Streams “pitch.” They also have relatively minimal setup
> docs--which Streams is also heading toward (go Steve!).

Agreed, this is key.

> Again, my key objective is to understand the Apache Streams vision
> alongside projects like these and within the wider space of consolidating
> social data. What market does it serve? Is it “personal” (as these
> projects seem to be)? Or commercial? Or developer-only (library/framework
> for wiring up your own idiosyncratic stuff)?

I think the overall objective of streams remains very similar to what it started as: A way to easily and flexibly ingest multiple different sources of 'activity' data in a normalized ActivityStreams format. For me personally, my interest is in ingesting this data at scale and with as little internally-maintained code as possible.

> Thanks for reading, pondering, and helping me help. :)
>
> Cheers!
> Benjamin
>
> --
> http://bigbluehat.com/
> http://linkedin.com/in/benjaminyoung