Re: [RESULT][VOTE] Accept Torii into Apache Incubator
On Wed, Dec 2, 2015 at 5:24 PM, Luciano Resende wrote:
> Vote has passed with 7 binding +1 from: Hitesh Shah, Luciano Resende, Sam Ruby, Chris A Mattmann, Jim Jagielski, Reynold Xin, Steve Loughran, and 2 non-binding +1 from Sree V, Luke Han.
>
> There is an issue with the project name; see the discussion at [1]. We will be identifying a new name for the project before we start creating the project infrastructure. I will update the vote thread with the project's new name for historical reasons.
>
> [1] https://www.mail-archive.com/general@incubator.apache.org/msg52224.html
>
> Thank you.

Just an update on the vote thread: we have chosen the name Toree, which currently seems available (see [1] for more details). I'll start working on the podling infrastructure setup soon.

Thank you.

[1] https://www.mail-archive.com/general%40incubator.apache.org/msg52527.html

> On Thu, Nov 26, 2015 at 7:33 AM, Luciano Resende wrote:
>
>> After initial discussion (under the name Spark-Kernel), please vote on the acceptance of the Torii Project for incubation at the Apache Incubator. The full proposal is available at the end of this message and on the wiki at:
>>
>> https://wiki.apache.org/incubator/ToriiProposal
>>
>> Please cast your votes:
>>
>> [ ] +1, bring Torii into Incubator
>> [ ] +0, I don't care either way
>> [ ] -1, do not bring Torii into Incubator, because...
>>
>> Due to the long weekend holiday in the US, I will leave the vote open until December 1st.
>>
>> = Torii =
>>
>> == Abstract ==
>> Torii provides applications with a mechanism to interactively and remotely access Apache Spark.
>>
>> == Proposal ==
>> Torii enables interactive applications to access Apache Spark clusters. More specifically:
>> * Applications can send code snippets and libraries for execution by Spark
>> * Applications can be deployed separately from Spark clusters and communicate with Torii using the provided Torii client
>> * Execution results and streaming data can be sent back to calling applications
>> * Applications no longer have to be network connected to the workers on a Spark cluster because Torii acts as each application's proxy
>> * Work has started on enabling Torii to support languages in addition to Scala, namely Python (with PySpark), R (with SparkR), and SQL (with SparkSQL)
>>
>> == Background & Rationale ==
>> Apache Spark provides applications with a fast and general-purpose distributed computing engine that supports static and streaming data, tabular and graph representations of data, and an extensive set of machine learning libraries. Consequently, a wide variety of applications will be written for Spark: interactive applications that require relatively frequent function evaluations, and batch-oriented applications that require one-shot or only occasional evaluation.
>>
>> Apache Spark provides two mechanisms for applications to connect with Spark. The primary mechanism launches applications on Spark clusters using spark-submit (http://spark.apache.org/docs/latest/submitting-applications.html); this requires developers to bundle their application code plus any dependencies into JAR files, and then submit them to Spark. A second mechanism is an ODBC/JDBC API (http://spark.apache.org/docs/latest/sql-programming-guide.html#distributed-sql-engine), which enables applications to issue SQL queries against SparkSQL.
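As an editorial illustration of the spark-submit bundling workflow described above: the application and its dependencies are packaged into a JAR and handed to the cluster. The application name, entry-point class, and master URL below are hypothetical, and the command is echoed rather than executed so the sketch stands alone without a Spark installation.

```shell
# Sketch of the spark-submit workflow: bundle the application plus its
# dependencies into a JAR, then submit the JAR to the Spark cluster.
# All names, paths, and URLs here are hypothetical examples.
APP_JAR="target/my-analytics-app.jar"        # JAR with app code + dependencies
MAIN_CLASS="com.example.analytics.Main"      # hypothetical entry point
MASTER_URL="spark://spark-master:7077"       # hypothetical cluster master URL

# Assemble the command; echoed instead of run so no Spark install is needed.
SUBMIT_CMD="spark-submit --class $MAIN_CLASS --master $MASTER_URL $APP_JAR"
echo "$SUBMIT_CMD"
```

The per-change JAR rebuild and process fork implied by this workflow is exactly the overhead the proposal cites as motivation for a snippet-oriented alternative.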
>> Our experience when developing interactive applications, such as analytic applications integrated with Notebooks, to run against Spark was that the spark-submit mechanism was overly cumbersome and slow (requiring JAR creation and forking processes to run spark-submit), and the SQL interface was too limiting and did not offer easy access to components other than SparkSQL, such as streaming. The most promising mechanism provided by Apache Spark was the command-line shell (http://spark.apache.org/docs/latest/programming-guide.html#using-the-shell), which enabled us to execute code snippets and dynamically control the tasks submitted to a Spark cluster. Spark does not provide the command-line shell as a consumable service, but it provided us with the starting point from which we developed Torii.
>>
>> == Current Status ==
>> Torii was first developed by a small team working on an internal IBM Spark-related project in July 2014. In recognition of its likely general utility to Spark users and developers, in November 2014 the Torii project was moved to GitHub and made available under the Apache License V2.
>>
>> == Meritocracy ==
>> The current developers are familiar with the meritocratic open source development process at Apache. As the project has gathered interest at GitHub, the developers have actively started a process to invite additional developers into the project, and we have at least one new developer who is ready to contribute code to the project.
>>
>> == Community ==
>> We started building a community around the Torii project when we moved it to GitHub about one year ago. Since then we have grown to about 70 people, and there are regular requests and suggestions from the community. We believe that providing Apache Spark application developers with a general-purpose and interactive API holds a lot of community potential, especially considering possible tie-ins with Notebooks and the data science community.
>>
>> == Core Developers ==
>> The core developers of the project are currently all from IBM, from the IBM Emerging Technology team and from IBM's recently formed Spark Technology Center.
>>
>> == Alignment ==
>> Apache, as the home of Apache Spark, is the most natural home for the Torii project because it was designed to work with Apache Spark and to provide capabilities for interactive applications and data science tools not
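The interaction model the proposal describes — applications sending code snippets for evaluation against a persistent session, rather than bundling JARs per change — can be illustrated with a minimal, purely local sketch. The real Torii kernel forwards snippets to a Spark cluster; here a toy class (all names hypothetical, not the actual Torii API) just evaluates them in a shared namespace so the request/response flow and the persistent REPL-like state are visible.

```python
# Minimal local simulation of the snippet-evaluation model described above.
# The real kernel remotes snippets to Spark; here we evaluate them in a
# shared namespace so state persists across calls, as in a REPL session.

class ToyKernel:  # hypothetical name, not the actual Torii client API
    def __init__(self):
        self.namespace = {}  # session state shared across snippets

    def execute(self, snippet: str):
        """Evaluate one code snippet and return its value (None for statements)."""
        try:
            # Try as an expression first so we can return a result value.
            return eval(snippet, self.namespace)
        except SyntaxError:
            # Fall back to statement execution (assignments, defs, ...).
            exec(snippet, self.namespace)
            return None

kernel = ToyKernel()
kernel.execute("data = [1, 2, 3, 4]")               # statement: defines session state
total = kernel.execute("sum(x * x for x in data)")  # expression: returns 30
print(total)
```

The point of the sketch is the shape of the API: each call is cheap, state carries over between calls, and no packaging step sits between the application and the cluster.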
Re: [VOTE] Accept Torii into Apache Incubator
On Tue, Dec 1, 2015 at 10:24 AM, Steve Loughran wrote:
> Think I've missed the vote window, but
>
> +1 binding
>
> I will repeat what I raised when the proposal first came up, something that wasn't addressed at all: ZeroMQ is LGPL, which is forbidden as a mandatory dependency in ASF projects.
>
> Step 1 of the project is going to have to confirm that the ZeroMQ LGPL + Static Linking Exception is sufficient for it to be allowed as a dependency of the project.

I'd like to encourage ZeroMQ to move to MPL (and I'm willing to help make that case). Given that LGPL is essentially GPL plus a static linking exception, I don't know how LGPL + Static Linking Exception helps; the ZeroMQ licensing page [1] suggests that it is a problem for corporate lawyers to accept. Jim has repeatedly said in various ways that our goal is to be a no-brainer.

> If it's not, then that's going to be a fundamental barrier to releasing Torii as ASF-signed-off artifacts.

- Sam Ruby

[1] http://zeromq.org/area:licensing
[RESULT][VOTE] Accept Torii into Apache Incubator
Vote has passed with 7 binding +1 from: Hitesh Shah, Luciano Resende, Sam Ruby, Chris A Mattmann, Jim Jagielski, Reynold Xin, Steve Loughran, and 2 non-binding +1 from Sree V, Luke Han.

There is an issue with the project name; see the discussion at [1]. We will be identifying a new name for the project before we start creating the project infrastructure. I will update the vote thread with the project's new name for historical reasons.

[1] https://www.mail-archive.com/general@incubator.apache.org/msg52224.html

Thank you.
Re: [VOTE] Accept Torii into Apache Incubator
+1 (non-binding)

Best Regards!
- Luke Han

On Tue, Dec 1, 2015 at 3:39 PM, Sree V wrote:
> +1 (non-binding)
>
> Thanking you. With regards,
> Sree
>
> On Monday, November 30, 2015 3:21 PM, Reynold Xin wrote:
>> +1
>>
>>> On Dec 1, 2015, at 2:08 AM, Luciano Resende wrote:
>>>
>>> And of course, here is my +1 (binding).
Re: [VOTE] Accept Torii into Apache Incubator
Think I've missed the vote window, but

+1 binding

I will repeat what I raised when the proposal first came up, something that wasn't addressed at all: ZeroMQ is LGPL, which is forbidden as a mandatory dependency in ASF projects.

Step 1 of the project is going to have to confirm that the ZeroMQ LGPL + Static Linking Exception is sufficient for it to be allowed as a dependency of the project.

If it's not, then that's going to be a fundamental barrier to releasing Torii as ASF-signed-off artifacts.
Re: [VOTE] Accept Torii into Apache Incubator
+1 (binding)

- Sam Ruby
Re: [VOTE] Accept Torii into Apache Incubator
+1 from me.

++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory, Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattm...@nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++

-----Original Message-----
From: <sa3r...@gmail.com> on behalf of Sam Ruby <ru...@intertwingly.net>
Reply-To: "general@incubator.apache.org" <general@incubator.apache.org>
Date: Monday, November 30, 2015 at 10:58 AM
To: "general@incubator.apache.org" <general@incubator.apache.org>
Subject: Re: [VOTE] Accept Torii into Apache Incubator

> +1 (binding)
>
> - Sam Ruby
Re: [VOTE] Accept Torii into Apache Incubator
And of course, here is my +1 (binding).

On Thu, Nov 26, 2015 at 7:33 AM, Luciano Resende wrote:

> After initial discussion (under the name Spark-Kernel), please vote on the
> acceptance of the Torii Project for incubation at the Apache Incubator.
> [full proposal text trimmed]
Re: [VOTE] Accept Torii into Apache Incubator
+1 (binding)

> On Nov 30, 2015, at 1:08 PM, Luciano Resende wrote:
>
> And of course, here is my +1 (binding).
>
> On Thu, Nov 26, 2015 at 7:33 AM, Luciano Resende wrote:
>
>> After initial discussion (under the name Spark-Kernel), please vote on the
>> acceptance of the Torii Project for incubation at the Apache Incubator.
>> [full proposal text trimmed]
Re: [VOTE] Accept Torii into Apache Incubator
+1

> On Dec 1, 2015, at 2:08 AM, Luciano Resende wrote:
>
> And of course, here is my +1 (binding).
>
> On Thu, Nov 26, 2015 at 7:33 AM, Luciano Resende wrote:
>
>> After initial discussion (under the name Spark-Kernel), please vote on the
>> acceptance of the Torii Project for incubation at the Apache Incubator.
>> [full proposal text trimmed]
Re: [VOTE] Accept Torii into Apache Incubator
+1 (non-binding)

Thanking you. With regards,
Sree

On Monday, November 30, 2015 3:21 PM, Reynold Xin wrote:

+1

> On Dec 1, 2015, at 2:08 AM, Luciano Resende wrote:
>
> And of course, here is my +1 (binding).
>
> On Thu, Nov 26, 2015 at 7:33 AM, Luciano Resende wrote:
>
>> After initial discussion (under the name Spark-Kernel), please vote on the
>> acceptance of the Torii Project for incubation at the Apache Incubator.
>> [full proposal text trimmed]
[VOTE] Accept Torii into Apache Incubator
After initial discussion (under the name Spark-Kernel), please vote on the acceptance of the Torii Project for incubation at the Apache Incubator. The full proposal is available at the end of this message and on the wiki at:

https://wiki.apache.org/incubator/ToriiProposal

Please cast your votes:

[ ] +1, bring Torii into Incubator
[ ] +0, I don't care either way
[ ] -1, do not bring Torii into Incubator, because...

Due to the long weekend holiday in the US, I will leave the vote open until December 1st.

= Torii =

== Abstract ==
Torii provides applications with a mechanism to interactively and remotely access Apache Spark.

== Proposal ==
Torii enables interactive applications to access Apache Spark clusters. More specifically:
* Applications can send code snippets and libraries for execution by Spark
* Applications can be deployed separately from Spark clusters and communicate with Torii using the provided Torii client
* Execution results and streaming data can be sent back to calling applications
* Applications no longer have to be network-connected to the workers on a Spark cluster, because Torii acts as each application's proxy
* Work has started on enabling Torii to support languages in addition to Scala, namely Python (with PySpark), R (with SparkR), and SQL (with SparkSQL)

== Background & Rationale ==
Apache Spark provides applications with a fast and general-purpose distributed computing engine that supports static and streaming data, tabular and graph representations of data, and an extensive set of machine learning libraries. Consequently, a wide variety of applications will be written for Spark: interactive applications that require relatively frequent function evaluations, and batch-oriented applications that require one-shot or only occasional evaluation.

Apache Spark provides two mechanisms for applications to connect with Spark.
The primary mechanism launches applications on Spark clusters using spark-submit (http://spark.apache.org/docs/latest/submitting-applications.html); this requires developers to bundle their application code plus any dependencies into JAR files and then submit them to Spark. A second mechanism is an ODBC/JDBC API (http://spark.apache.org/docs/latest/sql-programming-guide.html#distributed-sql-engine), which enables applications to issue SQL queries against SparkSQL.

Our experience when developing interactive applications to run against Spark, such as analytic applications integrated with notebooks, was that the spark-submit mechanism was overly cumbersome and slow (requiring JAR creation and forking processes to run spark-submit), and that the SQL interface was too limiting and did not offer easy access to components other than SparkSQL, such as streaming. The most promising mechanism provided by Apache Spark was the command-line shell (http://spark.apache.org/docs/latest/programming-guide.html#using-the-shell), which enabled us to execute code snippets and dynamically control the tasks submitted to a Spark cluster. Spark does not provide the command-line shell as a consumable service, but it provided us with the starting point from which we developed Torii.

== Current Status ==
Torii was first developed by a small team working on an internal IBM Spark-related project in July 2014. In recognition of its likely general utility to Spark users and developers, the Torii project was moved to GitHub in November 2014 and made available under the Apache License V2.

== Meritocracy ==
The current developers are familiar with the meritocratic open source development process at Apache. As the project has gathered interest on GitHub, the developers have actively started a process to invite additional developers into the project, and we have at least one new developer who is ready to contribute code to the project.
== Community ==
We started building a community around the Torii project when we moved it to GitHub about one year ago. Since then we have grown to about 70 people, and there are regular requests and suggestions from the community. We believe that providing Apache Spark application developers with a general-purpose and interactive API holds a lot of community potential, especially considering possible tie-ins with notebooks and the data science community.

== Core Developers ==
The core developers of the project are currently all from IBM, from the IBM Emerging Technology team and from IBM's recently formed Spark Technology Center.

== Alignment ==
Apache, as the home of Apache Spark, is the most natural home for the Torii project, because Torii was designed to work with Apache Spark and to provide capabilities for interactive applications and data science tools not provided by Spark itself. Torii also has an affinity with Jupyter (jupyter.org) because it uses the Jupyter protocol for communications, so Jupyter Notebooks can directly use Torii as a kernel for communicating with Apache Spark. However, we believe that the Torii provides a
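[Editor's note: the proposal above states that Torii speaks the Jupyter protocol, so any Jupyter-compatible client can submit code snippets to it as a kernel. The sketch below builds the body of a Jupyter-protocol `execute_request` message that such a client might send; field names follow the Jupyter messaging specification, but the transport layer (ZeroMQ sockets, HMAC signing) is omitted, and the client/session names are illustrative, not part of Torii itself.]

```python
import uuid
from datetime import datetime, timezone

def make_execute_request(code):
    """Build a minimal Jupyter-protocol 'execute_request' message body.

    This is a sketch of the message shape defined by the Jupyter
    messaging spec; a real client would serialize it onto the kernel's
    shell channel with ZeroMQ framing and an HMAC signature.
    """
    return {
        "header": {
            "msg_id": str(uuid.uuid4()),
            "session": str(uuid.uuid4()),
            "username": "example-client",       # illustrative name
            "msg_type": "execute_request",
            "version": "5.0",                   # protocol version
            "date": datetime.now(timezone.utc).isoformat(),
        },
        "parent_header": {},                    # empty for a new request
        "metadata": {},
        "content": {
            "code": code,                       # the snippet the kernel runs
            "silent": False,
            "store_history": True,
            "user_expressions": {},
            "allow_stdin": False,
        },
    }

# A Scala snippet a notebook might send to a Spark-backed kernel:
msg = make_execute_request("sc.parallelize(1 to 10).sum()")
print(msg["header"]["msg_type"])  # execute_request
```

The kernel replies on the same channels with `execute_reply` and streams any output back as separate messages, which is how execution results reach the calling application.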
Re: [VOTE] Accept Torii into Apache Incubator
+1 (binding)

— Hitesh

On Nov 26, 2015, at 7:33 AM, Luciano Resende wrote:

> After initial discussion (under the name Spark-Kernel), please vote on the
> acceptance of the Torii Project for incubation at the Apache Incubator.
> [full proposal text trimmed]