Instead of an epic, we could use the target release? Also, we have a roadmap page on the site; we should either keep it up to date or get rid of it and use the roadmap in JIRA.

On Mon, Jan 16, 2017 at 6:20 PM <dusenberr...@gmail.com> wrote:
> Now that we've had some discussion here, it would be good to transfer this
> discussion into a JIRA epic, containing sub tasks. That way, we can
> properly track our progress on these items and facilitate contributions
> from the community. Note that some of the sub tasks may already exist as
> individual issues.
> >
> > Would anyone in the community like to volunteer for creating these issues?
> >
> > - Mike
> >
> > --
> >
> > Mike Dusenberry
> > GitHub: github.com/dusenberrymw
> > LinkedIn: linkedin.com/in/mikedusenberry
> >
> > Sent from my iPhone.
> >
> >
> > > On Jan 4, 2017, at 6:00 PM, dusenberr...@gmail.com wrote:
> > >
> > > Overall, this is a good list of items that should be worked on,
> > > particularly because it contains several user-facing items. However, to
> > > echo what Luciano said, I'm also concerned about the timeline. At this
> > > stage, I agree that we need to release more often, and with a more
> > > user-oriented "product" focus as a guide for timelines. I.e. we should
> > > orient our release timelines around items that focus on the "product" of
> > > allowing the user to work on a wide range of ML problems in a simple and
> > > easy manner on top of Spark.
> > >
> > > With that in mind, I agree that a focus on a subset of (1) and (2) would
> > > be good for an immediate release, with a particular focus on Spark 2.0
> > > support as a priority.
> > >
> > > How about we aim for a February 1st release date for the initial items?
> > >
> > > -Mike
> > >
> > > --
> > >
> > > Mike Dusenberry
> > > GitHub: github.com/dusenberrymw
> > > LinkedIn: linkedin.com/in/mikedusenberry
> > >
> > > Sent from my iPhone.
> > >
> > >
> > >> On Jan 3, 2017, at 4:17 PM, Niketan Pansare <npan...@us.ibm.com> wrote:
> > >>
> > >> Hi Matthias,
> > >>
> > >> Thanks for the detailed roadmap.
> > >>
> > >> +1 for all the items with few modifications.
> > >>
> > >> 1) APIs and Language:
> > >> * Cleanup new MLContext (matrix/frame data types, move tests, etc)
> > >> >> Ensure Python and Scala MLContext have same API capability.
> > >>
> > >> * Remove old MLContext
> > >> * Consolidate MLContext and JMLC
> > >> * Full support for Scala/Python DSLs
> > >> >> +1 for Python DSL except for push-down of loop structures and functions.
> > >>
> > >> * Remove old file-based transform
> > >> * Scala/Python wrappers for all existing algorithms
> > >> * Data converters (additional formats: e.g., libsvm; performance)
> > >>
> > >> 2) Updated Dependencies:
> > >> * Spark 2.0 support
> > >> * Matrix block library (isolated jar)
> > >>
> > >> 3) Compiler/Runtime Features:
> > >> * GPU support (full compiler and runtime support)
> > >> >> Can we break this down into phases:
> > >> >> https://issues.apache.org/jira/browse/SYSTEMML-445 ? We can discuss
> > >> >> the timeline of the phases in the JIRA.
> > >>
> > >> * Compressed linear algebra v2
> > >> * Code generation (automatic operator fusion)
> > >> * Extended parfor (full spark exploitation, micro-batch support)
> > >> * Scale-up architecture (large dense blocks, numa)?
> > >>
> > >> 4) Tools
> > >> * Extended stats (task locality, shuffle, etc)
> > >> * Cloud resource advisor (extended resource optimizer)?
> > >>
> > >> 5) Algorithms
> > >> * Graduate "staging" algorithms (robustness/performance)
> > >> * Perftest: include all algorithms into automated performance tests
> > >> >> via spark-submit + via Scala/Python wrappers
> > >>
> > >> * Simplify usage decision trees, random forest, mlogreg, msvm
> > >> (preprocessing, label representation, etc)
> > >> >> + command-line variable naming. For example: maxi, maxiter, etc.
> > >>
> > >> Thanks,
> > >>
> > >> Niketan Pansare
> > >> IBM Almaden Research Center
> > >> E-mail: npansar At us.ibm.com
> > >> http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar
> > >>
> > >> Matthias Boehm ---01/03/2017 02:44:39 PM---Yes indeed, most of (3) and
> > >> (4) can be done incrementally. For (5), some of the changes might also
> > >>
> > >> From: Matthias Boehm <mboe...@googlemail.com>
> > >> To: dev@systemml.incubator.apache.org
> > >> Date: 01/03/2017 02:44 PM
> > >> Subject: Re: [DISCUSS] Roadmap SystemML 1.0
> > >>
> > >>
> > >>
> > >> Yes indeed, most of (3) and (4) can be done incrementally. For (5), some
> > >> of the changes might also modify the signature of algorithms (i.e.,
> > >> parameters and required input data) but it would help, for example with
> > >> decision trees, as users no longer need to dummy code their inputs.
> > >>
> > >> Generally, I'm fine with making (3), (4), and part of (5) optional and
> > >> let the "must-have" features from (1) and (2) determine the timeline.
> > >>
> > >> Regards,
> > >> Matthias
> > >>
> > >> On 1/3/2017 11:27 PM, Luciano Resende wrote:
> > >> > On Tue, Jan 3, 2017 at 11:50 AM, Matthias Boehm <mboe...@googlemail.com>
> > >> > wrote:
> > >> >
> > >> >> I'd like to initiate the discussion of a concrete roadmap for our next
> > >> >> release. According, to previous discussions, I'd think it's fair to say
> > >> >> that we agree on calling it SystemML 1.0. We should carefully plan this
> > >> >> release as it's an opportunity to change APIs and remove some older
> > >> >> deprecated features. I'd like to encourage not just developers but also
> > >> >> the broader community to participate in this discussion.
> > >> >>
> > >> >> Personally, I think a target date of Q2/2017 is realistic. Let's start
> > >> >> with collecting the major features and changes that potentially affect
> > >> >> users. Here is an initial list, but please feel free to add and up- or
> > >> >> down-vote the individual items.
> > >> >>
> > >> >> 1) APIs and Language:
> > >> >> * Cleanup new MLContext (matrix/frame data types, move tests, etc)
> > >> >> * Remove old MLContext
> > >> >> * Consolidate MLContext and JMLC
> > >> >> * Full support for Scala/Python DSLs
> > >> >> * Remove old file-based transform
> > >> >> * Scala/Python wrappers for all existing algorithms
> > >> >> * Data converters (additional formats: e.g., libsvm; performance)
> > >> >>
> > >> >> 2) Updated Dependencies:
> > >> >> * Spark 2.0 support
> > >> >> * Matrix block library (isolated jar)
> > >> >>
> > >> >> 3) Compiler/Runtime Features:
> > >> >> * GPU support (full compiler and runtime support)
> > >> >> * Compressed linear algebra v2
> > >> >> * Code generation (automatic operator fusion)
> > >> >> * Extended parfor (full spark exploitation, micro-batch support)
> > >> >> * Scale-up architecture (large dense blocks, numa)?
> > >> >>
> > >> >> 4) Tools
> > >> >> * Extended stats (task locality, shuffle, etc)
> > >> >> * Cloud resource advisor (extended resource optimizer)?
> > >> >>
> > >> >> 5) Algorithms
> > >> >> * Graduate "staging" algorithms (robustness/performance)
> > >> >> * Perftest: include all algorithms into automated performance tests
> > >> >> * Simplify usage decision trees, random forest, mlogreg, msvm
> > >> >> (preprocessing, label representation, etc)
> > >> >>
> > >> >> Items marked with a ? can potentially be moved out to subsequent
> > >> >> releases.
> > >> >>
> > >> >>
> > >> >> Regards,
> > >> >> Matthias
> > >> >>
> > >> >
> > >> > My understanding is that most of the items in 1 and 2 are going to break
> > >> > backward compatibility, while the others can be done incrementally. Is this
> > >> > assumption correct? If so, can we finish 1 and 2 and do a 1.0 release. and
> > >> > them, continue with 3, 4, 5, etc ? as I don't think we should wait for
> > >> > 2017/Q2 to do a 1.0 release. I believe in release early, release often,
> > >> > particularly to attract new users, that can help verifying and contributing
> > >> > to specific releases.
> > >> >
> > >> > Thoughts ?
> > >> >
> > >>
> > >>
> > >>
> > >>
> >
--
Sent from my Mobile device