Re: New Google Summer of Code 2017 Student - Krishna Kalyan

2017-05-10 Thread Deron Eriksson
Welcome Krishna!

Deron


On Wed, May 10, 2017 at 7:20 PM, <dusenberr...@gmail.com> wrote:

> Welcome, Krishna!  Looking forward to working with you!  For a bit of my
> background related to the project, I've been heavily focused on deep
> learning by building a DML library for DL (in `scripts/nn`) and working on
> an applied DL project (in `projects/breast_cancer`).  I've also worked on
> the engine optimizer a bit, added a few new built-in ops to the engine, and
> run the perf tests previously.
>
> -Mike
>
> --
>
> Mike Dusenberry
> GitHub: github.com/dusenberrymw
> LinkedIn: linkedin.com/in/mikedusenberry
>
> Sent from my iPhone.
>
>
> > On May 6, 2017, at 3:18 AM, Arvind Surve <ac...@yahoo.com.INVALID>
> wrote:
> >
> > Welcome Krishna
> >  Arvind Surve | Spark Technology Center  | http://www.spark.tc/
> >
> >  From: Niketan Pansare <npan...@us.ibm.com>
> > To: dev@systemml.incubator.apache.org
> > Sent: Friday, May 5, 2017 3:45 PM
> > Subject: Re: New Google Summer of Code 2017 Student - Krishna Kalyan
> >
> > Welcome Krishna !!
> >
> >
> >
> > From: Krishna Kalyan <krishnakaly...@gmail.com>
> > To: Nakul Jindal <naku...@gmail.com>
> > Cc: dev@systemml.incubator.apache.org
> > Date: 05/05/2017 03:36 PM
> > Subject: Re: New Google Summer of Code 2017 Student - Krishna Kalyan
> >
> >
> >
> > Thank you so much,
> > Looking forward to working with everyone in this community. Thank you for
> > all the feedback and this amazing opportunity.
> >
> > Regards,
> > Krishna
> >
> >
> >
> >
> >
> > On May 5, 2017 19:05, "Nakul Jindal" <naku...@gmail.com> wrote:
> >
> > Hi All,
> >
> > Let us all welcome Krishna Kalyan as a student of Google Summer of Code to
> > work on SystemML.
> > He will be working on automating the performance testing process of
> > SystemML.
> >
> > His project proposal is attached and the JIRA tracking his project can be
> > found at https://issues.apache.org/jira/browse/SYSTEMML-1451
> >
> > He has already been active with the community
> > (https://www.mail-archive.com/dev@systemml.incubator.apache.org/msg01209.html)
> > since January.
> >
> > @Krishna - Even though I am officially the mentor, I encourage you to
> > address questions to various members of the community with issues you
> > encounter throughout the project. Dig through Pull Requests and
> > discussions to figure out who is familiar with which components.
> >
> > (I can help a bit with my background - I have worked on the DML grammar and
> > ANTLR parser layer previously and am working on the GPU backend now. I also
> > ran the perf tests and am somewhat familiar with the work needed to
> > automate it.)
> >
> > Welcome!
> >
> > -Nakul
> >
> >
> >
> >
> >
>



-- 
Deron Eriksson
Spark Technology Center
http://www.spark.tc/


Re: Standard code styles for DML and Java?

2017-05-03 Thread Deron Eriksson
Hi Matthias,

I like your suggestion of space indentation for DML scripts and tab
indentation for Java. I definitely support this and think this would be a
great way to go.

I also really like the idea of standardizing other aspects of our Java.

For inline formatting in Java, we might want to use // @formatter:off/on
comments  (
http://stackoverflow.com/questions/1820908/how-to-turn-off-the-eclipse-code-formatter-for-certain-sections-of-java-code),
since occasionally inline formatting can be very useful for readability
(such as DMLScript.DMLOptions).

Deron



On Tue, May 2, 2017 at 5:25 PM, Matthias Boehm <mboe...@googlemail.com>
wrote:

> thanks Deron for centralizing this discussion, as this could help to avoid
> redundancy spread across many individual JIRAs and PRs. Overall, I think it
> would be good to agree on individual style guides for DML and Java.
>
> I'm fine with using spaces for DML scripts because they are rarely changed
> once written. However, for Java, I'd strongly prefer tabs for indentation
> because tabs are (1) faster to navigate, and (2) allow to configure the dev
> environment according to subjective preferences. For inline formatting both
> should use spaces though.
>
> Finally, I would recommend also including common inconsistencies such as
> exception handling (catch all vs redundant error messages),
> hashcode/equals, unnecessary branches, etc.
>
> Regards,
> Matthias
>
>
> On 5/2/2017 7:15 PM, Deron Eriksson wrote:
>
>> Recently Matthias, Mike, and I discussed the issue of DML code style on
>> SYSTEMML-1406 (https://issues.apache.org/jira/browse/SYSTEMML-1406). We
>> also have an issue regarding Java code style on SYSTEMML-137 (
>> https://issues.apache.org/jira/browse/SYSTEMML-137).
>>
>> In the discussion on SYSTEMML-1406, it sounds like Matthias, Mike, and I
>> all see value in having a consistent style, although individual preferences
>> differ. I would like to start a short discussion to see if we could apply
>> common style standards to our Java and DML files.
>>
>> WRT Java, perhaps the Google Style Guide (
>> https://google.github.io/styleguide/javaguide.html) would be a good place
>> to start.
>> https://github.com/google/styleguide/blob/gh-pages/eclipse-java-google-style.xml
>> https://github.com/google/styleguide/blob/gh-pages/intellij-java-google-style.xml
>>
>> We could use these Eclipse/IntelliJ Java style templates as a base and
>> modify them for any changes we agree upon (for example, tabs vs spaces for
>> indentation). We could then check these templates into our project so that
>> everyone who contributes to SystemML can apply the common style to code,
>> thus adding consistency to the project.
>>
>> WRT DML, the main issue we discussed was tabs vs spaces for indentation.
>>
>> Some options I see are:
>> 1) No official DML/Java styles
>> 2) DML/Java styles (use spaces for indents, with style guide as basis for
>> Java)
>> 3) DML/Java styles (use tabs for indents, with style guide as basis for
>> Java)
>>
>> Although I would prefer 2), I would be happy with 3) as an improvement from
>> our existing 1). We could also have alternate options such as spaces for
>> DML and tabs for Java.
>>
>> Thoughts?
>> Deron
>>
>>


-- 
Deron Eriksson
Spark Technology Center
http://www.spark.tc/


Re: Podling Report Reminder - May 2017

2017-05-02 Thread Deron Eriksson
Thanks, Henry, for the updates to the report and for signing off on it.

Deron


On Mon, May 1, 2017 at 11:26 PM, Henry Saputra <henry.sapu...@gmail.com>
wrote:

> I am changing the report a bit from:
>
> 2017-05-01 Felix Schüler (Committer and PMC)
>
> to
>
> 2017-05-01 Felix Schüler (PPMC)
>
> because a PPMC member is by default a committer too, and changing PMC to
> PPMC since we are still a podling.
>
> - Henry
>
> On Mon, May 1, 2017 at 2:55 PM, Deron Eriksson <deroneriks...@gmail.com>
> wrote:
>
> > Thanks Henry! Note that the original podling report template did not
> > include you as a mentor because we hadn't added you to podlings.xml yet.
> I
> > queried the general incubator list and updated podlings.xml last Friday.
> >
> > I've created an initial May podling report for SystemML (see
> > https://wiki.apache.org/incubator/May2017).
> >
> > If any PMC members would like to make updates (such as mentioning papers
> > and presentations), please feel free to make any necessary updates to the
> > wiki. If you do not have access to the wiki, please request access on the
> > general incubator list.
> >
> > Thanks
> > Deron
> >
> >
> > On Mon, May 1, 2017 at 1:59 PM, Henry Saputra <henry.sapu...@gmail.com>
> > wrote:
> >
> > > > Thanks, Deron, for taking a stab at it
> > >
> > > On Mon, May 1, 2017 at 11:23 AM, Deron Eriksson <
> deroneriks...@gmail.com
> > >
> > > wrote:
> > >
> > > > I volunteer to create the SystemML podling report for May.
> > > >
> > > > Deron
> > > >
> > > >
> > > > On Fri, Apr 28, 2017 at 2:31 PM, Deron Eriksson <
> > deroneriks...@gmail.com
> > > >
> > > > wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > > Would anyone else care to volunteer to create the SystemML podling
> > > > > > report? If there are no volunteers, I will volunteer, but since
> > > > > > SystemML is a community effort, it is good for others to be involved
> > > > > > in the process. Note that podling reports are an important part of
> > > > > > the incubation process, as can be seen from the thread on the general
> > > > > > incubator list concerning Sirona
> > > > > > (https://www.mail-archive.com/general@incubator.apache.org/msg59362.html).
> > > > >
> > > > > Deron
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > On Wed, Apr 26, 2017 at 5:41 PM, <johndam...@apache.org> wrote:
> > > > >
> > > > >> Dear podling,
> > > > >>
> > > > >> This email was sent by an automated system on behalf of the Apache
> > > > >> Incubator PMC. It is an initial reminder to give you plenty of
> time
> > to
> > > > >> prepare your quarterly board report.
> > > > >>
> > > > >> The board meeting is scheduled for Wed, 17 May 2017, 10:30 am PDT.
> > > > >> The report for your podling will form a part of the Incubator PMC
> > > > >> report. The Incubator PMC requires your report to be submitted 2
> > weeks
> > > > >> before the board meeting, to allow sufficient time for review and
> > > > >> submission (Wed, May 03).
> > > > >>
> > > > >> Please submit your report with sufficient time to allow the
> > Incubator
> > > > >> PMC, and subsequently board members to review and digest. Again,
> the
> > > > >> very latest you should submit your report is 2 weeks prior to the
> > > board
> > > > >> meeting.
> > > > >>
> > > > >> Thanks,
> > > > >>
> > > > >> The Apache Incubator PMC
> > > > >>
> > > > >> Submitting your Report
> > > > >>
> > > > >> --
> > > > >>
> > > > >> Your report should contain the following:
> > > > >>
> > > > >> *   Your project name
> > > > >> *   A brief description of your project, which assumes no
> knowledge
> > of

Re: [NOTICE] New Apache SystemML Committer and PPMC Member

2017-05-01 Thread Deron Eriksson
Congratulations and welcome, Felix!

Deron


On Mon, May 1, 2017 at 4:27 PM, <dusenberr...@gmail.com> wrote:

> Welcome, Felix!
>
> --
>
> Mike Dusenberry
> GitHub: github.com/dusenberrymw
> LinkedIn: linkedin.com/in/mikedusenberry
>
> Sent from my iPhone.
>
>
> > On May 1, 2017, at 4:23 PM, Niketan Pansare <npan...@us.ibm.com> wrote:
> >
> > Congratulations Felix !!
> >
> >
> >
> > From: Luciano Resende <luckbr1...@gmail.com>
> > To: dev@systemml.incubator.apache.org, Arvind Surve <ac...@yahoo.com>
> > Date: 05/01/2017 04:21 PM
> > Subject: Re: [NOTICE] New Apache SystemML Committer and PPMC Member
> >
> >
> >
> >
> > Welcome Felix.
> >
> > On Mon, May 1, 2017 at 4:18 PM, Arvind Surve <ac...@yahoo.com.invalid>
> > wrote:
> >
> > > I would like to welcome Felix Schueler as a new
> > > Committer and PPMC member of Apache SystemML.
> > >
> > > Thanks for all your work, and welcome !!!
> > >
> > >  Arvind Surve | Spark Technology Center  | http://www.spark.tc/
> >
> >
> >
> >
> > --
> > Luciano Resende
> > http://twitter.com/lresende1975
> > http://lresende.blogspot.com/
> >
> >
> >
>



-- 
Deron Eriksson
Spark Technology Center
http://www.spark.tc/


Re: Podling Report Reminder - May 2017

2017-05-01 Thread Deron Eriksson
Thanks Henry! Note that the original podling report template did not
include you as a mentor because we hadn't added you to podlings.xml yet. I
queried the general incubator list and updated podlings.xml last Friday.

I've created an initial May podling report for SystemML (see
https://wiki.apache.org/incubator/May2017).

If any PMC members would like to make updates (such as mentioning papers
and presentations), please feel free to make any necessary updates to the
wiki. If you do not have access to the wiki, please request access on the
general incubator list.

Thanks
Deron


On Mon, May 1, 2017 at 1:59 PM, Henry Saputra <henry.sapu...@gmail.com>
wrote:

> Thanks, Deron, for taking a stab at it
>
> On Mon, May 1, 2017 at 11:23 AM, Deron Eriksson <deroneriks...@gmail.com>
> wrote:
>
> > I volunteer to create the SystemML podling report for May.
> >
> > Deron
> >
> >
> > On Fri, Apr 28, 2017 at 2:31 PM, Deron Eriksson <deroneriks...@gmail.com
> >
> > wrote:
> >
> > > Hi,
> > >
> > > Would anyone else care to volunteer to create the SystemML podling report?
> > > If there are no volunteers, I will volunteer, but since SystemML is a
> > > community effort, it is good for others to be involved in the process. Note
> > > that podling reports are an important part of the incubation process, as
> > > can be seen from the thread on the general incubator list concerning Sirona
> > > (https://www.mail-archive.com/general@incubator.apache.org/msg59362.html).
> > >
> > > Deron
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > > On Wed, Apr 26, 2017 at 5:41 PM, <johndam...@apache.org> wrote:
> > >
> > >> Dear podling,
> > >>
> > >> This email was sent by an automated system on behalf of the Apache
> > >> Incubator PMC. It is an initial reminder to give you plenty of time to
> > >> prepare your quarterly board report.
> > >>
> > >> The board meeting is scheduled for Wed, 17 May 2017, 10:30 am PDT.
> > >> The report for your podling will form a part of the Incubator PMC
> > >> report. The Incubator PMC requires your report to be submitted 2 weeks
> > >> before the board meeting, to allow sufficient time for review and
> > >> submission (Wed, May 03).
> > >>
> > >> Please submit your report with sufficient time to allow the Incubator
> > >> PMC, and subsequently board members to review and digest. Again, the
> > >> very latest you should submit your report is 2 weeks prior to the board
> > >> meeting.
> > >>
> > >> Thanks,
> > >>
> > >> The Apache Incubator PMC
> > >>
> > >> Submitting your Report
> > >>
> > >> --
> > >>
> > >> Your report should contain the following:
> > >>
> > >> *   Your project name
> > >> *   A brief description of your project, which assumes no knowledge of
> > >> the project or necessarily of its field
> > >> *   A list of the three most important issues to address in the move
> > >> towards graduation.
> > >> *   Any issues that the Incubator PMC or ASF Board might wish/need to be
> > >> aware of
> > >> *   How has the community developed since the last report
> > >> *   How has the project developed since the last report.
> > >> *   How does the podling rate their own maturity.
> > >>
> > >> This should be appended to the Incubator Wiki page at:
> > >>
> > >> https://wiki.apache.org/incubator/May2017
> > >>
> > >> Note: This is manually populated. You may need to wait a little before
> > >> this page is created from a template.
> > >>
> > >> Mentors
> > >> ---
> > >>
> > >> Mentors should review reports for their project(s) and sign them off on
> > >> the Incubator wiki page. Signing off reports shows that you are
> > >> following the project - projects that are not signed may raise alarms
> > >> for the Incubator PMC.
> > >>
> > >> Incubator PMC
> > >>
> > >
> > >
> > >
> > > --
> > > Deron Eriksson
> > > Spark Technology Center
> > > http://www.spark.tc/
> > >
> > >
> >
> >
> > --
> > Deron Eriksson
> > Spark Technology Center
> > http://www.spark.tc/
> >
>



-- 
Deron Eriksson
Spark Technology Center
http://www.spark.tc/


Re: Build passed/failed messages for pull requests

2017-05-01 Thread Deron Eriksson
Would anyone else in our SystemML community care to comment?

So far we have a tie:
  3 for option 2 - Dusenberry, Jindal, and Eriksson
  3 for option 3 - Boehm, Surve, and Weidner

Deron



On Fri, Apr 28, 2017 at 12:53 PM, <dusenberr...@gmail.com> wrote:

> I would prefer option 2.
>
> --
>
> Mike Dusenberry
> GitHub: github.com/dusenberrymw
> LinkedIn: linkedin.com/in/mikedusenberry
>
> Sent from my iPhone.
>
>
> > On Apr 28, 2017, at 12:40 PM, Glenn Weidner <gweid...@us.ibm.com> wrote:
> >
> > My preference is option 3.
> >
> > Thanks,
> > Glenn
> >
> >
> >
> > From: Arvind Surve <ac...@yahoo.com.INVALID>
> > To: "dev@systemml.incubator.apache.org" <dev@systemml.incubator.apache.org>
> > Date: 04/28/2017 11:09 AM
> > Subject: Re: Build passed/failed messages for pull requests
> >
> >
> >
> >
> > Agree, these messages are distractions.
> >  Arvind Surve | Spark Technology Center  | http://www.spark.tc/
> >
> >  From: Matthias Boehm <mboe...@googlemail.com>
> > To: dev@systemml.incubator.apache.org
> > Sent: Friday, April 28, 2017 11:05 AM
> > Subject: Re: Build passed/failed messages for pull requests
> >
> > as I commented on one of these github comments, I'm strongly against
> > these kinds of unnecessary messages because they distract from the actual
> > discussions. I already had to change my notification settings
> > accordingly - essentially I'm not watching SystemML's PR activity any
> > more.
> >
> > Regards,
> > Matthias
> >
> > On 4/28/2017 10:42 AM, Deron Eriksson wrote:
> > > Hi,
> > >
> > > When a pull request is created or another commit is pushed to that pull
> > > request, a build including running our test suite is performed (Jenkins at
> > > https://sparktc.ibmcloud.com/jenkins/job/SystemML-PullRequestBuilder/).
> > > This is the same model that other projects such as Apache Spark use
> > > (Jenkins at
> > > https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/).
> > >
> > > A few days ago, automated build passed/failed pull request messages were
> > > introduced to our pull requests, following the same type of Spark model.
> > > A) SystemML example: https://github.com/apache/incubator-systemml/pull/442
> > > B) Spark example: https://github.com/apache/spark/pull/17765
> > >
> > > Personally I like these messages because they automatically tell
> > > contributors the build status of their pull requests and give them a
> > > direct link to the build/test results. An opposing viewpoint would be
> > > that these messages are somewhat like spam.
> > >
> > > So we should make a public decision on the mailing list what to do about
> > > these automated build status messages.
> > >
> > > Some options:
> > > (1) keep the automated messages exactly as they are
> > > (2) keep the automated messages, but consolidate the two messages into one
> > > (such as "Build successful" and "Refer to this link...").
> > > (3) get rid of the automated messages
> > >
> > > I like (2). Any other opinions or options?
> > >
> > > Thoughts?
> > >
> > > Deron
> > >
> > >
> >
> >
> >
> >
> >
>



-- 
Deron Eriksson
Spark Technology Center
http://www.spark.tc/


Re: Podling Report Reminder - May 2017

2017-05-01 Thread Deron Eriksson
I volunteer to create the SystemML podling report for May.

Deron


On Fri, Apr 28, 2017 at 2:31 PM, Deron Eriksson <deroneriks...@gmail.com>
wrote:

> Hi,
>
> Would anyone else care to volunteer to create the SystemML podling report?
> If there are no volunteers, I will volunteer, but since SystemML is a
> community effort, it is good for others to be involved in the process. Note
> that podling reports are an important part of the incubation process, as
> can be seen from the thread on the general incubator list concerning Sirona
> (https://www.mail-archive.com/general@incubator.apache.org/msg59362.html).
>
> Deron
>
>
>
>
>
>
>
> On Wed, Apr 26, 2017 at 5:41 PM, <johndam...@apache.org> wrote:
>
>> Dear podling,
>>
>> This email was sent by an automated system on behalf of the Apache
>> Incubator PMC. It is an initial reminder to give you plenty of time to
>> prepare your quarterly board report.
>>
>> The board meeting is scheduled for Wed, 17 May 2017, 10:30 am PDT.
>> The report for your podling will form a part of the Incubator PMC
>> report. The Incubator PMC requires your report to be submitted 2 weeks
>> before the board meeting, to allow sufficient time for review and
>> submission (Wed, May 03).
>>
>> Please submit your report with sufficient time to allow the Incubator
>> PMC, and subsequently board members to review and digest. Again, the
>> very latest you should submit your report is 2 weeks prior to the board
>> meeting.
>>
>> Thanks,
>>
>> The Apache Incubator PMC
>>
>> Submitting your Report
>>
>> --
>>
>> Your report should contain the following:
>>
>> *   Your project name
>> *   A brief description of your project, which assumes no knowledge of
>> the project or necessarily of its field
>> *   A list of the three most important issues to address in the move
>> towards graduation.
>> *   Any issues that the Incubator PMC or ASF Board might wish/need to be
>> aware of
>> *   How has the community developed since the last report
>> *   How has the project developed since the last report.
>> *   How does the podling rate their own maturity.
>>
>> This should be appended to the Incubator Wiki page at:
>>
>> https://wiki.apache.org/incubator/May2017
>>
>> Note: This is manually populated. You may need to wait a little before
>> this page is created from a template.
>>
>> Mentors
>> ---
>>
>> Mentors should review reports for their project(s) and sign them off on
>> the Incubator wiki page. Signing off reports shows that you are
>> following the project - projects that are not signed may raise alarms
>> for the Incubator PMC.
>>
>> Incubator PMC
>>
>
>
>
> --
> Deron Eriksson
> Spark Technology Center
> http://www.spark.tc/
>
>


-- 
Deron Eriksson
Spark Technology Center
http://www.spark.tc/


Re: Podling Report Reminder - May 2017

2017-04-28 Thread Deron Eriksson
Hi,

Would anyone else care to volunteer to create the SystemML podling report?
If there are no volunteers, I will volunteer, but since SystemML is a
community effort, it is good for others to be involved in the process. Note
that podling reports are an important part of the incubation process, as
can be seen from the thread on the general incubator list concerning Sirona
(https://www.mail-archive.com/general@incubator.apache.org/msg59362.html).

Deron







On Wed, Apr 26, 2017 at 5:41 PM, <johndam...@apache.org> wrote:

> Dear podling,
>
> This email was sent by an automated system on behalf of the Apache
> Incubator PMC. It is an initial reminder to give you plenty of time to
> prepare your quarterly board report.
>
> The board meeting is scheduled for Wed, 17 May 2017, 10:30 am PDT.
> The report for your podling will form a part of the Incubator PMC
> report. The Incubator PMC requires your report to be submitted 2 weeks
> before the board meeting, to allow sufficient time for review and
> submission (Wed, May 03).
>
> Please submit your report with sufficient time to allow the Incubator
> PMC, and subsequently board members to review and digest. Again, the
> very latest you should submit your report is 2 weeks prior to the board
> meeting.
>
> Thanks,
>
> The Apache Incubator PMC
>
> Submitting your Report
>
> --
>
> Your report should contain the following:
>
> *   Your project name
> *   A brief description of your project, which assumes no knowledge of
> the project or necessarily of its field
> *   A list of the three most important issues to address in the move
> towards graduation.
> *   Any issues that the Incubator PMC or ASF Board might wish/need to be
> aware of
> *   How has the community developed since the last report
> *   How has the project developed since the last report.
> *   How does the podling rate their own maturity.
>
> This should be appended to the Incubator Wiki page at:
>
> https://wiki.apache.org/incubator/May2017
>
> Note: This is manually populated. You may need to wait a little before
> this page is created from a template.
>
> Mentors
> ---
>
> Mentors should review reports for their project(s) and sign them off on
> the Incubator wiki page. Signing off reports shows that you are
> following the project - projects that are not signed may raise alarms
> for the Incubator PMC.
>
> Incubator PMC
>



-- 
Deron Eriksson
Spark Technology Center
http://www.spark.tc/


Build passed/failed messages for pull requests

2017-04-28 Thread Deron Eriksson
Hi,

When a pull request is created or another commit is pushed to that pull
request, a build including running our test suite is performed (Jenkins at
https://sparktc.ibmcloud.com/jenkins/job/SystemML-PullRequestBuilder/).
This is the same model that other projects such as Apache Spark use
(Jenkins at
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/).

A few days ago, automated build passed/failed pull request messages were
introduced to our pull requests, following the same type of Spark model.
A) SystemML example: https://github.com/apache/incubator-systemml/pull/442
B) Spark example: https://github.com/apache/spark/pull/17765

Personally I like these messages because they automatically tell
contributors the build status of their pull requests and give them a
direct link to the build/test results. An opposing viewpoint would be
that these messages are somewhat like spam.

So we should make a public decision on the mailing list what to do about
these automated build status messages.

Some options:
(1) keep the automated messages exactly as they are
(2) keep the automated messages, but consolidate the two messages into one
(such as "Build successful" and "Refer to this link...").
(3) get rid of the automated messages

I like (2). Any other opinions or options?

Thoughts?

Deron


-- 
Deron Eriksson
Spark Technology Center
http://www.spark.tc/


Re: caffe and org.tensorflow licenses?

2017-04-24 Thread Deron Eriksson
Hi Niketan,

I think it's probably safest to include the tensorflow and caffe licenses
in all the artifacts that contain the generated files (the main jar and all
the artifacts that use the main jar).

Deron


On Mon, Apr 24, 2017 at 1:48 PM, Niketan Pansare <npan...@us.ibm.com> wrote:

> Hi Deron,
>
> I had the same doubt and hence created the PR
> https://github.com/apache/incubator-systemml/pull/467 a few days ago. We can
> continue to discuss this on the mailing list or the PR.
>
> Even though we are not including the caffe/tensorflow project, we are
> depending on their input/output format (i.e. proto files). Unfortunately,
> both caffe and tensorflow don't have maven coordinates for their
> input/output formats as they tend to use C++/Python wrappers created from
> the proto files. Hence, we have to compile their proto files into java
> classes. Personally, I think we should include their licenses into our jar.
>
> TensorFlow => Apache 2.0 license
> Caffe => https://github.com/BVLC/caffe/blob/master/LICENSE
>
> Thanks,
>
> Niketan Pansare
> IBM Almaden Research Center
> E-mail: npansar At us.ibm.com
> http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar
>
>
> From: Deron Eriksson <deroneriks...@gmail.com>
> To: dev@systemml.incubator.apache.org
> Date: 04/24/2017 01:28 PM
> Subject: caffe and org.tensorflow licenses?
> --
>
>
>
> Hi,
>
> I see after a recent commit that the main jar file now gets caffe and
> org.tensorflow packages added to it. This would tend to suggest that our
> main jar needs modifications to its LICENSE file (and any other artifacts
> containing the main jar would also need their LICENSES updated).
>
> However, I see at https://github.com/google/protobuf/blob/master/LICENSE
> that:
> "Code generated by the Protocol Buffer compiler is owned by the owner
> of the input file used when generating it.  This code is not
> standalone and requires a support library to be linked with it.  This
> support library is itself covered by the above license."
>
> So could someone tell me if caffe and tensorflow licenses now need to be
> added to the project?
>
> Even if they are not required, some confusion during artifact validation
> can occur since the project looks like it contains caffe and tensorflow
> classes.
>
> Deron
>
>
> --
> Deron Eriksson
> Spark Technology Center
> http://www.spark.tc/
>
>
>
>


-- 
Deron Eriksson
Spark Technology Center
http://www.spark.tc/


caffe and org.tensorflow licenses?

2017-04-24 Thread Deron Eriksson
Hi,

I see after a recent commit that the main jar file now gets caffe and
org.tensorflow packages added to it. This would tend to suggest that our
main jar needs modifications to its LICENSE file (and any other artifacts
containing the main jar would also need their LICENSES updated).

However, I see at https://github.com/google/protobuf/blob/master/LICENSE
that:
"Code generated by the Protocol Buffer compiler is owned by the owner
of the input file used when generating it.  This code is not
standalone and requires a support library to be linked with it.  This
support library is itself covered by the above license."

So could someone tell me if caffe and tensorflow licenses now need to be
added to the project?

Even if they are not required, some confusion during artifact validation
can occur since the project looks like it contains caffe and tensorflow
classes.

Deron


-- 
Deron Eriksson
Spark Technology Center
http://www.spark.tc/


Re: function default parameters

2017-04-21 Thread Deron Eriksson
BTW, that is assuming our algorithms have been converted to functions.
Deron


On Fri, Apr 21, 2017 at 5:37 PM, Deron Eriksson <deroneriks...@gmail.com>
wrote:

> Thank you Matthias. I strongly agree with your idea about having a default
> specification similar to R WRT the function signatures for default values.
>
> This becomes a significant issue for some of our algorithms, where they
> might take in 10 arguments but default values should typically be used
> for 6+ or 7+ of the arguments.
>
> Deron
>
>
> On Fri, Apr 21, 2017 at 5:25 PM, Matthias Boehm <mboe...@googlemail.com>
> wrote:
>
>> well, for arguments passed into dml scripts there is of course ifdef($b, 2)
>> but for functions there is indeed no good support. At runtime level we
>> still support default parameters for scalar arguments at the tail of the
>> parameter list but I guess at one point the corresponding parser support
>> was discontinued.
>>
>> I personally would like a default specification similar to R in the
>> function signature with the corresponding function calls that bind values
>> to a subset of parameters.
>>
>> Regards,
>> Matthias
>>
>> On Fri, Apr 21, 2017 at 4:18 PM, Deron Eriksson <deroneriks...@gmail.com>
>> wrote:
>>
>> > Is there a way to set default parameter values using DML? I believe both R
>> > and Python offer this capability.
>> >
>> > The only solution I could come up with using DML is to pass in a variable
>> > that is NaN and cast this to a string and use this string in an if
>> > conditional statement.
>> >
>> > addone = function(double b) return (double a) {
>> >   # Workaround: treat a NaN argument as "not provided".
>> >   c = '' + b;
>> >   if (c == 'NaN') {
>> >     b = 2.0;  # fall back to the default value
>> >   }
>> >   a = b + 1;
>> > }
>> >
>> > z=0.0/0.0;
>> > x = addone(z);
>> > print(x);
>> > y = addone(4.0);
>> > print(y);
>> >
>> > Is there a cleaner way to accomplish this, or is DML lacking this R
>> > feature?
>> >
>> > Deron
>> >
>> > --
>> > Deron Eriksson
>> > Spark Technology Center
>> > http://www.spark.tc/
>> >
>>
>
>
>
> --
> Deron Eriksson
> Spark Technology Center
> http://www.spark.tc/
>
>


-- 
Deron Eriksson
Spark Technology Center
http://www.spark.tc/


Re: function default parameters

2017-04-21 Thread Deron Eriksson
Thank you Matthias. I highly agree with your idea about having a default
specification similar to R WRT the function signatures for default values.

This becomes a significant issue for some of our algorithms, where they
might take in 10 arguments but default values should typically be used
for  6+ or 7+ of the arguments.
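
To make the R-style idea concrete, here is a rough sketch in Python rather than DML, since DML has no such syntax today. The function name `train` and every parameter in it are made up for illustration; the only point is that a caller binds a few parameters by name and the rest fall back to the defaults declared in the signature.

```python
def train(X, y, reg=0.001, tol=1e-6, max_iter=100, intercept=True,
          step=1.0, verbose=False, seed=42, fmt="text"):
    """Hypothetical 10-argument algorithm; only X and y are required."""
    # Return the effective settings so the caller can see which values applied.
    return {"reg": reg, "tol": tol, "max_iter": max_iter, "intercept": intercept,
            "step": step, "verbose": verbose, "seed": seed, "fmt": fmt}


# The caller overrides two settings by name; the other six keep their defaults.
settings = train([[1.0]], [1.0], max_iter=20, verbose=True)
print(settings["max_iter"], settings["reg"])
```

With something like this in DML, a script calling an algorithm with 10 arguments would only need to spell out the 3 or 4 values it actually cares about.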

Deron


On Fri, Apr 21, 2017 at 5:25 PM, Matthias Boehm <mboe...@googlemail.com>
wrote:

> well, for arguments passed into dml scripts there is of course ifdef($b, 2)
> but for functions there is indeed no good support. At runtime level we
> still support default parameters for scalar arguments at the tail of the
> parameter list but I guess at one point the corresponding parser support
> was discontinued.
>
> I personally would like a default specification similar to R in the
> function signature with the corresponding function calls that bind values
> to a subset of parameters.
>
> Regards,
> Matthias
>
> On Fri, Apr 21, 2017 at 4:18 PM, Deron Eriksson <deroneriks...@gmail.com>
> wrote:
>
> > Is there a way to set default parameter values using DML? I believe both
> R
> > and Python offer this capability.
> >
> > The only solution I could come up with using DML is to pass in a variable
> > that is NaN and cast this to a string and use this string in an if
> > conditional statement.
> >
> > addone = function(double b) return (double a) {
> > c = ''+b;
> > if (c == 'NaN') {
> > b = 2.0
> > }
> > a = b + 1;
> > }
> >
> > z=0.0/0.0;
> > x = addone(z);
> > print(x);
> > y = addone(4.0);
> > print(y);
> >
> > Is there a cleaner way to accomplish this, or is DML lacking this R
> > feature?
> >
> > Deron
> >
> > --
> > Deron Eriksson
> > Spark Technology Center
> > http://www.spark.tc/
> >
>



-- 
Deron Eriksson
Spark Technology Center
http://www.spark.tc/


function default parameters

2017-04-21 Thread Deron Eriksson
Is there a way to set default parameter values using DML? I believe both R
and Python offer this capability.

The only solution I could come up with using DML is to pass in a variable
that is NaN and cast this to a string and use this string in an if
conditional statement.

addone = function(double b) return (double a) {
c = ''+b;
if (c == 'NaN') {
b = 2.0
}
a = b + 1;
}

z=0.0/0.0;
x = addone(z);
print(x);
y = addone(4.0);
print(y);

Is there a cleaner way to accomplish this, or is DML lacking this R feature?
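
For comparison, this is roughly what I have in mind, sketched in Python (not DML). The first function mirrors the NaN workaround above; the second declares the default in the signature. The function names are only for this example, and NaN is created with float("nan") because 0.0/0.0 raises an error in Python.

```python
import math


def addone_sentinel(b):
    # Mirrors the DML workaround: NaN stands for "argument not supplied".
    if math.isnan(b):
        b = 2.0
    return b + 1


def addone(b=2.0):
    # Default declared directly in the signature; no sentinel needed.
    return b + 1


print(addone_sentinel(float("nan")))  # sentinel replaced by 2.0
print(addone())                       # default 2.0 used
print(addone(4.0))                    # explicit value used
```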

Deron

-- 
Deron Eriksson
Spark Technology Center
http://www.spark.tc/


Re: Branch 0.14 based on SystemML-0.14 RC1 has been deleted

2017-04-13 Thread Deron Eriksson
I'm growing a bit concerned about this strategy of not having a 0.14
branch. Right now it's been 10 days since RC1 was cut, and RC3 is going to
be cut from master. I feel this is starting to affect the project moving
forward. As an example, the Apache Commons CLI PR has still not been merged
into master since it is waiting on the 0.14 validation and voting to be
finished. If it takes another total of 10 days for RC3 to be approved (by
us and the IPMC), it means that during 1/3 of our 2-month release cycle we
can't commit forward-thinking features on master.

We could easily remedy this with a 0.14 branch when RC3 is cut.
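
A quick back-of-the-envelope check of the 1/3 figure, written as a small Python sketch. The 60-day cycle length is my approximation of "2 months", and the second 10 days is the hypothetical approval time mentioned above.

```python
days_since_rc1 = 10        # elapsed since RC1 was cut
days_until_approval = 10   # hypothetical remaining time for RC3 approval
cycle_days = 60            # assumption: a 2-month cycle is about 60 days

frozen_days = days_since_rc1 + days_until_approval
fraction = frozen_days / cycle_days
print(f"{frozen_days} of {cycle_days} days = {fraction:.2f} of the cycle")
```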

Deron




On Thu, Apr 6, 2017 at 11:40 AM, Luciano Resende <luckbr1...@gmail.com>
wrote:

> +1
>
> On Thu, Apr 6, 2017 at 9:03 AM, Arvind Surve <ac...@yahoo.com.invalid>
> wrote:
>
> > Hi,
> > Branch 0.14 based on SystemML-0.14 RC1 has been deleted.
> > Going forward unless there is immediate need for a branch, branch based
> on
> >  Release Candidate (RC) won't be created until RC build gets approved.
> > -Arvind Arvind Surve | Spark Technology Center  | http://www.spark.tc/
>
>
>
>
> --
> Luciano Resende
> http://twitter.com/lresende1975
> http://lresende.blogspot.com/
>



-- 
Deron Eriksson
Spark Technology Center
http://www.spark.tc/


Re: Dropping Java 6 and 7 support

2017-03-07 Thread Deron Eriksson
+1.

Definitely makes sense for SystemML 1.0. In the Spark documentation: "Note
that support for Java 7 is deprecated as of Spark 2.0.0 and may be removed
in Spark 2.2.0."

Deron



On Mon, Mar 6, 2017 at 11:15 PM, Berthold Reinwald <reinw...@us.ibm.com>
wrote:

> +1 on removing java 6 and 7.
>
> Regards,
> Berthold Reinwald
> IBM Almaden Research Center
> office: (408) 927 2208; T/L: 457 2208
> e-mail: reinw...@us.ibm.com
>
>
>
> From:   Matthias Boehm <mboe...@googlemail.com>
> To: dev@systemml.incubator.apache.org
> Date:   03/06/2017 10:58 PM
> Subject:Dropping Java 6 and 7 support
>
>
>
> Hi all,
>
> I'd like to drop the support for Java 6 and 7 in our SystemML 1.0 release.
> Our build still refers to a java compliance level 6, which has not been
> changed for more than 5 years now. Spark >= 1.5 anyway requires Java 7 and
> there has been some discussion on removing Java 7 as well because it
> reached end of life in April 2015. Moving to Java 8 would allow us to
> modernize the code base going forward and the 1.0 release would be the
> perfect time for this change.
>
> Regards,
> Matthias
>
>
>
>
>


-- 
Deron Eriksson
Spark Technology Center
http://www.spark.tc/


Re: Location of the website source code?

2017-03-02 Thread Deron Eriksson
Hi Henry,

On the SystemML website, there is a link in the header under Community to
Source Code. We use Git.

Deron


On Thu, Mar 2, 2017 at 10:04 AM, Arvind Surve <ac...@yahoo.com.invalid>
wrote:

> Hi Henry,
> Please find link provided below.
>
> https://github.com/apache/incubator-systemml-website
>
>
> Link is shown in systemml website drop down as well.
> -Arvind
>
> Arvind Surve | Spark Technology Center  | http://www.spark.tc/
>
>   From: Henry Saputra <henry.sapu...@gmail.com>
>  To: dev@systemml.incubator.apache.org
>  Sent: Thursday, March 2, 2017 8:54 AM
>  Subject: Location of the website source code?
>
> Hi Guys,
>
> I can not find the information about the source code location for the
> SystemML website.
>
> I believe it is in SVN or is it using Git repo?
>
> Thanks,
>
> - Henry
>
>
>
>



-- 
Deron Eriksson
Spark Technology Center
http://www.spark.tc/


Re: incubator-systemml git commit: [maven-release-plugin] prepare for next development iteration

2017-02-24 Thread Deron Eriksson
There has been talk on the general incubator list that SystemML may be
ready to graduate from the Apache incubator.
See:
https://www.mail-archive.com/general@incubator.apache.org/msg58609.html
https://www.mail-archive.com/general@incubator.apache.org/msg58614.html
https://www.mail-archive.com/general@incubator.apache.org/msg58621.html

It might be a great milestone for us in terms of versions to have 1.0.0 be
our first Apache top-level project release. If we release 1.0.0 before
graduation, the version will be 1.0.0-incubating.

If possible, with the help of our mentors, it would be great if we could do
everything we can to graduate before the end of April AND do our 1.0.0
release by the end of April.

Deron



On Fri, Feb 24, 2017 at 10:06 AM, Arvind Surve <ac...@yahoo.com.invalid>
wrote:

> We need to have the next release by the end of April 2017, so we should plan
> on doing RC1 around the first week of April.
> At this point it's uncertain whether we can have the changes for Release 1.0
> ready for the April release. The plan for Release 1.0 is not solidified yet.
> Matthias, I know you sent a list at the beginning of the year and got few
> responses. Let's first finalize the Release 1.0 content and verify whether we
> can have the MUST-fix changes for Release 1.0 ready by the first week of April.
> Once we agree on a tentative Release 1.0 date, we can adjust the pom file.
>
> Arvind Surve | Spark Technology Center  | http://www.spark.tc/
>
>   From: Matthias Boehm <mboe...@googlemail.com>
>  To: dev@systemml.incubator.apache.org
>  Sent: Wednesday, February 22, 2017 8:16 PM
>  Subject: Re: incubator-systemml git commit: [maven-release-plugin]
> prepare for next development iteration
>
> Could we please change the target version to 1.0 instead of 0.14 to make
> clear that master is now open for 1.0 features?
>
> Regards,
> Matthias
>
> On Mon, Feb 20, 2017 at 12:08 PM, <ac...@apache.org> wrote:
>
> > Repository: incubator-systemml
> > Updated Branches:
> >  refs/heads/master 07f26ca4e -> da5879f53
> >
> >
> > [maven-release-plugin] prepare for next development iteration
> >
> >
> > Project: http://git-wip-us.apache.org/repos/asf/incubator-systemml/repo
> > Commit: http://git-wip-us.apache.org/repos/asf/incubator-systemml/co
> > mmit/da5879f5
> > Tree: http://git-wip-us.apache.org/repos/asf/incubator-systemml/tr
> > ee/da5879f5
> > Diff: http://git-wip-us.apache.org/repos/asf/incubator-systemml/di
> > ff/da5879f5
> >
> > Branch: refs/heads/master
> > Commit: da5879f538364dfc9b3365bb6f9e0beaa7344430
> > Parents: 07f26ca
> > Author: Arvind Surve <ac...@yahoo.com>
> > Authored: Mon Feb 20 12:08:53 2017 -0800
> > Committer: Arvind Surve <ac...@yahoo.com>
> > Committed: Mon Feb 20 12:08:53 2017 -0800
> >
> > --
> >  pom.xml | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > --
> >
> >
> > http://git-wip-us.apache.org/repos/asf/incubator-systemml/bl
> > ob/da5879f5/pom.xml
> > --
> > diff --git a/pom.xml b/pom.xml
> > index e9c2665..e044c05 100644
> > --- a/pom.xml
> > +++ b/pom.xml
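> > @@ -25,7 +25,7 @@
> >      <version>18</version>
> >    </parent>
> >    <groupId>org.apache.systemml</groupId>
> > -  <version>0.13.0-incubating</version>
> > +  <version>0.14.0-incubating-SNAPSHOT</version>
> >    <artifactId>systemml</artifactId>
> >    <packaging>jar</packaging>
> >    <name>SystemML</name>
> > @@ -41,7 +41,7 @@
> >    <connection>scm:git:git@github.com:apache/incubator-systemml</connection>
> >    <developerConnection>scm:git:https://git-wip-us.apache.org/repos/asf/incubator-systemml</developerConnection>
> >    <url>https://git-wip-us.apache.org/repos/asf?p=incubator-systemml.git</url>
> > -  <tag>v0.13.0-incubating-rc1</tag>
> > +  <tag>HEAD</tag>
> >    </scm>
> >    <issueManagement>
> >      <system>JIRA</system>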
> >
> >
>
>
>
>



-- 
Deron Eriksson
Spark Technology Center
http://www.spark.tc/


Re: [VOTE] Apache SystemML 0.13.0-incubating (RC2)

2017-02-23 Thread Deron Eriksson
+1

Performed the following validations for artifacts at
https://dist.apache.org/repos/dist/dev/incubator/systemml/0.13.0-incubating-rc2/
:

1. -bin.tgz/-bin.zip contain disclaimer, license, notice
2. -bin.tgz/-bin.zip licenses reference all included dependencies with
correct licenses
3. -bin.tgz/-bin.zip lib/systemml-0.13.0-incubating.jar contains
disclaimer, license, notice
4. -bin.tgz/-bin.zip lib/systemml-0.13.0-incubating.jar contains antlr
runtime and wink classes
5. -bin.tgz/-bin.zip lib/systemml-0.13.0-incubating.jar license references
antlr runtime and wink
6. -python.tgz contains disclaimer, license, notice
7. -python.tgz license references antlr runtime and wink with correct
licenses
8. -python.tgz systemml/systemml-java/systemml-0.13.0-incubating.jar
contains disclaimer, license, notice
9. -python.tgz systemml/systemml-java/systemml-0.13.0-incubating.jar
contains antlr runtime and wink classes
10. -python.tgz systemml/systemml-java/systemml-0.13.0-incubating.jar
license references antlr runtime and wink
11. -src.tgz/-src.zip contain disclaimer, license, notice
12. -src.tgz/-src.zip licenses reference all included projects (jquery,
etc) with correct licenses
13. -src.tgz/-src.zip contain no binaries (dll, exe, pdb, lib)
14. -src.tgz/-src.zip build project artifacts (mvn clean package -P
distribution)
15. -src.tgz/-src.zip SystemML jar runs (hello world)
16. -src.tgz/-src.zip test suite runs (mvn verify)
17. -bin.tgz/-bin.zip runStandaloneSystemML.sh (hello world)
18. -bin.tgz/-bin.zip lib/systemml-0.13.0-incubating.jar spark-submit 2.0.2
(hello world)
19. -bin.tgz/-bin.zip lib/systemml-0.13.0-incubating.jar spark-submit 2.1.0
(hello world)
20. -bin.tgz/-bin.zip lib/systemml-0.13.0-incubating.jar hadoop 2.7 (hello
world)
21. -bin.tgz/-bin.zip runStandaloneSystemML.sh (univar stats, haberman data)
22. -bin.tgz/-bin.zip lib/systemml-0.13.0-incubating.jar spark-submit 2.0.2
(univar stats, generated data)
23. -bin.tgz/-bin.zip lib/systemml-0.13.0-incubating.jar spark-submit 2.1.0
(univar stats, generated data)
24. -bin.tgz/-bin.zip lib/systemml-0.13.0-incubating.jar hadoop 2.7 default
exec mode (univar stats, generated data)
25. -bin.tgz/-bin.zip lib/systemml-0.13.0-incubating.jar hadoop 2.7 hadoop
exec mode (univar stats, generated data)
26. -bin.tgz/-bin.zip lib/systemml-0.13.0-incubating.jar MLContext
spark-shell 2.0.2 (univar stats, haberman data)
27. -bin.tgz/-bin.zip lib/systemml-0.13.0-incubating.jar MLContext
spark-shell 2.1.0 (univar stats, haberman data)



On Wed, Feb 22, 2017 at 7:23 PM, Arvind Surve <ac...@yahoo.com.invalid>
wrote:

> Please vote on releasing the following candidate as Apache SystemML
> version 0.13.0-incubating !
>
> The vote is open for at least 72 hours and passes if a majority of at
> least 3 +1 PMC votes are cast.
>
> [ ] +1 Release this package as Apache SystemML 0.13.0-incubating
> [ ] -1 Do not release this package because ...
>
> To learn more about Apache SystemML, please see http://systemml.apache.org/
>
> The tag to be voted on is v0.13.0-incubating-rc2 (
> ff3e741694e507f64a6b52ee71638bddecabe7af)
>
> https://github.com/apache/incubator-systemml/commit/ff3e741694e507f64a6b52ee71638bddecabe7af
>
> The release artifacts can be found at :
> https://dist.apache.org/repos/dist/dev/incubator/systemml/0.13.0-incubating-rc2/
>
> The maven release artifacts, including signatures, digests, etc. can
> be found at:
>
> https://repository.apache.org/content/repositories/orgapachesystemml-1017/org/apache/systemml/systemml/0.13.0-incubating/
>
> =
> == Apache Incubator release policy ==
> =
> Please find below the guide to release management during incubation:
> http://incubator.apache.org/guides/releasemanagement.html
>
> ===
> == How can I help test this release? ==
> ===
> If you are a SystemML user, you can help us test this release by taking
> an existing Algorithm or workload and running on this release candidate,
> then
> reporting any regressions.
>
> 
> == What justifies a -1 vote for this release? ==
> 
> -1 votes should only occur for significant stop-ship bugs or legal
> related issues (e.g. wrong license, missing header files, etc). Minor bugs
> or regressions should not block this release.
>  -Arvind Arvind Surve | Spark Technology Center  | http://www.spark.tc/




-- 
Deron Eriksson
Spark Technology Center
http://www.spark.tc/


Re: Minimum required Spark version

2017-02-21 Thread Deron Eriksson
Note that MLContext has been updated to log a warning rather than throw an
exception to the user for Spark versions previous to 2.1.0.
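
For anyone curious about the shape of the check, here is a minimal sketch in Python. This is not the actual MLContext code (which is Java); the function names, logger name, and message text are invented for illustration. It only shows the "warn instead of throw" behavior for versions below 2.1.0.

```python
import logging

logger = logging.getLogger("mlcontext_sketch")  # hypothetical logger name
MIN_SPARK_VERSION = (2, 1, 0)


def parse_version(version):
    """Turn '2.0.2' into (2, 0, 2); a suffix such as '-SNAPSHOT' is dropped."""
    parts = [int(piece) for piece in version.split("-")[0].split(".")]
    while len(parts) < 3:
        parts.append(0)  # treat '2.1' as '2.1.0'
    return tuple(parts[:3])


def check_spark_version(version):
    """Return True if the version is supported; log a warning (never raise) otherwise."""
    supported = parse_version(version) >= MIN_SPARK_VERSION
    if not supported:
        logger.warning("Spark %s is older than the recommended minimum 2.1.0", version)
    return supported
```

Calling check_spark_version("2.0.2") returns False and logs a warning, while the caller is free to continue running.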

Deron

On Mon, Feb 20, 2017 at 2:29 PM, Matthias Boehm1 <matthias.boe...@ibm.com>
wrote:

> that's a good catch Felix! I would recommend to cast this exception to a
> warning and move it to a central place like SparkExecutionContext to ensure
> consistency across all APIs and deployments.
>
> Regards,
> Matthias
>
>
>
> From: Deron Eriksson <deroneriks...@gmail.com>
> To: dev@systemml.incubator.apache.org
> Date: 02/20/2017 02:14 PM
> Subject: Re: Minimum required Spark version
> --
>
>
>
> Hi Felix,
>
> I agree that the 2.1 hard requirement is a bit restrictive. If someone can
> validate that Spark versions less than 2.1 and greater than 2.0.* work,
> this seems like a great idea to me.
>
> Deron
>
>
> On Mon, Feb 20, 2017 at 1:43 PM, <fschue...@posteo.de> wrote:
>
> > Hi,
> >
> > the current master and 0.13 release have a hard requirement in MLContext
> > for Spark 2.1. Is this really necessary or could we set it to >= 2.0?
> Only
> > supporting the latest Spark release seems a little restrictive to me.
> >
> >
> > -Felix
> >
>
>
>
> --
> Deron Eriksson
> Spark Technology Center
> http://www.spark.tc/
>
>
>
>


-- 
Deron Eriksson
Spark Technology Center
http://www.spark.tc/


Re: Minimum required Spark version

2017-02-20 Thread Deron Eriksson
Hi Felix,

I agree that the 2.1 hard requirement is a bit restrictive. If someone can
validate that Spark versions less than 2.1 and greater than 2.0.* work,
this seems like a great idea to me.

Deron


On Mon, Feb 20, 2017 at 1:43 PM, <fschue...@posteo.de> wrote:

> Hi,
>
> the current master and 0.13 release have a hard requirement in MLContext
> for Spark 2.1. Is this really necessary or could we set it to >= 2.0? Only
> supporting the latest Spark release seems a little restrictive to me.
>
>
> -Felix
>



-- 
Deron Eriksson
Spark Technology Center
http://www.spark.tc/


Re: Proposal to add 'accuracy test suite' before 1.0 release

2017-02-17 Thread Deron Eriksson
ng the new algorithm will greatly
> > > improve the production-readiness of SystemML as well as serve as a
> usage
> > > guide too. This implies we run both the performance as well as accuracy
> > > test suite before our release. Alternative is to replace simplified
> > > algorithms with our released algorithms.
> > >
> > > Advantages of accuracy test suite approach:
> > > 1. No increase the running time of integration tests on Jenkins.
> > > 2. Accuracy test suite could use much larger datasets.
> > > 3. Accuracy test suite could include algorithms that take longer to
> > > converge (for example: Deep Learning algorithms).
> > >
> > > Advantage of replacing simplified algorithms:
> > > 1. No commit breaks any of the existing algorithms.
> > >
> > > Thanks,
> > >
> > > Niketan Pansare
> > > IBM Almaden Research Center
> > > E-mail: npansar At us.ibm.com
> > > http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar
> > >
> >
> >
> >
> >
>



-- 
Deron Eriksson
Spark Technology Center
http://www.spark.tc/


Re: Remove documentation for old MLContext API

2017-02-07 Thread Deron Eriksson
Felix, thank you for removing the old docs (
https://github.com/apache/incubator-systemml/pull/377). I agree that the
old documentation was making things confusing.

I have created a PR to deprecate the old API (
https://github.com/apache/incubator-systemml/pull/378).

Deron


On Mon, Feb 6, 2017 at 11:31 PM, Berthold Reinwald <reinw...@us.ibm.com>
wrote:

> +1
>
>
> Regards,
> Berthold Reinwald
> IBM Almaden Research Center
> office: (408) 927 2208; T/L: 457 2208
> e-mail: reinw...@us.ibm.com
>
>
>
> From:   dusenberr...@gmail.com
> To: dev@systemml.incubator.apache.org
> Date:   02/02/2017 03:56 PM
> Subject:Re: Remove documentation for old MLContext API
>
>
>
> +1 for removing that old documentation.
>
> --
>
> Mike Dusenberry
> GitHub: github.com/dusenberrymw
> LinkedIn: linkedin.com/in/mikedusenberry
>
> Sent from my iPhone.
>
>
> > On Feb 2, 2017, at 3:54 PM, fschue...@posteo.de wrote:
> >
> > As a step to deprecate the old MLContext API, I suggest to remove its
> documentation for the next release (together with a deprecation of the
> actual API so that we can remove it in 1.0).
> >
> > Currently the section about the old API is placed in between up-to-date
> documentation and makes it pretty confusing to see what is old and what is
> new.
> >
> > Any objections? Alternatively we could put it all the way to the end or
> in a separate document.
> >
> > -Felix
>
>
>
>
>


-- 
Deron Eriksson
Spark Technology Center
http://www.spark.tc/


0.12.0 Votes

2017-02-06 Thread Deron Eriksson
Hi,

I see we are up to two +1 votes (Justin and Sergio) already on the general
incubator list! I've updated our NOTICE files on master to reflect the
non-blocking 2017 updates mentioned by Justin and Sergio.

Would any of our active mentors (Luciano, Henry, Reynold) be available to
review the 0.12.0 release?

Thanks!
Deron

-- 
Deron Eriksson
Spark Technology Center
http://www.spark.tc/


Pull Request Reviews

2017-02-03 Thread Deron Eriksson
Hi,

Reviewing pull requests is a great way to contribute to the success of
SystemML. If you are involved in any way with SystemML, please consider
reviewing pull requests. Everyone can review pull requests, and it is a
great way to gain experience with the project.

Thanks!
Deron


Username PRs Reviewed
mboehm7 134
dusenberrymw 112
deroneriksson 110
niketanpansare 40
gweidner 31
shirisht 26
akchinSTC 25
nakul02 23
bertholdreinwald 15
lresende 12
frreiss 12
fschueler 9
Wenpei 7
asurve 5
iyounus 4
MechCoder 3
MadisonJMyers 3
oza 2
fmakari 2
rightwaitforyou 2
ethanyxu 1
ckadner 1
petro-rudenko 1
hsaputra 1
FelixNeutatz 1
nishi-t 0
sandeep-n 0
romeokienzler 0
tgamal 0
taasawat 0
sourav-mazumder 0
kevin-bates 0
kakal 0
GrapeBaBa 0
objectadjective 0
nmanchev 0
jodersky 0
jdyer1 0
gmlewis 0
aloknsingh 0
akunft 0
ahmaurya 0


-- 
Deron Eriksson
Spark Technology Center
http://www.spark.tc/


Re: [DISCUSS] Enable Python Tests on Jenkins

2017-02-03 Thread Deron Eriksson
+1 for enabling the Python tests in the test suite.

Since we use multiple languages and it's not always easy to catch when Java
changes break Python that depends on it, I think this would be extremely
valuable.
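
One way to soften the hard Spark dependency on dev machines would be to skip the Python tests when SPARK_HOME is missing instead of failing. The following is only a sketch of that idea using the standard unittest module; the helper and test class names are made up, and it is not how PythonTestRunner or the RUN_PYTHON_TEST flag currently work.

```python
import os
import unittest


def spark_home_configured(env=None):
    """True when SPARK_HOME is set to a non-empty value."""
    env = os.environ if env is None else env
    return bool(env.get("SPARK_HOME"))


class PythonWrapperTest(unittest.TestCase):
    # Skipped (not failed) on machines without a Spark installation.
    @unittest.skipUnless(spark_home_configured(), "SPARK_HOME is not set")
    def test_wrapper_smoke(self):
        # Placeholder assertion; a real test would exercise the Python wrappers.
        self.assertTrue(os.environ["SPARK_HOME"])
```

On Jenkins, where SPARK_HOME would be set, the test runs; elsewhere it shows up as skipped in the report.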

Deron


On Fri, Feb 3, 2017 at 11:49 AM, Niketan Pansare <npan...@us.ibm.com> wrote:

>
>
> Hi all,
>
> In our master branch, we have a unit test
> org.apache.sysml.test.integration.functions.python.PythonTestRunner, that
> tests our python wrappers. However, we have explicitly disabled it via a
> flag RUN_PYTHON_TEST in that class. This is because the python tests have a
> hard dependency on Spark installation. This is why Jenkins DOES NOT test
> any of our Python wrappers and new PRs could potentially break the Python
> wrappers. I wanted to know if anyone has objection to enable the Python
> tests by default.
>
> Please note, we will have to download the appropriate Spark version on
> Jenkins (or on any dev machine which runs the integration tests) and set
> SPARK_HOME environment variable.
>
> Thanks,
>
> Niketan Pansare
> IBM Almaden Research Center
> E-mail: npansar At us.ibm.com
> http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar
>



-- 
Deron Eriksson
Spark Technology Center
http://www.spark.tc/


February Podling Report

2017-01-31 Thread Deron Eriksson
Hi,

I posted our SystemML podling report for February to:
https://wiki.apache.org/incubator/February2017

Please feel free to make any additions or modifications, such as individual
efforts to help build our project community. If you don't have write access
to the wiki, please request write access or ask Mike, Luciano, or me to
make any additions or modifications.

Thanks,
Deron

-- 
Deron Eriksson
Spark Technology Center
http://www.spark.tc/


Re: SystemML Branch for any fixes related to Spark 1.6x

2017-01-13 Thread Deron Eriksson
Thank you for creating the branch Luciano. I agree with Mike's suggestion
that "branch-0.12" might be clearer.

Deron


On Fri, Jan 13, 2017 at 1:55 PM, <dusenberr...@gmail.com> wrote:

> Thanks, Luciano for creating the branch. Could we rename it to
> "branch-0.12" to better reflect that any changes that are added would only
> apply to future bug fix releases on the 0.12.x line?  This would be more in
> line with the naming scheme that Spark uses for its branches, and should
> cause less confusion.
>
> --
>
> Mike Dusenberry
> GitHub: github.com/dusenberrymw
> LinkedIn: linkedin.com/in/mikedusenberry
>
> Sent from my iPhone.
>
>
> > On Jan 13, 2017, at 1:50 PM, Luciano Resende <luckbr1...@gmail.com>
> wrote:
> >
> > We have created the following branch to track Spark 1.6 fixes :
> > origin/branch-systemml-spark-1.6
> >
> > Note that fixes that go into master and also affect 1.6 should be
> > cherry-picked to the 1.6 branch as well.
> >
> > As for checking out, you will need to do something like the steps below
> > (your preference might change some steps)
> >
> > git checkout -b branch-systemml-spark-1.6 origin/branch-systemml-spark-1.6
> > git branch --set-upstream-to origin/branch-systemml-spark-1.6 branch-systemml-spark-1.6
> >
> > this last one is like:
> >
> > git branch --set-upstream-to origin/my_remote_branch my_local_branch
> >
> > For creating dev branches for 1.6, first go to your local 1.6 branch and
> > continue with your regular steps such as git checkout -b JIRA-222
> >
> > And good luck !!!
> >
> > --
> > Luciano Resende
> > http://twitter.com/lresende1975
> > http://lresende.blogspot.com/
>



-- 
Deron Eriksson
Spark Technology Center
http://www.spark.tc/


Re: Maven build failing

2017-01-06 Thread Deron Eriksson
Hi Sandeep,

On first guess, the error looks like an issue with the maven central
repository being down. If this happens to be the case, one possible
solution is to temporarily add a central repo mirror to your
.m2/settings.xml file, such as:


<mirror>
  <id>UK</id>
  <name>UK Central</name>
  <url>http://uk.maven.org/maven2</url>
  <mirrorOf>central</mirrorOf>
</mirror>


Deron


On Fri, Jan 6, 2017 at 5:26 PM, <fschue...@posteo.de> wrote:

> Hi Sandeep,
>
> it seems like you can't connect to the maven repository. Can you open the
> link to the repository in a browser? (https://repo1.maven.org/maven2/)
> Are you behind a proxy maybe?
>
> If nothing else, you can download a binary release from our website to get
> started working with SystemML: http://systemml.apache.org/download.html
> The beginners guide (http://systemml.apache.org/get-started) has another
> way of getting started if you don't need to build the source yourself.
>
> - Felix
>
>
> Am 07.01.2017 02:09 schrieb Narayanaswami, Sandeep:
>
>> Dear SystemML community,
>>
>> I attempted the following (as per the Beginners' Guide for Python
>> users<https://apache.github.io/incubator-systemml/beginners-
>> guide-python.html>):
>>
>> git clone https://github.com/apache/incubator-systemml.git
>> cd incubator-systemml
>> mvn clean package -P distribution
>>
>> This resulted in:
>>
>> [INFO] Scanning for projects...
>> Downloading: https://repo1.maven.org/maven2/org/apache/apache/18/apache-
>> 18.pom
>> Downloading:
>> https://raw.github.com/niketanpansare/mavenized-jcuda/mvn-
>> repo/org/apache/apache/18/apache-18.pom
>> [ERROR] [ERROR] Some problems were encountered while processing the POMs:
>> [FATAL] Non-resolvable parent POM for
>> org.apache.systemml:systemml:0.12.0-incubating-SNAPSHOT: Could not
>> transfer artifact org.apache:apache:pom:18 from/to central
>> (https://repo1.maven.org/maven2): Connect to repo1.maven.org:443
>> [repo1.maven.org/151.101.32.209] failed: Operation timed out
>> (Connection timed out) and 'parent.relativePath' points at wrong local
>> POM @ line 22, column 10
>> @
>> [ERROR] The build could not read 1 project -> [Help 1]
>> [ERROR]
>> [ERROR]   The project
>> org.apache.systemml:systemml:0.12.0-incubating-SNAPSHOT
>> (/Users/ioz814/Projects/oss/incubator-systemml/pom.xml) has 1 error
>> [ERROR] Non-resolvable parent POM for
>> org.apache.systemml:systemml:0.12.0-incubating-SNAPSHOT: Could not
>> transfer artifact org.apache:apache:pom:18 from/to central
>> (https://repo1.maven.org/maven2): Connect to repo1.maven.org:443
>> [repo1.maven.org/151.101.32.209] failed: Operation timed out
>> (Connection timed out) and 'parent.relativePath' points at wrong local
>> POM @ line 22, column 10 -> [Help 2]
>> [ERROR]
>> [ERROR] To see the full stack trace of the errors, re-run Maven with
>> the -e switch.
>> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
>> [ERROR]
>> [ERROR] For more information about the errors and possible solutions,
>> please read the following articles:
>> [ERROR] [Help 1]
>> http://cwiki.apache.org/confluence/display/MAVEN/ProjectBuildingException
>> [ERROR] [Help 2]
>> http://cwiki.apache.org/confluence/display/MAVEN/Unresolvabl
>> eModelException
>>
>> I’d appreciate any help with resolving this issue.
>>
>> Thanks,
>> Sandeep
>> 
>>
>> The information contained in this e-mail is confidential and/or
>> proprietary to Capital One and/or its affiliates and may only be used
>> solely in performance of work or services for Capital One. The
>> information transmitted herewith is intended only for use by the
>> individual or entity to which it is addressed. If the reader of this
>> message is not the intended recipient, you are hereby notified that
>> any review, retransmission, dissemination, distribution, copying or
>> other use of, or taking of any action in reliance upon this
>> information is strictly prohibited. If you have received this
>> communication in error, please contact the sender and delete the
>> material from your computer.
>>
>


-- 
Deron Eriksson
Spark Technology Center
http://www.spark.tc/


Re: GSoc 2017

2017-01-06 Thread Deron Eriksson
Hi Krishna,

Welcome! As a starter, you may want to watch some of the SystemML videos on
YouTube, such as those on the Spark Technology Center channel:

https://www.youtube.com/channel/UC8-XGglzfn5fLvsQaxKrCuA/videos

We have a 'Contributing to SystemML' page that contains some information
that can also help you get started.

http://apache.github.io/incubator-systemml/contributing-to-systemml.html

If you find a particular area of SystemML that you are interested in
contributing to (algorithms, APIs, core functionality, documentation, etc),
please feel free to ask on this email list, and I'm sure someone with
knowledge in that area can help you get started.

Deron



On Thu, Jan 5, 2017 at 3:15 PM, Krishna Kalyan <krishnakaly...@gmail.com>
wrote:

> Hello Developers,
> I am Krishna, currently a 2nd-year Master's student (MSc in Data Mining)
> in Barcelona, studying at Université Polytechnique de Catalogne.
> I am interested in contributing to SystemML this year under the GSoC program.
> Could anyone please guide me on how to go about it? (I understand that I need
> to write a proposal.)
>
> Related Experience:
> My masters is mostly focussed on data mining techniques. Before my masters,
> I was a  data engineer with IBM (India). I was responsible for managing 50
> node Hadoop Cluster for more than a year. Most of my time was spent
> optimising and writing ETL (Apache Pig) jobs.
>
> I am the most comfortable with Python followed by R and Scala.
>
> My Webpage
> kkalyan.in
>
> My Spark Pull Requests
> https://github.com/apache/spark/pulls?utf8=%E2%9C%93=is%3Apr%20author%
> 3Akrishnakalyan3%20
>
> Thank you so much,
> Krishna
>



-- 
Deron Eriksson
Spark Technology Center
http://www.spark.tc/


Re: Release cadence

2017-01-05 Thread Deron Eriksson
+1 for trying out a 1 month release cycle.

However, I highly agree with Matthias that there is a lot of overhead with
releases, so it would be good if we can work to streamline/automate the
process as much as possible. Also, it would be good to distribute the tasks
around as much as possible. This can result in cross-training and help
avoid overburdening the same contributors each month.

If the overhead slows us down too much, then we can go to a slower release
cycle.

Deron




On Thu, Jan 5, 2017 at 1:50 PM, <dusenberr...@gmail.com> wrote:

> +1 for adopting a 1 month release cycle.
>
> --
>
> Mike Dusenberry
> GitHub: github.com/dusenberrymw
> LinkedIn: linkedin.com/in/mikedusenberry
>
> Sent from my iPhone.
>
>
> > On Jan 5, 2017, at 1:35 PM, Luciano Resende <luckbr1...@gmail.com>
> wrote:
> >
> > On Thu, Jan 5, 2017 at 6:05 AM, Matthias Boehm <mboe...@googlemail.com>
> > wrote:
> >
> >> In general, I like the idea of aiming for consistent release cycles.
> >> However, every month is just too much, at least for me. There is a
> >> considerable overhead associated with each release for end-to-end
> >> performance tests, tests on different environments, code freeze for new
> >> features, etc. Hence, a too short release cycle would not be "agile" but
> >> would actually slow us down. From my perspective, a realistic release
> >> cadence would be 2-3 months, maybe a bit more for major releases.
> >>
> >>
> > 2-3 months of release cadence for an open source is probably a long
> > stretch, particular for a project that does not have very large set of
> 3rd
> > party dependencies.
> >
> > As for some of the overhead issues you mentioned, they are probably easy
> to
> > workaround:
> >
> > - code-freeze timeframe can be resolved with branches
> > - end-to-end performance regressions can be avoided by better code
> > review, and if you were willing to go 2-3 months without performing these
> > tests, we could perform them only for major releases, and proactively
> > build a minor release with the patch when a user reports any
> > performance regression.
> >
> >
> > Anyway, I would really like to see SystemML more agile with regard to
> > its release process because, as I mentioned before, the "release early,
> > release often" mantra is good to increase community interest, generate
> > more traffic to the list as developers discuss the roadmap and release
> > blockers, and also enable users to provide feedback sooner on the areas
> > we are developing.
> >
> >
> >
> > --
> > Luciano Resende
> > http://twitter.com/lresende1975
> > http://lresende.blogspot.com/
>



-- 
Deron Eriksson
Spark Technology Center
http://www.spark.tc/


Re: test suite running slowly after disable cache/sparse commit?

2016-12-08 Thread Deron Eriksson
Hi Fred,

The last two daily tests ran around ~2:56 hr, so if this number is stable,
it seems that the new tests potentially add about half an hour to the test
suite time. I would like it if we could decrease the test suite time rather
than add significantly to it. In fact, personally I'd prefer if we could do
something like move the time-consuming algorithm-type tests out of the main
test suite and just run the algorithm tests daily (if this is technically
possible). That way, we could get the main test suite time to be sped up
significantly but still benefit from daily test coverage provided by the
algorithm tests. I like the idea of a short test suite time since that
makes it easier to get feedback and continue working on an issue that day.
If the tests take too long to run, it means that issues that could
potentially be solved in one day will get pushed out to another day.

Increasing the number of simultaneous Jenkins jobs allowed could help with
queued-up builds, which would be nice. Currently Jenkins runs a max of two
simultaneous jobs. Jenkins currently handles:
1) two daily builds (at noon and at midnight)
2) on-demand builds (so a developer can commit some code on a branch and
then have Jenkins build/test so that the developer's machine isn't tied up)
3) pull request builds (the initial push with a PR will trigger this along
with any subsequent pushes to the branch referenced by the PR).

Today there is not a queue, but I'm the only person to trigger a PR build
today. If more than two developers are submitting PRs that day, there will
be a queue. This queue has been manageable, but if the increase in test
suite time is a permanent thing, I'd recommend bumping the simultaneous
Jenkins jobs from two to four.
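
As a rough back-of-the-envelope sketch of why the number of simultaneous jobs
matters, the Python snippet below estimates how long it takes until the last
of several queued builds finishes. It assumes (my simplification, not a
measurement) that all builds are submitted at the same moment and that each
one takes a full test suite run of about 3 hours. The function name is
purely illustrative.

```python
import math


def hours_until_last_build_finishes(builds, executors, suite_hours):
    """Builds run in waves of `executors`; each wave takes one full suite run."""
    waves = math.ceil(builds / executors)
    return waves * suite_hours


# Assumed suite duration: ~3 hours, rounded from the ~2:56 hr daily runs.
SUITE_HOURS = 3.0

for executors in (2, 4):
    total = hours_until_last_build_finishes(4, executors, SUITE_HOURS)
    print(f"{executors} simultaneous jobs, 4 queued builds: {total} hours")
```

Under these assumptions, four PR builds submitted together take two waves
with two simultaneous jobs but only one wave with four.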

Deron



On Thu, Dec 8, 2016 at 4:49 PM, Frederick R Reiss <frre...@us.ibm.com>
wrote:

> +dev list
>
> I personally don't mind letting the regression suite run overnight. The
> important thing is that we do not push changes that have not passed the
> full automated test suite. In the interest of efficiency, we shouldn't even
> be reviewing most PRs until after they have passed the automated tests.
>
> Deron, are you seeing a backlog of not-yet-started builds queueing up on
> the PR build server? If the queue is getting long, we can add additional
> machines to the Jenkins cluster.
>
> Fred
>
> From: Deron Eriksson/San Francisco/IBM
> To: Niketan Pansare/Almaden/IBM@IBMUS
> Cc: Berthold Reinwald/Almaden/IBM@IBMUS, Frederick R
> Reiss/Almaden/IBM@IBMUS
> Date: 12/08/2016 11:06 AM
> Subject: Re: test suite running slowly after disable cache/sparse commit?
> --
>
>
>
> Hi Niketan,
>
> Perhaps Berthold or Fred could add a little guidance here in terms of what
> is acceptable? Having the test suite go from 2:21 to 3:41 (one pull request
> yesterday took 4:11 to complete -
> https://sparktc.ibmcloud.com/jenkins/job/SystemML-PullRequestBuilder/909/)
> is very serious to me. Even if the test suite runs at 3:00, this is a
> serious slowdown. It slows down our ability to validate pull requests and
> other code on jenkins.
>
> Deron
>
>
> - Original message -
> From: Niketan Pansare/Almaden/IBM
> To: Deron Eriksson/San Francisco/IBM@ibmus
> Cc: Berthold Reinwald/Almaden/IBM@ibmus, Frederick R
> Reiss/Almaden/IBM@ibmus
> Subject: Re: test suite running slowly after disable cache/sparse commit?
> Date: Thu, Dec 8, 2016 8:55 AM
>
> Hi Deron,
>
> The commit replicated application tests for disable sparse and disable
> caching. So, the test time should increase. We should increase the duration
> or reduce the number of application tests we want to test with caching and
> sparse disabled.
>
> Thanks
>
> Niketan
>
> On Dec 8, 2016, at 7:47 AM, Deron Eriksson <de...@us.ibm.com> wrote:
>
>Hi Niketan,
>
>   I noticed the daily test yesterday timed out, probably because of a
>   long-running test.
>
>   Looking at the commits from the day before
>   (https://github.com/apache/incubator-systemml/commits/master), I
>   noticed that [SYSTEMML-769] [SYSTEMML-1140] Removed -disable-caching and
>   -disable-… (
>   https://github.com/apache/incubator-systemml/commit/caaaec90b61e529e50021d89f9f108230fa307a8)
>   updated some of the tests.
>
>   So I ran 

Re: Release process document for SystemML

2016-11-18 Thread Deron Eriksson
Hi Arvind,

Thank you for bringing this up. Just wanted to let you know we have
https://issues.apache.org/jira/browse/SYSTEMML-848 to address this. I would
be happy to work with Luciano in the creation of such documentation. Since
Luciano has had to do all of our release deployments by himself, I would
also be happy to be more involved in our 1.0 release. It's a great
opportunity to gain more first-hand experience with the Apache incubator
process.

I will be on vacation next week, but I can help out after that.

Deron



On Fri, Nov 18, 2016 at 12:11 PM, Acs S  wrote:

> Hi Luciano,
> Do we have a document describing the SystemML release process from end to
> end (starting from build, tagging RC, and publishing images)? Once the Pip
> Install artifact issue gets resolved, we want to create another release
> based on SystemML 0.11 to add the Pip Install artifact, possibly next week.
>
> Thanks,
> Arvind


Enabled doclint javadoc checking

2016-11-18 Thread Deron Eriksson
Hi,

I have enabled javadoc doclint checking. Previously the default Java 8
doclint checking was turned off by a profile because of javadoc errors in
the project. Since these errors have been fixed, we can now turn on the
default Java 8 doclint behavior. Javadoc errors will cause the build to
fail, so please review your javadocs before committing.

FYI, doclint checking can still be easily turned off using
the ignore-doclint profile:
mvn clean package -P distribution,ignore-doclint

Deron


Re: Release artifacts

2016-11-07 Thread Deron Eriksson
Hi Luciano,

Here is my understanding of the release artifacts at a very high level.

On Sun, Nov 6, 2016 at 6:10 PM, Luciano Resende 
wrote:

> This has been causing confusion with IPMC, and I want to document my
> understanding of the release artifacts:
>
> Source Distribution
> systemml-0.11.0-incubating-src.tar.gz
> systemml-0.11.0-incubating-src.tar.gz.asc
> systemml-0.11.0-incubating-src.tar.gz.md5
> systemml-0.11.0-incubating-src.zip
> systemml-0.11.0-incubating-src.zip.asc
> systemml-0.11.0-incubating-src.zip.md5
>
>
Yes, this is correct. These artifacts are the source distributions (project
can be built and tested using these distributions).



> Binary Distribution
> systemml-0.11.0-incubating.tar.gz
> systemml-0.11.0-incubating.tar.gz.asc
> systemml-0.11.0-incubating.tar.gz.md5
> systemml-0.11.0-incubating.zip
> systemml-0.11.0-incubating.zip.asc
> systemml-0.11.0-incubating.zip.md5
>
>
These artifacts package the SystemML jar file and the DML script files
together.



> Standalone Distribution
> systemml-0.11.0-incubating-standalone.tar.gz
> systemml-0.11.0-incubating-standalone.tar.gz.asc
> systemml-0.11.0-incubating-standalone.tar.gz.md5
> systemml-0.11.0-incubating-standalone.zip
> systemml-0.11.0-incubating-standalone.zip.asc
> systemml-0.11.0-incubating-standalone.zip.md5
>
>
These artifacts package the SystemML jar file, the DML script files, and
the jars required to run SystemML in standalone mode.



> SystemML convenience jar
> systemml-0.11.0-incubating-javadoc.jar
> systemml-0.11.0-incubating-javadoc.jar.asc
> systemml-0.11.0-incubating-javadoc.jar.md5
> systemml-0.11.0-incubating-sources.jar
> systemml-0.11.0-incubating-sources.jar.asc
> systemml-0.11.0-incubating-sources.jar.md5
> systemml-0.11.0-incubating.jar
> systemml-0.11.0-incubating.jar.asc
> systemml-0.11.0-incubating.jar.md5
>
>
This is the main SystemML jar (systemml-0.11.0-incubating.jar). It can be
used to run SystemML:
  1) On Spark (spark-submit systemml-0.11.0-incubating.jar -f test.dml)
  2) On Hadoop (hadoop jar systemml-0.11.0-incubating.jar -f test.dml)
  3) As a library

The -sources.jar and -javadoc.jar are maven 'standards' to supply
accompanying source code and javadocs to the main jar file.
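
For anyone who wants to check a downloaded artifact against its .md5 file,
here is a minimal Python sketch. It assumes the .md5 file holds the hex
digest as its first whitespace-separated token (I have not checked the exact
layout of our .md5 files, so treat that as an assumption), and the function
names are made up for illustration. It does not cover the .asc signatures,
which need GPG.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_md5_file(artifact, md5_file):
    """Compare an artifact's digest with the one stored in its .md5 file.

    Assumption: the digest is the first whitespace-separated token.
    """
    expected = Path(md5_file).read_text().split()[0].lower()
    return md5_of(artifact) == expected


# Example call (paths are placeholders for a locally downloaded artifact):
# matches_md5_file("systemml-0.11.0-incubating.jar",
#                  "systemml-0.11.0-incubating.jar.md5")
```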

Deron


Re: [DRAFT] November monthly report

2016-11-02 Thread Deron Eriksson
Thank you for reviewing, Luciano.


On Wed, Nov 2, 2016 at 5:12 PM, Luciano Resende <luckbr1...@gmail.com>
wrote:

> +1, Reviewed on the wiki as a mentor
>
> On Wed, Nov 2, 2016 at 8:45 AM, Deron Eriksson <deroneriks...@gmail.com>
> wrote:
>
> > Thank you for the feedback Mike. I added the VLDB paper award.
> >
> > I added the monthly report to the Apache incubator wiki at
> > https://wiki.apache.org/incubator/November2016.
> >
> > Deron
>
>
>
> --
> Luciano Resende
> http://twitter.com/lresende1975
> http://lresende.blogspot.com/
>


Generating podling monthly report stats

2016-11-02 Thread Deron Eriksson
Hi,

In case it's useful for anyone else writing a future podling report, here
are a few things that can be done to generate stats:

1) # emails
Go to http://mail-archives.apache.org/mod_mbox/incubator-systemml-dev/ and
add up the 3 previous months (200 + 103 + 72).

2) # new contributors
See who has contributed in the last 3 months:
git shortlog -n -s -e --since=2016-08-01
Total up the new contributors.

3) stars and forks
Go to https://github.com/apache/incubator-systemml. Report the # of stars
and # of forks.

4) # commits since last report
git rev-list --count master --since=2016-08-01

5) # pull requests created and closed:
Go to https://github.com/apache/incubator-systemml/pulls.

Filter on:
is:pr created:2016-08-01..2016-11-01
Report total number (open + closed) and total closed.

6) # JIRAs created and resolved/closed.
Go to https://issues.apache.org/jira/browse/SYSTEMML issues.

Total new JIRAs created:
Filter on 'Created Date: 1/Aug/16' (i.e., 'Between' 1/Aug/16 and blank)

JIRAs resolved and/or closed:
Add Status filter for 'Resolved' and 'Closed'.

Note that commits and contributors can also be reported for the website
project (https://github.com/apache/incubator-systemml-website).
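
Some of the steps above could be scripted. The Python sketch below is only an
illustration: the helper names are my own, and the monthly counts are the
ones from step 1. Note that when `git shortlog` is run from a script rather
than a terminal it reads from standard input unless a revision is given, so
the sketch passes `HEAD` explicitly.

```python
import subprocess


def total_messages(monthly_counts):
    """Step 1: add up the per-month mailing list message counts."""
    return sum(monthly_counts)


def parse_shortlog(output):
    """Parse `git shortlog -n -s -e` output into (commits, author) pairs."""
    rows = []
    for line in output.splitlines():
        if not line.strip():
            continue
        count, author = line.strip().split("\t", 1)
        rows.append((int(count), author))
    return rows


def contributors_since(since, repo="."):
    """Step 2: list contributors with commits since the given date."""
    result = subprocess.run(
        ["git", "shortlog", "-n", "-s", "-e", f"--since={since}", "HEAD"],
        cwd=repo, check=True, capture_output=True, text=True,
    )
    return parse_shortlog(result.stdout)


def new_authors(recent, earlier):
    """Authors present in `recent` but absent from `earlier`."""
    known = {author for _, author in earlier}
    return [author for _, author in recent if author not in known]


print(total_messages([200, 103, 72]))  # 375
```

Calling contributors_since("2016-08-01") from inside a clone of the repository
would give the list to total up by hand. Deciding who counts as "new" still
needs a comparison with earlier contributors, which is what new_authors is
meant to help with.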

Deron


Re: [DRAFT] November monthly report

2016-11-02 Thread Deron Eriksson
Thank you for the feedback Mike. I added the VLDB paper award.

I added the monthly report to the Apache incubator wiki at
https://wiki.apache.org/incubator/November2016.

Deron


On Tue, Nov 1, 2016 at 5:07 PM, <dusenberr...@gmail.com> wrote:

> Looks good. We should also include the VLDB paper award.
>
> --
>
> Mike Dusenberry
> GitHub: github.com/dusenberrymw
> LinkedIn: linkedin.com/in/mikedusenberry
>
> Sent from my iPhone.
>
>
> > On Nov 1, 2016, at 4:43 PM, Deron Eriksson <deroneriks...@gmail.com>
> wrote:
> >
> > Hello,
> >
> > Here is a draft of the November monthly report due tomorrow that Felix
> > and I put together. Feedback is welcome.
> >
> > Deron
> >
> > 
> >
> > SystemML
> >
> > SystemML provides declarative large-scale machine learning (ML) that
> > aims at flexible specification of ML algorithms and automatic generation
> > of hybrid runtime plans ranging from single node, in-memory computations,
> > to distributed computations running on Apache Hadoop MapReduce and Apache
> > Spark.
> >
> > SystemML has been incubating since 2015-11-02.
> >
> > Three most important issues to address in the move towards graduation:
> >
> > - Grow SystemML community: increase mailing list activity,
> >   increase adoption of SystemML for scalable machine learning, encourage
> >   data scientists to adopt DML and PyDML algorithm scripts, respond to
> >   user feedback to ensure SystemML meets the requirements of real-world
> >   situations, write papers, and present talks about SystemML.
> > - Continue to produce releases.
> > - Increase the diversity of our project's contributors and committers.
> >
> > Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
> > aware of?
> >
> > NONE.
> >
> > How has the community developed since the last report?
> > Our mailing list from August through October had 375 messages on a wide
> > range of topics. We have gained 4 new contributors to the main project
> > since August 1st. Our website has been redesigned with the help of
> > several design engineers and we have commits from 3 new contributors to
> > the website project. On GitHub, the project has been starred 417 times
> > and forked 156 times.
> >
> > Niketan Pansare gave a talk with the title "Apache SystemML - Declarative
> > Machine Learning at Scale" on October 7th in the CS graduate seminar at
> > UC Merced. Matthias Boehm gave a talk on "Compressed Linear Algebra for
> > Large-Scale Machine Learning" at TU Dresden on August 30th. We presented
> > the papers "Compressed Linear Algebra for Large-Scale Machine Learning"
> > (research paper + poster) and "SystemML: Declarative Machine Learning on
> > Spark" (industry paper) at VLDB'16, gave two 90-minute tutorials at the
> > BOSS'16 workshop, co-located with VLDB'16, and our paper "SPOOF:
> > Sum-Product Optimization and Operator Fusion for Large-Scale Machine
> > Learning" has been accepted at CIDR'17.
> >
> > How has the project developed since the last report?
> > The main project has had 213 commits since August 1. The website project
> > has had 51 commits since August 1. Since August 1, 241 issues have been
> > reported on our JIRA site and 137 issues have been resolved or closed. 79
> > pull requests have been created since August 1, and 72 pull requests have
> > been closed.
> >
> > Date of last release:
> >
> > 2016-06-15 (version 0.10.0-incubating)
> >
> > When were the last committers or PMC members elected?
> >
> > 2016-05-07 Glenn Weidner
> > 2016-05-07 Faraz Makari Manshadi
> >
> > 
>


Re: SystemML 0.11.0-incubating RC4 feedback

2016-10-31 Thread Deron Eriksson
Also, to quote Justin Mclean:
"It not clear to me what exactly is the release artefacts here, for
instance why does this directory include jars? Can you please clarify."

From this response, I think it is clear that our large number of artifacts
is causing confusion. We can in part reduce this confusion by dropping our
number of released artifacts, as addressed by
https://issues.apache.org/jira/browse/SYSTEMML-926.

Deron



On Mon, Oct 31, 2016 at 10:14 AM, Deron Eriksson <deroneriks...@gmail.com>
wrote:

> Hi,
>
> I mentioned here that the "source-release.zip" is not something I
> recognize:
> https://www.mail-archive.com/dev@systemml.incubator.apache.org/msg00960.html
>
> Is this mystery artifact created by the script to deploy release
> candidates?
>
> It should definitely be removed in my opinion.
>
> Deron
>
>
>
> On Sun, Oct 30, 2016 at 4:22 PM, Glenn Weidner <gweid...@us.ibm.com>
> wrote:
>
>> I've opened JIRA/PR (https://issues.apache.org/jira/browse/SYSTEMML-1075)
>> for review to address item [2] of
>> https://www.mail-archive.com/general@incubator.apache.org/msg56892.html.
>>
>> Thanks,
>> Glenn
>>
>> From: Glenn Weidner/Silicon Valley/IBM@IBMUS
>> To: dev@systemml.incubator.apache.org
>> Date: 10/29/2016 09:52 PM
>> Subject: Re: SystemML 0.11.0-incubating RC4 feedback
>> --
>>
>>
>>
>> Can the systemml-0.11.0-incubating-source-release.zip be removed since
>> there already is a src artifact systemml-0.11.0-incubating-src.zip?
>>
>> Thanks,
>> Glenn
>>
>> From: Luciano Resende <luckbr1...@gmail.com>
>> To: dev@systemml.incubator.apache.org
>> Date: 10/29/2016 07:25 PM
>> Subject: SystemML 0.11.0-incubating RC4 feedback
>> --
>>
>>
>>
>> Please see IPMC feedback on our RC4:
>> https://www.mail-archive.com/general@incubator.apache.org/msg56892.html
>>
>> Please help address these issues, as we are going to have to start a new
>> RC vote asap.
>>
>> Also, I am pretty sure some of these issues have been addressed in the
>> past, so we need to figure out what is causing some of these regressions.
>>
>> --
>> Luciano Resende
>> http://twitter.com/lresende1975
>> http://lresende.blogspot.com/
>>
>>
>>
>>
>>
>>
>


Re: Podling Report Reminder - November 2016

2016-10-31 Thread Deron Eriksson
Thank you Felix! I really appreciate it!

Deron


On Mon, Oct 31, 2016 at 1:01 PM, <fschue...@posteo.de> wrote:

> I will help you, Deron!
>
> Felix
>
>
> Am 31.10.2016 20:26 schrieb Deron Eriksson:
>
>> If there are no other volunteers to help with the report, I will help. If
>> anyone else is interested in writing the monthly report, I would also be
>> very happy to provide any assistance and guidance. Contributing to the
>> project monthly report is a great way to become familiar with ASF
>> procedures and expectations.
>>
>> Deron
>>
>>
>>
>> On Sat, Oct 29, 2016 at 8:45 AM, Luciano Resende <luckbr1...@gmail.com>
>> wrote:
>>
>>> Any volunteers to help with this month's report?
>>>
>>> On Sat, Oct 29, 2016 at 6:22 AM, <johndam...@apache.org> wrote:
>>>
>>> > Dear podling,
>>> >
>>> > This email was sent by an automated system on behalf of the Apache
>>> > Incubator PMC. It is an initial reminder to give you plenty of time to
>>> > prepare your quarterly board report.
>>> >
>>> > The board meeting is scheduled for Wed, 16 November 2016, 10:30 am PDT.
>>> > The report for your podling will form a part of the Incubator PMC
>>> > report. The Incubator PMC requires your report to be submitted 2 weeks
>>> > before the board meeting, to allow sufficient time for review and
>>> > submission (Wed, November 02).
>>> >
>>> > Please submit your report with sufficient time to allow the Incubator
>>> > PMC, and subsequently board members to review and digest. Again, the
>>> > very latest you should submit your report is 2 weeks prior to the board
>>> > meeting.
>>> >
>>> > Thanks,
>>> >
>>> > The Apache Incubator PMC
>>> >
>>> > Submitting your Report
>>> >
>>> > --
>>> >
>>> > Your report should contain the following:
>>> >
>>> > *   Your project name
>>> > *   A brief description of your project, which assumes no knowledge of
>>> > the project or necessarily of its field
>>> > *   A list of the three most important issues to address in the move
>>> > towards graduation.
>>> > *   Any issues that the Incubator PMC or ASF Board might wish/need to
>>> > be aware of
>>> > *   How has the community developed since the last report
>>> > *   How has the project developed since the last report.
>>> >
>>> > This should be appended to the Incubator Wiki page at:
>>> >
>>> > http://wiki.apache.org/incubator/November2016
>>> >
>>> > Note: This is manually populated. You may need to wait a little before
>>> > this page is created from a template.
>>> >
>>> > Mentors
>>> > ---
>>> >
>>> > Mentors should review reports for their project(s) and sign them off on
>>> > the Incubator wiki page. Signing off reports shows that you are
>>> > following the project - projects that are not signed may raise alarms
>>> > for the Incubator PMC.
>>> >
>>> > Incubator PMC
>>> >
>>>
>>>
>>>
>>> --
>>> Luciano Resende
>>> http://twitter.com/lresende1975
>>> http://lresende.blogspot.com/
>>>
>>>


Re: [DISCUSS] Adding tensorboard-like functionality to SystemML

2016-10-31 Thread Deron Eriksson
Hi Jeremy,

I think moving forward with visualization and design is a great idea,
especially since I feel there is currently momentum after the great design
refactoring of the project website. Mike and Jeremy, please let me know if
there's any way in which I can help.

Deron


On Fri, Oct 28, 2016 at 8:03 PM, Jeremy Anderson  wrote:

> >
> > Visualization is a good topic to bring up for the project. I would like
> > to add another possible option of using TensorBoard directly. I have not
> > looked into the file format used for TensorBoard, but it may be possible
> > to simply adopt that format and write our stats to that type of file.
> > That would allow us to reuse that project without having to write our
> > own.
>
>
> Mike, I think this is a great place to start. I'd love to collaborate from
> a design perspective with anyone who wants to take on the technical side.
>
> ...
>
> Jeremy Anderson
> Github: https://github.com/objectadjective
> Twitter: https://twitter.com/ObjectAdjective
> LinkedIN: http://www.linkedin.com/in/objectadjective
>
> On 29 October 2016 at 02:46,  wrote:
>
> > Visualization is a good topic to bring up for the project. I would like
> > to add another possible option of using TensorBoard directly. I have not
> > looked into the file format used for TensorBoard, but it may be possible
> > to simply adopt that format and write our stats to that type of file.
> > That would allow us to reuse that project without having to write our
> > own.
> >
> > --
> >
> > Mike Dusenberry
> > GitHub: github.com/dusenberrymw
> > LinkedIn: linkedin.com/in/mikedusenberry
> >
> > Sent from my iPhone.
> >
> >
> > > On Oct 28, 2016, at 8:13 AM, Niketan Pansare 
> wrote:
> > >
> > > Hi Matthias,
> > >
> > > Thanks for your feedback.
> > >
> > > There is a tradeoff between keeping a feature in-house until it is
> > > stable vs. continually getting community feedback as the work is
> > > getting done via PRs and discussions. I am for the latter as it
> > > encourages community feedback as well as participation.
> > >
> > > I agree that our goal should be to complete the features you mentioned
> > > asap and yes, we are working hard towards making the GPU backend, the
> > > deep learning built-in functions and the algorithm wrappers (ones that
> > > are already added) 'non-experimental' in the 1.0 release :) ... Also,
> > > like you hinted, it is important to explicitly mark the experimental
> > > features in the documentation to avoid the 'bad impression'. The Python
> > > DSL will remain experimental until there is more interest from the
> > > community. I am fine with deleting the debugger since it is rarely
> > > used, if at all.
> > >
> > > In keeping with the Apache guidelines, this discussion is to allow the
> > > community to decide whether SystemML should consider adding new
> > > visualization functionality (since this feature is user facing). If
> > > there is no interest, we can either postpone or discard this discussion
> > > :)
> > >
> > > Thanks,
> > >
> > > Niketan.
> > >
> > >> On Oct 28, 2016, at 1:24 AM, Matthias Boehm 
> > wrote:
> > >>
> > >> Thanks for putting this together Niketan. However, could we please
> > >> postpone this discussion until after our 1.0 release? Right now, I'm
> > >> concerned to see that we're adding many experimental features without
> > >> really getting them done. This includes, for example, the GPU backend,
> > >> the new MLContext API, the Python DSL, the deep learning builtin
> > >> functions, the Scala algorithm wrappers, the old Spark debugger
> > >> interface, and compressed linear algebra. I think we should finish
> > >> these features first before moving on. If we're not careful about
> > >> that, it would quickly create a very bad impression for new users.
> > >>
> > >> Regards,
> > >> Matthias
> > >>
> > >>> On 10/28/2016 1:20 AM, Niketan Pansare wrote:
> > >>>
> > >>>
> > >>> Hi all,
> > >>>
> > >>> To give everyone context, I am working on a new deep learning API for
> > >>> SystemML that is backed by the NN library (
> > >>> https://github.com/apache/incubator-systemml/tree/master/scripts/staging/SystemML-NN/nn
> > >>> ). This API allows the users to express their model using the Caffe
> > >>> specification and perform fit/predict similar to scikit-learn APIs. I
> > >>> have created a sample notebook explaining the usage of the API:
> > >>> https://github.com/niketanpansare/incubator-systemml/blob/1b655ebeec6cdffd66b282eadc4810ecfd39e4f2/samples/jupyter-notebooks/Barista-API-Demo.ipynb
> > >>> . This API also allows the user to load and store pre-trained models.
> > >>> See
> > >>> https://github.com/niketanpansare/model_zoo/tree/master/caffe/vision/vgg/ilsvrc12
> > >>>
> > >>> As part of this API, I added a mini-tensorboard like functionality
> (see
> > >>> step 6 and 7) using matplotlib. If there is 

Re: Podling Report Reminder - November 2016

2016-10-31 Thread Deron Eriksson
If there are no other volunteers to help with the report, I will help. If
anyone else is interested in writing the monthly report, I would also be
very happy to provide any assistance and guidance. Contributing to the
project monthly report is a great way to become familiar with ASF
procedures and expectations.

Deron



On Sat, Oct 29, 2016 at 8:45 AM, Luciano Resende 
wrote:

> Any volunteers to help with this month's report?
>
> On Sat, Oct 29, 2016 at 6:22 AM,  wrote:
>
> > Dear podling,
> >
> > This email was sent by an automated system on behalf of the Apache
> > Incubator PMC. It is an initial reminder to give you plenty of time to
> > prepare your quarterly board report.
> >
> > The board meeting is scheduled for Wed, 16 November 2016, 10:30 am PDT.
> > The report for your podling will form a part of the Incubator PMC
> > report. The Incubator PMC requires your report to be submitted 2 weeks
> > before the board meeting, to allow sufficient time for review and
> > submission (Wed, November 02).
> >
> > Please submit your report with sufficient time to allow the Incubator
> > PMC, and subsequently board members to review and digest. Again, the
> > very latest you should submit your report is 2 weeks prior to the board
> > meeting.
> >
> > Thanks,
> >
> > The Apache Incubator PMC
> >
> > Submitting your Report
> >
> > --
> >
> > Your report should contain the following:
> >
> > *   Your project name
> > *   A brief description of your project, which assumes no knowledge of
> > the project or necessarily of its field
> > *   A list of the three most important issues to address in the move
> > towards graduation.
> > *   Any issues that the Incubator PMC or ASF Board might wish/need to be
> > aware of
> > *   How has the community developed since the last report
> > *   How has the project developed since the last report.
> >
> > This should be appended to the Incubator Wiki page at:
> >
> > http://wiki.apache.org/incubator/November2016
> >
> > Note: This is manually populated. You may need to wait a little before
> > this page is created from a template.
> >
> > Mentors
> > ---
> >
> > Mentors should review reports for their project(s) and sign them off on
> > the Incubator wiki page. Signing off reports shows that you are
> > following the project - projects that are not signed may raise alarms
> > for the Incubator PMC.
> >
> > Incubator PMC
> >
>
>
>
> --
> Luciano Resende
> http://twitter.com/lresende1975
> http://lresende.blogspot.com/
>


Re: SystemML 0.11.0-incubating RC4 feedback

2016-10-31 Thread Deron Eriksson
Hi,

I mentioned here that the "source-release.zip" is not something I
recognize:
https://www.mail-archive.com/dev@systemml.incubator.apache.org/msg00960.html

Is this mystery artifact created by the script to deploy release candidates?

It should definitely be removed in my opinion.

Deron



On Sun, Oct 30, 2016 at 4:22 PM, Glenn Weidner  wrote:

> I've opened JIRA/PR (https://issues.apache.org/jira/browse/SYSTEMML-1075)
> for review to address item [2] of https://www.mail-archive.com/
> gene...@incubator.apache.org/msg56892.html.
>
> Thanks,
> Glenn
>
>
> From: Glenn Weidner/Silicon Valley/IBM@IBMUS
> To: dev@systemml.incubator.apache.org
> Date: 10/29/2016 09:52 PM
> Subject: Re: SystemML 0.11.0-incubating RC4 feedback
> --
>
>
>
> Can the systemml-0.11.0-incubating-source-release.zip be removed since
> there already is a src artifact systemml-0.11.0-incubating-src.zip?
>
> Thanks,
> Glenn
>
>
> From: Luciano Resende 
> To: dev@systemml.incubator.apache.org
> Date: 10/29/2016 07:25 PM
> Subject: SystemML 0.11.0-incubating RC4 feedback
> --
>
>
>
> Please see IPMC feedback on our RC4:
> *https://www.mail-archive.com/general@incubator.apache.org/msg56892.html*
> 
>
> Please help address these issues, as we are going to have to start a new RC
> vote asap.
>
> Also, I am pretty sure some of these issues have been addressed in the
> past, so we need to figure out what is causing some of these regressions.
>
> --
> Luciano Resende
> *http://twitter.com/lresende1975* 
> *http://lresende.blogspot.com/* 
>
>
>
>
>
>


Re: Couple of questions on website contents

2016-10-25 Thread Deron Eriksson
Hi Luciano,

Since the current website updates are major improvements, I have gone ahead
and published the new updates. I think we can now start publishing more
frequently since important parts of the codebase have stabilized.

Deron


On Tue, Oct 25, 2016 at 5:40 PM, Deron Eriksson <deroneriks...@gmail.com>
wrote:

> Hi Luciano,
>
> Several updates to the website were merged today. I think we're at the
> point where we can publish the new website updates. Do you agree?
>
> Deron
>
>
> On Tue, Oct 25, 2016 at 11:02 AM, Jason Azares <jason.aza...@gmail.com>
> wrote:
>
>> Hi Luciano,
>>
>> Initial page:
>> >  - What's the intention of the section just above the social banner ? I
>> > noticed it was actually a copy of a section from the community page,
>> but it
>> > looks like the content was duplicated and not extracted to a banner,
>> and I
>> > have changed the one in community to what I think it better clarifies
>> the
>> > mailing list, but I am not sure if that's the same intent of the banner
>> on
>> > the initial page.
>>
>>
>> Thanks for bringing this point to our attention. The content on the
>> initial
>> page is different from the community page. We wanted to have a call to
>> action to get users to subscribe to the mailing list. We are currently
>> designing this section and will send a pull request once completed.
>>
>> Navigation Menu:
>> > - The community navigation seems to have gone wild with a few
>> duplications.
>> > We have source code and github links, which are both the same. We also
>> have
>> > the community get involved link that includes a list of committers using
>> > the new design format, but there is also a link to project committers
>> that
>> > include the old page listing all committers.
>>
>>
>> Dexter is currently working to resolve this issue. He will send his
>> updates
>> once they are finished.
>>
>> Hope this clears things up!
>>
>> Best,
>> Jason
>>
>> On Mon, Oct 24, 2016 at 7:09 PM, Luciano Resende <luckbr1...@gmail.com>
>> wrote:
>>
>> > I have a few questions on the contents of the website in the master
>> branch
>> > :
>> >
>> > Initial page:
>> >
>> >  - What's the intention of the section just above the social banner ? I
>> > noticed it was actually a copy of a section from the community page,
>> but it
>> > looks like the content was duplicated and not extracted to a banner,
>> and I
>> > have changed the one in community to what I think it better clarifies
>> the
>> > mailing list, but I am not sure if that's the same intent of the banner
>> on
>> > the initial page.
>> >
>> > Navigation Menu:
>> > - The community navigation seems to have gone wild with a few
>> duplications.
>> > We have source code and github links, which are both the same. We also
>> have
>> > the community get involved link that includes a list of committers using
>> > the new design format, but there is also a link to project committers
>> that
>> > include the old page listing all committers.
>> >
>> >
>> > Once we resolve the items above (which are more like cleanups), I think
>> we
>> > might be at a point where we could publish these latest updates to the
>> live
>> > website.
>> >
>> > Thoughts ?
>> >
>> > --
>> > Luciano Resende
>> > http://twitter.com/lresende1975
>> > http://lresende.blogspot.com/
>> >
>>
>
>


Re: Couple of questions on website contents

2016-10-25 Thread Deron Eriksson
Hi Luciano,

Several updates to the website were merged today. I think we're at the
point where we can publish the new website updates. Do you agree?

Deron


On Tue, Oct 25, 2016 at 11:02 AM, Jason Azares 
wrote:

> Hi Luciano,
>
> Initial page:
> >  - What's the intention of the section just above the social banner ? I
> > noticed it was actually a copy of a section from the community page, but
> it
> > looks like the content was duplicated and not extracted to a banner, and
> I
> > have changed the one in community to what I think it better clarifies the
> > mailing list, but I am not sure if that's the same intent of the banner
> on
> > the initial page.
>
>
> Thanks for bringing this point to our attention. The content on the initial
> page is different from the community page. We wanted to have a call to
> action to get users to subscribe to the mailing list. We are currently
> designing this section and will send a pull request once completed.
>
> Navigation Menu:
> > - The community navigation seems to have gone wild with a few
> duplications.
> > We have source code and github links, which are both the same. We also
> have
> > the community get involved link that includes a list of committers using
> > the new design format, but there is also a link to project committers
> that
> > include the old page listing all committers.
>
>
> Dexter is currently working to resolve this issue. He will send his updates
> once they are finished.
>
> Hope this clears things up!
>
> Best,
> Jason
>
> On Mon, Oct 24, 2016 at 7:09 PM, Luciano Resende 
> wrote:
>
> > I have a few questions on the contents of the website in the master
> branch
> > :
> >
> > Initial page:
> >
> >  - What's the intention of the section just above the social banner ? I
> > noticed it was actually a copy of a section from the community page, but
> it
> > looks like the content was duplicated and not extracted to a banner, and
> I
> > have changed the one in community to what I think it better clarifies the
> > mailing list, but I am not sure if that's the same intent of the banner
> on
> > the initial page.
> >
> > Navigation Menu:
> > - The community navigation seems to have gone wild with a few
> duplications.
> > We have source code and github links, which are both the same. We also
> have
> > the community get involved link that includes a list of committers using
> > the new design format, but there is also a link to project committers
> that
> > include the old page listing all committers.
> >
> >
> > Once we resolve the items above (which are more like cleanups), I think
> we
> > might be at a point where we could publish these latest updates to the
> live
> > website.
> >
> > Thoughts ?
> >
> > --
> > Luciano Resende
> > http://twitter.com/lresende1975
> > http://lresende.blogspot.com/
> >
>


Re: [VOTE] SystemML New Logo Ideas

2016-10-25 Thread Deron Eriksson
+1 sounds great to me too.


On Tue, Oct 25, 2016 at 12:44 PM,  wrote:

> +1 that sounds great to me.
>
> --
>
> Mike Dusenberry
> GitHub: github.com/dusenberrymw
> LinkedIn: linkedin.com/in/mikedusenberry
>
> Sent from my iPhone.
>
>
> > On Oct 25, 2016, at 10:45 AM, Madison Myers 
> wrote:
> >
> > I agree!
> > +1 to using both. I think, like you suggested, that using #1 for headers
> > and #4 for other uses sounds fantastic.
> >
> > On Tue, Oct 25, 2016 at 10:36 AM, Jason Azares 
> > wrote:
> >
> >> Hey guys,
> >>
> >> Branding wise, we also feel that #1 and #4 are the best choices. It's
> great
> >> that we're all on the same page. To answer the question of pros and
> cons of
> >> each logo, here is a quick list:
> >>
> >> Logo 1:
> >>
> >>
> >>   - More versatile because of its scalability; We think logo 4 will be
> >>  hard to discern once sized down; Logo 1 looks cleaner in website
> >> headers
> >>  with text
> >>  - Relevant because it has a matrix bracket
> >>  - It's a simplified version of the robot. Think of it as the batman
> >>  signal and the robot is batman.
> >>
> >> Logo 4:
> >>
> >>
> >>   - More original because it has a personality
> >>  - Diverse in the actions it can perform because it can move,
> animate,
> >>  and be customized based on intent and use
> >>  - The robot is kind of cute and approachable
> >>
> >> Our suggestion is to use both. Logo 1 is the simplified version of the
> >> robot. Logo 4 is the personification of the logo used to explain
> concepts.
> >>
> >> We'd love to hear your thoughts!
> >>
> >> Regards,
> >> Jason and the design team
> >>
> >> P.S. In general, here are our guidelines for creating a great logo:
> >>
> >>   - *original* - something that stands out from competitors
> >>   - *relevant* - reflects the brand's mission and values
> >>   - *versatile* - look good in black and white, in different colors and
> >>   sizes depending on context (e.g. billboards, websites, t-shirts, toys,
> >>   business cards, etc)
> >>   - *memorable* - easily recognizable everywhere (e.g. mickey mouse,
> nike)
> >>   - *timeless* - not just based on what's currently popular
> >>
> >>
> >>
> >>> On Tue, Oct 25, 2016 at 9:47 AM,  wrote:
> >>>
> >>> Looks like there is a large amount of support for both #1 and #4.
> Design
> >>> team, could you provide some more thoughts on the pros and cons for
> each,
> >>> and perhaps any thoughts on ways the icons could be used in various
> >> project
> >>> materials?
> >>>
> >>> --
> >>>
> >>> Mike Dusenberry
> >>> GitHub: github.com/dusenberrymw
> >>> LinkedIn: linkedin.com/in/mikedusenberry
> >>>
> >>> Sent from my iPhone.
> >>>
> >>>
>  On Oct 25, 2016, at 9:41 AM, Acs S  wrote:
> 
>  I like #4 as well.
>  +1 on #4.
> 
>  -Arvind
> 
>  From: Berthold Reinwald 
>  To: dev@systemml.incubator.apache.org
>  Sent: Monday, October 24, 2016 12:34 AM
>  Subject: Re: [VOTE] SystemML New Logo Ideas
> 
>  +1 on #4.
> 
>  Regards,
>  Berthold Reinwald
>  IBM Almaden Research Center
>  office: (408) 927 2208; T/L: 457 2208
>  e-mail: reinw...@us.ibm.com
> 
> 
> 
>  From:  Luciano Resende 
>  To:dev@systemml.incubator.apache.org
>  Date:  10/21/2016 04:37 PM
>  Subject:Re: [VOTE] SystemML New Logo Ideas
> 
> 
> 
>  On Fri, Oct 21, 2016 at 11:27 AM, Frederick R Reiss <
> >> frre...@us.ibm.com>
>  wrote:
> 
> > These are awesome! I'm more a fan of option #4 myself.
> >
> >
>  I like option #4 myself as well.
> 
> 
>  --
>  Luciano Resende
>  http://twitter.com/lresende1975
>  http://lresende.blogspot.com/
> 
> 
> 
> 
> 
> >>>
> >>
> >
> >
> >
> > --
> > *Madison J. Myers*
> > *UC Berkeley, Master of Information & Data Science '17*
> >
> > *King's College London, MA Political Science '14*
> > *New York University, BA Political Science '12*
> >
> >   -
> >  LinkedIn 
>


Re: Local versions of Linear Algebra Operators in DML

2016-10-24 Thread Deron Eriksson
Would it be acceptable for a user to receive a log warning if the user uses
an operation that is currently only implemented for single node? My concern
is that there is an expectation for operations to be distributed with
SystemML, and if an operation is not currently distributed, the user needs
to be made aware of this.
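
A minimal sketch of what such a warning could look like, assuming a simple name lookup at the point where a built-in function is resolved. The class and method names here are hypothetical and are not part of SystemML; the list of operations is the one named in this thread (eigen, lu, qr, cholesky), and java.util.logging stands in for whatever logging the engine actually uses.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.logging.Logger;

// Hypothetical helper, for illustration only; not a SystemML class.
final class LocalOnlyOpWarning {

    private static final Logger LOG = Logger.getLogger(LocalOnlyOpWarning.class.getName());

    // Built-in functions named in this thread as single-node (CP) only.
    static final Set<String> LOCAL_ONLY_OPS = Collections.unmodifiableSet(
            new HashSet<>(Arrays.asList("eigen", "lu", "qr", "cholesky")));

    /** Returns the warning text for a local-only operation, or null if none applies. */
    static String warningFor(String opName) {
        if (!LOCAL_ONLY_OPS.contains(opName)) {
            return null;
        }
        return "Operation '" + opName + "' currently runs on a single node only; "
                + "it is not executed as a distributed operation.";
    }

    /** Logs the warning (if any) and reports whether one was emitted. */
    static boolean warnIfLocalOnly(String opName) {
        String message = warningFor(opName);
        if (message == null) {
            return false;
        }
        LOG.warning(message);
        return true;
    }
}
```

Keeping the message construction separate from the logging call makes the behavior easy to check without capturing log output.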

Thoughts?

Deron


On Mon, Oct 24, 2016 at 10:38 AM, Nakul Jindal  wrote:

> Hi,
>
> There is an initial implementation and PR.
> https://github.com/apache/incubator-systemml/pull/273
>
> -Nakul
>
>
> > On Oct 24, 2016, at 12:59 AM, Berthold Reinwald 
> wrote:
> >
> > Thanks, Imran. I think it is a good idea to start off with the DML-bodied
> > function implementation. This will hold until we can have a built in
> > implementation.
> >
> > We prototyped an implementation of distributed Cholesky as a DML bodied
> > function as well. For performance optimization, as the matrix becomes
> > "small" enough, we switched over and exploit a single node
> implementation.
> >
> > Adding a new svd() built in function that initially routes to a local
> > library is fine. I don't know whether Apache commons math has an
> > implementation that can be re-used.
> >
> > I object to renaming the functions or changing the externals. Eventually
> > distributed instructions need to be added to these implementations, and
> > there are open jiras for it.
> >
> > Regards,
> > Berthold Reinwald
> > IBM Almaden Research Center
> > office: (408) 927 2208; T/L: 457 2208
> > e-mail: reinw...@us.ibm.com
> >
> >
> >
> > From:   Niketan Pansare/Almaden/IBM@IBMUS
> > To: dev@systemml.incubator.apache.org
> > Date:   10/21/2016 01:14 PM
> > Subject:Re: Local versions of Linear Algebra Operators in DML
> >
> >
> >
> > I am also comfortable with option (2) ... "with a plan to implement its
> > distributed version"
> >
> > Thanks,
> >
> > Niketan Pansare
> > IBM Almaden Research Center
> > E-mail: npansar At us.ibm.com
> > http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar
> >
> >
> > From: Matthias Boehm 
> > To: dev@systemml.incubator.apache.org
> > Date: 10/21/2016 01:00 PM
> > Subject: Re: Local versions of Linear Algebra Operators in DML
> >
> >
> >
> > thanks Nakul for reaching out before starting work on this. Actually,
> > the introduction of these CP-only builtin functions was a big mistake
> > because (as you already mentioned) they mistakenly suggest that we
> > provide distributed operations for them too. The intent was to support
> > them in later versions with our own local and distributed
> > implementations. So far, this had low priority though because these
> > O(n^3) operations are seldom used over large data. However, a while
> > back, we lost potential users who were specifically interested in
> > distributed eigen - so there are still use cases.
> >
> > Despite the good intentions behind the renaming, I would strongly argue
> > against it. First, it would unnecessarily lose compatibility with R
> > syntax. Second, it would defeat our clean abstraction by exposing
> > explicit local operations.
> >
> > This leaves us with two options here: (1) you could use an external
> > (java-implemented) function, which gives you virtually the same runtime
> > behavior but a clear separation via an explicit registration, or (2) add
> > it to the list of CP-only operations (with a plan to implement its
> > distributed version) but name it 'svd' as in R.
> >
> >
> > Regards,
> > Matthias
> >
> >
> >> On 10/21/2016 9:34 PM, Nakul Jindal wrote:
> >> Hi,
> >>
> >> Imran was planning on implementing a distributed SVD as a DML bodied
> >> function.
> >> The algorithm is described in the paper titled "A Distributed and
> >> Incremental SVD Algorithm for Agglomerative Data Analysis on Large
> >> Networks" available at https://arxiv.org/abs/1601.07010.
> >>
> >> This algorithm requires the availability of a local SVD function, which
> > we
> >> currently do not have in SystemML.
> >> Seeing as how there are other linear algebra functions (eigen, lu, qr,
> >> cholesky) in DML that reroute to Apache Commons Math and only operate in
> >> standalone/CP mode, would it be ok to add "svd" to this set?
> >>
> >> Also, since these operations are local and not distributed and the
> >> documentation doesn't make it clear that these operations won't operate
> > in
> >> distributed mode, would it make sense to rename them to "local_eigen",
> >> "local_qr", "local_cholesky", etc?
> >> Obviously, this change would go into the version after 0.11.
> >>
> >> I understand that the ideal solution to this problem is to have a
> >> distributed version of the aforementioned linear algebra routines, but
> > for
> >> the time being, would it be ok to go ahead and do the rename, while also
> >> introducing a "local_svd" ?

Re: use of systemml-0.10.0.incubating.jar

2016-10-21 Thread Deron Eriksson
Hi James,

Thank you for the great questions! I think some of the issues that you are
experiencing are usage issues from a failure on our part to convey this
information clearly. The good news is that a tremendous amount of effort
and focus is currently being directed towards fixing our website and
documentation. We also have very significant upcoming releases (we are just
finishing with our 0.11.0 voting).

1)

Here is some background to help.

The main jar ("systemml-0.10.0.incubating.jar") is typically used to
perform scalable machine learning across a Spark or Hadoop cluster. Spark
and Hadoop both have a large number of jars packaged with them (from a
maven viewpoint these are treated as provided dependencies). In addition,
SystemML has some additional libraries that it needs (wink, some antlr,
etc) that are not provided by Spark and Hadoop, so these libraries are
treated by SystemML as compile-scope dependencies and included in the main
jar so that if you would like to run SystemML on Spark or Hadoop, you only
need to include the single SystemML jar, as in these examples:
   $SPARK_HOME/bin/spark-submit systemml-0.10.0.incubating.jar -s "print('hello world');" -exec hybrid_spark
   hadoop jar systemml-0.10.0.incubating.jar -s "print('hello world');"
So, I think the compile-scope dependencies haven't been shaded because
typically the main jar runs on Spark or Hadoop rather than being treated as
a library.

I think shading to change the namespaces so as to avoid namespace
collisions is a great idea in case the SystemML jar is being used as a
library.

2)

One of the ideas regarding SystemML is the ability to easily customize
scalable machine learning algorithms. We have .tar.gz and .zip artifacts
that can be unpacked that offer the scripts as text files that can easily
be modified. However, we also package them into the jar files in case
someone wants to run them and not really modify them. The Connection class
is part of the JMLC API (see
http://apache.github.io/incubator-systemml/jmlc.html), one of multiple APIs
that can be used to run SystemML. This API is fairly specialized and I
believe if you want to access a script in the jar using this API that you
need to do a getResourceAsStream and read the script as an InputStream.
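
A minimal sketch of the getResourceAsStream approach, using only the Java standard library. The class and method names are hypothetical, and it assumes the SystemML jar is on the same classpath as the calling class; the resource path you pass in (for example a script under /scripts/algorithms/) is likewise an assumption about how the scripts are packaged in the jar you are using.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only; not part of the JMLC API.
final class ScriptResourceReader {

    /** Drains an InputStream into a UTF-8 string. */
    static String readFully(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    /** Loads a classpath resource, such as a DML script packaged in a jar, as text. */
    static String readScript(String resourcePath) throws IOException {
        try (InputStream in = ScriptResourceReader.class.getResourceAsStream(resourcePath)) {
            if (in == null) {
                // getResourceAsStream returns null rather than throwing when nothing matches.
                throw new IOException("Resource not found on classpath: " + resourcePath);
            }
            return readFully(in);
        }
    }
}
```

The resulting string can then be handed to whichever API you use to run the script as script text.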

However, if you would like to use a programmatic API to SystemML, I would
recommend the new SystemML MLContext API (0.10.0 contains an old MLContext
API and the very soon to be released 0.11.0 contains the completely
redesigned MLContext API). The new MLContext API features many conveniences
such as ScriptFactory.dmlFromResource() which lets you easily read a DML
file from the SystemML jar. For more information about this API, see
http://apache.github.io/incubator-systemml/spark-mlcontext-programming-guide.html

3)

As a Java developer with a lot of maven experience, my first inclination
when working with SystemML was to try to use the main jar as a library, and
I believe you are having the same experience I did. Because of the way the
project is structured, using SystemML as a library isn't perhaps as easy as
it should be.

Here are the steps that I just tried out to use the latest SystemML project
as a library (using the new MLContext API):

A) Check out the latest project and install the snapshot artifacts in local
maven repo:
mvn clean install -P distribution -DskipTests

B) Create a basic Java maven example project with the SystemML snapshot
dependency. Since SystemML treats most dependencies as provided scope, I'll
re-specify the Spark dependencies with default (compile) scope in my
example project's pom.xml.


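<dependency>
    <groupId>org.apache.systemml</groupId>
    <artifactId>systemml</artifactId>
    <version>0.12.0-SNAPSHOT</version>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.10</artifactId>
    <version>1.4.1</version>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.10</artifactId>
    <version>1.4.1</version>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-mllib_2.10</artifactId>
    <version>1.4.1</version>
</dependency>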


C) Create a Java class to run an algorithm on SystemML using the new
MLContext API. This example reads the Univar-Stats.dml script from the jar
file and runs the Haberman dataset on the algorithm. It outputs the results
to the console for viewing.

package org.apache.systemml.example;

import java.util.ArrayList;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.sysml.api.mlcontext.MLContext;
import org.apache.sysml.api.mlcontext.Script;
import org.apache.sysml.api.mlcontext.ScriptFactory;

public class MLContextExample {

    public static void main(String[] args) throws Exception {
        SparkConf conf = new SparkConf().setAppName("MLContextExample").setMaster("local");
        JavaSparkContext sc = new JavaSparkContext(conf);
        MLContext ml = new MLContext(sc);

        Script uni = ScriptFactory.dmlFromResource("/scripts/algorithms/Univar-Stats.dml");
        String habermanUrl = "http://archive.ics.uci.edu/ml/machine-learning-databases/haberman/haberman.data";
        uni.in("A", new java.net.URL(habermanUrl));
        List<String> list = new ArrayList<String>();
        list.add("1.0,1.0,1.0,2.0");
        JavaRDD<String> typesRDD = sc.parallelize(list);

Re: Local versions of Linear Algebra Operators in DML

2016-10-21 Thread Deron Eriksson
Hi Nakul,

+1
I think having some clear characteristic to distinguish operations that
only operate locally is a great idea. Otherwise, how would a user know that
these operations are only local and not distributed? Adding this naming
convention for local operations sounds reasonable to me so that we don't
anger users who expect an operation to be distributed when in actuality it
only currently runs locally.

Deron



On Fri, Oct 21, 2016 at 12:34 PM, Nakul Jindal  wrote:

> Hi,
>
> Imran was planning on implementing a distributed SVD as a DML bodied
> function.
> The algorithm is described in the paper titled "A Distributed and
> Incremental SVD Algorithm for Agglomerative Data Analysis on Large
> Networks" available at https://arxiv.org/abs/1601.07010.
>
> This algorithm requires the availability of a local SVD function, which we
> currently do not have in SystemML.
> Seeing as how there are other linear algebra functions (eigen, lu, qr,
> cholesky) in DML that reroute to Apache Commons Math and only operate in
> standalone/CP mode, would it be ok to add "svd" to this set?
>
> Also, since these operations are local and not distributed and the
> documentation doesn't make it clear that these operations won't operate in
> distributed mode, would it make sense to rename them to "local_eigen",
> "local_qr", "local_cholesky", etc?
> Obviously, this change would go into the version after 0.11.
>
> I understand that the ideal solution to this problem is to have a
> distributed version of the aforementioned linear algebra routines, but for
> the time being, would it be ok to go ahead and do the rename, while also
> introducing a "local_svd" ?
>
>
> Niketan, Berthold, Matthias, Sasha - Any thoughts?
>
> Thanks,
> Nakul Jindal
>


rc3 source-release.zip artifact

2016-10-20 Thread Deron Eriksson
The 0.11.0 rc3 artifacts are located at:
https://dist.apache.org/repos/dist/dev/incubator/systemml/0.11.0-incubating-rc3/

I see the following artifact:
systemml-0.11.0-incubating-source-release.zip

I do not recognize this artifact. Can anyone tell me what this artifact is?
Can it be removed?

Deron


Re: [VOTE] Apache SystemML 0.11.0-incubating (RC3)

2016-10-19 Thread Deron Eriksson
OK, so I think it's my understanding that for the 'src' release for rc3,
the pom is using Spark 1.4 and the test suite passes for Spark 1.4, so this
issue being discussed regarding test cases on Spark 1.6 is not a blocker
for this release since the 'src' release builds and all tests pass.

If this is not correct, could someone please correct me?

Deron


On Wed, Oct 19, 2016 at 11:17 AM, Luciano Resende <luckbr1...@gmail.com>
wrote:

> if tests are consistently failing, then we should cancel the RC and either
> fix the test or mark it as @ignored.
>
> Intermittent fails might be ok, but it's a community decision.
>
> On Wed, Oct 19, 2016 at 10:50 AM, Deron Eriksson <deroneriks...@gmail.com>
> wrote:
>
> > I believe that for an Apache release, our test suite is supposed to pass
> > (although I'm pretty sure random test fails can be ignored).
> >
> > See 2.1 of Release Check List here:
> > http://incubator.apache.org/guides/releasemanagement.html#check-list
> >
> > "2.1 Build is successful including automated tests.
> > The expanded source archive is expected to build and pass tests."
> >
> > Luciano, do you happen to know if some test failures are acceptable since
> > our test suite is so enormous (6300+ tests)?
> >
> > Deron
> >
> >
> >
> > On Wed, Oct 19, 2016 at 3:24 AM, Glenn Weidner <gweid...@us.ibm.com>
> > wrote:
> >
> > > It's a nice-to-have but not a release blocker.
> > >
> > > Thanks,
> > > Glenn
> > >
> > >
> > > From: Niketan Pansare/Almaden/IBM@IBMUS
> > > To: dev@systemml.incubator.apache.org
> > > Date: 10/18/2016 05:38 PM
> > > Subject: Re: [VOTE] Apache SystemML 0.11.0-incubating (RC3)
> > > --
> > >
> > >
> > >
> > > Glenn: Would you prefer to have
> > > *https://github.com/apache/incubator-systemml/pull/269*
> > > <https://github.com/apache/incubator-systemml/pull/269> in 0.11
> release
> > ?
> > >
> > > Thanks,
> > >
> > > Niketan Pansare
> > > IBM Almaden Research Center
> > > E-mail: npansar At us.ibm.com
> > > *http://researcher.watson.ibm.com/researcher/view.php?
> person=us-npansar*
> > > <http://researcher.watson.ibm.com/researcher/view.php?
> person=us-npansar>
> > >
> > >
> > > From: Luciano Resende <luckbr1...@gmail.com>
> > > To: dev@systemml.incubator.apache.org
> > > Date: 10/17/2016 09:06 PM
> > > Subject: Re: [VOTE] Apache SystemML 0.11.0-incubating (RC3)
> > > --
> > >
> > >
> > >
> > > Please note the minor correction on the RC tag name (the actual tag
> hash
> > is
> > > correct):
> > >
> > > The tag to be voted on is v0.11.0-incubating-rc3 (
> > > 1baebfde400134b3af6d373c254ee084a6d28cc3)
> > >
> > >
> > > And of course, my +1
> > >
> > >
> > > On Sat, Oct 15, 2016 at 12:27 PM, Luciano Resende <
> luckbr1...@gmail.com>
> > > wrote:
> > >
> > > >
> > > > Please vote on releasing the following candidate as Apache SystemML
> > > > version 0.11.0-incubating !
> > > >
> > > > The vote is open for at least 72 hours and passes if a majority of at
> > > > least 3 +1 PMC votes are cast.
> > > >
> > > > [ ] +1 Release this package as Apache SystemML 0.11.0-incubating
> > > > [ ] -1 Do not release this package because ...
> > > >
> > > > To learn more about Apache SystemML, please see
> > > > *http://systemml.apache.org/* <http://systemml.apache.org/>
> > > >
> > > > The tag to be voted on is v0.11.0-incubating-rc1 (
> > > > 1baebfde400134b3af6d373c254ee084a6d28cc3)
> > > >
> > > > *https://github.com/apache/incubator-systemml/tree/1baebfde40*
> > > <https://github.com/apache/incubator-systemml/tree/1baebfde40>
>

Re: [VOTE] Apache SystemML 0.11.0-incubating (RC3)

2016-10-19 Thread Deron Eriksson
I believe that for an Apache release, our test suite is supposed to pass
(although I'm pretty sure random test fails can be ignored).

See 2.1 of Release Check List here:
http://incubator.apache.org/guides/releasemanagement.html#check-list

"2.1 Build is successful including automated tests.
The expanded source archive is expected to build and pass tests."

Luciano, do you happen to know if some test failures are acceptable since
our test suite is so enormous (6300+ tests)?

Deron



On Wed, Oct 19, 2016 at 3:24 AM, Glenn Weidner  wrote:

> It's a nice-to-have but not a release blocker.
>
> Thanks,
> Glenn
>
>
> From: Niketan Pansare/Almaden/IBM@IBMUS
> To: dev@systemml.incubator.apache.org
> Date: 10/18/2016 05:38 PM
> Subject: Re: [VOTE] Apache SystemML 0.11.0-incubating (RC3)
> --
>
>
>
> Glenn: Would you prefer to have
> *https://github.com/apache/incubator-systemml/pull/269*
>  in 0.11 release ?
>
> Thanks,
>
> Niketan Pansare
> IBM Almaden Research Center
> E-mail: npansar At us.ibm.com
> *http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar*
> 
>
>
> From: Luciano Resende 
> To: dev@systemml.incubator.apache.org
> Date: 10/17/2016 09:06 PM
> Subject: Re: [VOTE] Apache SystemML 0.11.0-incubating (RC3)
> --
>
>
>
> Please note the minor correction on the RC tag name (the actual tag hash is
> correct):
>
> The tag to be voted on is v0.11.0-incubating-rc3 (
> 1baebfde400134b3af6d373c254ee084a6d28cc3)
>
>
> And of course, my +1
>
>
> On Sat, Oct 15, 2016 at 12:27 PM, Luciano Resende 
> wrote:
>
> >
> > Please vote on releasing the following candidate as Apache SystemML
> > version 0.11.0-incubating !
> >
> > The vote is open for at least 72 hours and passes if a majority of at
> > least 3 +1 PMC votes are cast.
> >
> > [ ] +1 Release this package as Apache SystemML 0.11.0-incubating
> > [ ] -1 Do not release this package because ...
> >
> > To learn more about Apache SystemML, please see
> > *http://systemml.apache.org/* 
> >
> > The tag to be voted on is v0.11.0-incubating-rc1 (
> > 1baebfde400134b3af6d373c254ee084a6d28cc3)
> >
> > https://github.com/apache/incubator-systemml/tree/1baebfde400134b3af6d373c254ee084a6d28cc3
> >
> > The release artifacts can be found at :
> >
> > https://dist.apache.org/repos/dist/dev/incubator/systemml/0.11.0-incubating-rc3/
> >
> > The maven release artifacts, including signatures, digests, etc. can be
> > found at:
> >
> >
> > https://repository.apache.org/content/repositories/orgapachesystemml-1009/
> >
> >
> > =
> > == Apache Incubator release policy ==
> > =
> > Please find below the guide to release management during incubation:
> > http://incubator.apache.org/guides/releasemanagement.html
> >
> > =======================================
> > == How can I help test this release? ==
> > =======================================
> > If you are a SystemML user, you can help us test this release by taking an
> > existing Algorithm or workload and running on this release candidate, then
> > reporting any regressions.
> >
> > ================================================
> > == What justifies a -1 vote for this release? ==
> > ================================================
> > -1 votes should only occur for significant stop-ship bugs or legal
> > related issues (e.g. wrong license, missing header files, etc). Minor bugs
> > or regressions should not block this release.
> >
> >
>
> --
> Luciano Resende
> *http://twitter.com/lresende1975* 
> *http://lresende.blogspot.com/* 
>
>
>
>
>
>


Re: Enhancing SystemML JavaDocs

2016-09-30 Thread Deron Eriksson
I do not see how these automatically generated javadocs are useful. For
instance:

/**
*
* @param pb
* @param ec
* @return
* @throws DMLRuntimeException
*/
public static double getTimeEstimate(ProgramBlock pb, ExecutionContext ec,
boolean recursive)
throws DMLRuntimeException

Here, someone has automatically generated a javadoc comment. The developer
has failed to correct the missing 'recursive' parameter. If a developer has
not created a blank javadoc comment in the first place, then the recursive
parameter mistake never would have been made because there never would have
been a blank javadoc comment to update in the first place.
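
For comparison, a filled-in comment for the method above might look like the following sketch. The parameter descriptions, the method body, and the stand-in classes are assumptions made only so that the example compiles on its own; they are not taken from the SystemML source. The point is simply that the comment documents all three parameters, including 'recursive'.

```java
import java.util.ArrayList;
import java.util.List;

class CostEstimatorSketch {
    /**
     * Estimates the execution time of a program block.
     *
     * @param pb        the program block to estimate
     * @param ec        the execution context the block runs in
     * @param recursive if true, include the estimates of all nested blocks
     * @return the estimated time (the unit is an assumption of this sketch)
     * @throws DMLRuntimeException if {@code pb} is null
     */
    public static double getTimeEstimate(ProgramBlock pb, ExecutionContext ec,
            boolean recursive)
            throws DMLRuntimeException {
        if (pb == null) {
            throw new DMLRuntimeException("program block must not be null");
        }
        double total = pb.localCost;
        if (recursive) {
            // Add the estimates of nested blocks only when asked to.
            for (ProgramBlock child : pb.children) {
                total += getTimeEstimate(child, ec, true);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        ProgramBlock root = new ProgramBlock(2.0);
        root.children.add(new ProgramBlock(3.0));
        ExecutionContext ec = new ExecutionContext();
        System.out.println(getTimeEstimate(root, ec, false));
        System.out.println(getTimeEstimate(root, ec, true));
    }
}

// Stand-in types (hypothetical): NOT the real SystemML classes.
class ProgramBlock {
    final double localCost;                                 // hypothetical per-block cost
    final List<ProgramBlock> children = new ArrayList<>();  // hypothetical nested blocks

    ProgramBlock(double localCost) {
        this.localCost = localCost;
    }
}

class ExecutionContext {
}

// Unchecked here only to keep the sketch short.
class DMLRuntimeException extends RuntimeException {
    DMLRuntimeException(String message) {
        super(message);
    }
}
```

With doclint enabled, a comment like the original one (missing 'recursive', empty @return) would be reported, whereas this one would not.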

If automatically generated javadoc comments are decided to be part of our
coding standard, then they should be applied to all methods, not just
random methods.

Deron



On Fri, Sep 30, 2016 at 1:10 PM, Matthias Boehm <mbo...@us.ibm.com> wrote:

> actually, I would prefer to leave the empty (automatically generated)
> javadoc comments - at least in eclipse, this provides a better overview of
> parameters and exceptions.
>
> Regards,
> Matthias
>
>
> From: Deron Eriksson <deroneriks...@gmail.com>
> To: dev@systemml.incubator.apache.org
> Date: 09/30/2016 12:35 PM
> Subject: Re: Enhancing SystemML JavaDocs
> --
>
>
>
> Hi Luciano,
>
> I am definitely in favor of fixing these javadocs. I created
> https://issues.apache.org/jira/browse/SYSTEMML-842 two months ago to
> address this issue.
>
> However, there are so many problems with the javadocs that I think
> addressing this should probably be broken down into separate java packages.
> I would estimate that it's probably at least 20 hours of total work.
> Therefore, it would be good if this could be addressed by multiple people.
>
> I would be in favor of:
> (1) removing existing useless javadoc comments that are just automatically
> generated by IDEs, since these serve no useful purpose.
> (2) fix incorrect existing javadoc comments (for example, if the comments
> are wrong about the actual method parameters).
>
> I like the idea of this being a blocker for 1.0 since this will force this
> unpleasant but needed task to be accomplished.
>
> Deron
>
>
>
>
> On Fri, Sep 30, 2016 at 11:54 AM, Luciano Resende <luckbr1...@gmail.com>
> wrote:
>
> > Currently we have a bunch of wrong, incomplete or obsolete javadocs on our
> > APIs, and this continues to grow because we have the following configuration
> > in our build:
> >
> > 
> > 
> > ignore-doclint-warnings-for-javadocs-on-java-8
> > 
> > [1.8,)
> > 
> > 
> > -Xdoclint:none
> > 
> > 
> >
> >
> > I know we are very close to 0.11 release to fix this, but I would like to
> > make this issue a blocker for our next release (1.0 release), and thus
> > would like to get everybody to give it a try by removing this configuration
> > and trying to build SystemML and fix a few of the Javadoc issues.
> >
> > If we can get a few PRs per week, we can fix this very quickly for the next
> > release.
> >
> > Thoughts ?
> >
> >
> > --
> > Luciano Resende
> > http://twitter.com/lresende1975
> > http://lresende.blogspot.com/
> >
>
>
>


Re: Enhancing SystemML JavaDocs

2016-09-30 Thread Deron Eriksson
Hi Luciano,

I am definitely in favor of fixing these javadocs. I created
https://issues.apache.org/jira/browse/SYSTEMML-842 two months ago to
address this issue.

However, there are so many problems with the javadocs that I think
addressing this should probably be broken down into separate java packages.
I would estimate that it's probably at least 20 hours of total work.
Therefore, it would be good if this could be addressed by multiple people.

I would be in favor of:
(1) removing existing useless javadoc comments that are just automatically
generated by IDEs, since these serve no useful purpose.
(2) fix incorrect existing javadoc comments (for example, if the comments
are wrong about the actual method parameters).

I like the idea of this being a blocker for 1.0 since this will force this
unpleasant but needed task to be accomplished.

Deron




On Fri, Sep 30, 2016 at 11:54 AM, Luciano Resende 
wrote:

> Currently we have a bunch of wrong, incomplete or obsolete javadocs on our
> APIs, and this continues to grow because we have the following configuration
> in our build:
>
> 
> 
> ignore-doclint-warnings-for-javadocs-on-java-8
> 
> [1.8,)
> 
> 
> -Xdoclint:none
> 
> 
>
>
> I know we are very close to 0.11 release to fix this, but I would like to
> make this issue a blocker for our next release (1.0 release), and thus
> would like to get everybody to give it a try by removing this configuration
> and trying to build SystemML and fix a few of the Javadoc issues.
>
> If we can get a few PRs per week, we can fix this very quickly for the next
> release.
>
> Thoughts ?
>
>
> --
> Luciano Resende
> http://twitter.com/lresende1975
> http://lresende.blogspot.com/
>


Re: [VOTE] Apache SystemML 0.11.0-incubating (RC1)

2016-09-28 Thread Deron Eriksson
-1, SYSTEMML-963 and SYSTEMML-967 are potentially needed license fixes. Glenn
and I should have these issues addressed by tomorrow.

Deron


On Wed, Sep 28, 2016 at 3:16 PM, Luciano Resende 
wrote:

> On Wed, Sep 28, 2016 at 3:14 PM, Matthias Boehm  wrote:
>
> > -1, unfortunately, SYSTEMML-964 and SYSTEMML-968 are blocking the release
> > right now but we should be able to resolve them by tomorrow.
> >
> > Regards,
> > Matthias
> >
>
> Thanks Matthias.
>
> Others, please make sure you review this RC as well, so we get all issues
> fixed by RC2.
>
>
> --
> Luciano Resende
> http://twitter.com/lresende1975
> http://lresende.blogspot.com/
>


Re: Continuing development on the website

2016-09-28 Thread Deron Eriksson
Hello Luciano,

I would prefer doing fixes on master for the website. However, I do not
feel strongly about this issue.

Anyone else any thoughts?

Deron


On Wed, Sep 28, 2016 at 1:28 PM, Luciano Resende 
wrote:

> I have created a PR [1] porting the Jekyll based website to use the new
> design contributed via SYSTEMML-892. Currently it has a few small issues
> that need to be resolved before we can start using it as the source of the
> website again.
>
> Now, to resolve these issues, we can either create a branch and, as a
> community, address the remaining issues, or tag the current version of
> master, and start doing development/fixes on master.
>
> What would be the preference of the community?
>
>
> [1] https://github.com/apache/incubator-systemml-website/pull/2
> [2] https://issues.apache.org/jira/browse/SYSTEMML-892
>
> --
> Luciano Resende
> http://twitter.com/lresende1975
> http://lresende.blogspot.com/
>


Re: [Discuss} SystemML Roadmap page

2016-09-23 Thread Deron Eriksson
+1

Great idea, Berthold. Adding a roadmap to the project will be a very
welcome addition to the project, both for users and developers.

Deron


On Fri, Sep 23, 2016 at 12:10 PM, Berthold Reinwald 
wrote:

> In the spirit of other Apache projects, we should publish a Roadmap page.
> The page should be clear on the immediate timeline, point to some future
> projects, and also summarize the past to demonstrate continuum. It is not
> a replacement for jiras or release notes, but just a single place for
> people to go to and see what happened in the past and what will happen
> going forward.
>
>
> SystemML Roadmap
>
>
> SystemML Release Timeline
> =========================
>
> Oct. 2016: SystemML 0.11.0 on Spark 1.x
>
> Nov. 2016: SystemML 0.11.1 on Spark 1.x/2.x
>
> Dec. 2016: SystemML 1.0 on Spark 1.x/2.x
>
>
> Next SystemML 0.11.x
> --------------------
> - Features
>   -- SystemML frames
>   -- New MLContext API
>   -- Transform functions based on SystemML frames
> - Bug fixes
> - Experimental Features / algorithms
>   -- New built-in functions for deep learning (convolution and pooling)
>   -- Deep learning library (DML bodied functions)
>   -- Python DSL integration
>   -- GPU support
>   -- Compressed Linear Algebra
>   -- New Algorithms
>     --- Lasso
>     --- kNN
>     --- Lanczos
>     --- PPCA
>     --- Deep Learning: CNN (Lenet), RBM
>
>
> Planned for future SystemML 1.0
> -------------------------------
> - Rigorous performance and scalability testing (bug fixes)
> - Remove deprecated APIs
> - Remove deprecated functions
>
>
> Planned for future Releases
> ---------------------------
> - Completion of prior experimental features
> - New algorithms: Non-linear SVMs, solvers, decomposition, inversion, etc.
> - DSLs (e.g. Scala, Python) and common DSL architecture
> - R interfaces: R DSL and R wrappers
> - Native Zeppelin Notebook support
> - Code generation
> - Sum product optimizations
> - Tree-based data structures
> - Global dataflow optimizations
>
>
> Prior Releases
> ==============
>
>
> SystemML 0.10.0-incubating (released in June 2016)
> Release notes: https://github.com/apache/incubator-systemml-website/blob/master/0.10.0-incubating/release_notes.md
> --------------------------------------------------
> - Different types of Spark Matrix Blocks: MCSR, CSR, COO
> - SystemML Frame support in JMLC/CP
> - Initial Deep Learning support
> - API/Scripts: parser error handling, SystemML configuration handling,
> include algorithms in SystemML jar, print matrix
> - New fused operator: wdivmm with variations
> - Performance Features: cache-conscious operations, more multithreaded
> operations, new simplification rewrites
> - New Algorithms: kNN
> - Documentation: javadocs, Jupyter/Zeppelin notebook examples
>
>
> SystemML 0.9.0-incubating (released in Jan. 2016)
> Release notes: https://github.com/apache/incubator-systemml-website/blob/master/0.9.0-incubating/release_notes.md
> -------------------------------------------------
> - Improvements to MLContext and MLPipeline wrappers
> - New converter utilities for RDDs and DataFrames
> - New Optimizations for Spark Backend, e.g. eager RDD caching and
> repartitioning, RDD checkpointing, on-demand creation of SparkContext
> - New Runtime Operators for mmult, multithreaded readers and operators
> - New Algorithms: ALS, Cubic Splines
> - Online documentation
>
>
> Regards,
> Berthold Reinwald
> IBM Almaden Research Center
> office: (408) 927 2208; T/L: 457 2208
> e-mail: reinw...@us.ibm.com
>
>


Re: Simplification of MLContext and related APIs

2016-09-12 Thread Deron Eriksson
Hi Matthias,

Great! I would be very happy to see BinaryBlockMatrix incorporated into
Matrix and BinaryBlockFrame incorporated into Frame since this would be a
welcome simplification of the API. Reducing the API to the essential
concepts is a big win for our users. This would have already happened if I
had the depth of knowledge of SystemML required to make this happen in a
reasonable timeframe.

I would definitely approve of further extracting Matrix and Frame to a
common type if this can be done in a way that feels natural for the end
user. At this point I can't really explain it further, but if I expect to
get back a matrix of numbers, I want this to feel natural, and if I get
back a frame consisting of columns of different data types, I want this to
feel natural too. I want our end users to put in data and get out results
in a minimum number of steps that feel intuitive. By the way, I think we
are getting very close, which is a great sign!

Deron


On Mon, Sep 12, 2016 at 2:21 PM, Matthias Boehm <mbo...@us.ibm.com> wrote:

> great - then we're all on the same page. Let me just clarify two aspects:
> First, I think we do need abstract frame/matrix data types at API level,
> but just one type that is used consistently across MLContext and all DSLs
> we're about to add. Second, relying on a common compilation chain does not
> directly affect users but ensures consistent behavior across all APIs.
>
> So the bottom line is, we're going to remove MatrixObject/FrameObject and
> other internal structures from API level, remove the 
> BinaryBlockMatrix/BinaryBlockFrame
> types, and try to consolidate the various Matrix/Frame objects as well as
> replicated compilation chains.
>
> Regards,
> Matthias
>
>
> From: Deron Eriksson <deroneriks...@gmail.com>
> To: dev@systemml.incubator.apache.org
> Date: 09/12/2016 01:56 PM
> Subject: Re: Simplification of MLContext and related APIs
> --
>
>
>
> Feel free to not expose MatrixObject and FrameObject. I am fine with that.
> The only reason MatrixObject and FrameObject are exposed is that I felt if
> the new MLContext API did not expose them, there would be complaints from
> existing committers that these objects were not available. I can't see
> anyone outside of SystemML core developers caring about MatrixObject and
> FrameObject or even for that matter ever even using these classes. Users
> want DataFrames, DataSets, RDDs, 2D arrays, CSV files, or practically
> anything but a MatrixObject or FrameObject.
>
> If you remove entities such as Matrix and Frame, you have the older
> MLContext API. Perhaps users who don't wish to use objects such as Matrix
> and Frame can use the older API since these suggestions are already built
> into the old API?
>
> Deron
>
>
> On Mon, Sep 12, 2016 at 1:22 PM, Mike Dusenberry <dusenberr...@gmail.com>
> wrote:
>
> > I also agree that internal data structures shouldn't be exposed to a user.
> > However, I think we definitely need to keep the `Matrix` and `Frame` types
> > in the API, in agreement with Arvind.  The main purpose of SystemML for a
> > user is to allow for machine learning algorithms involving matrices to be
> > run on a given system (laptop, Spark cluster, etc.).  Anything involving a
> > compilation chain directly is noise for our ML users.  Thus it's quite
> > useful for SystemML to expose a `Matrix` type with a limited API as is
> > currently done in MLContext.  This allows a user to interact with SystemML
> > via these `Matrix` objects which abstractly represent the core data
> > structure of a SystemML script.  Furthermore, these Matrix objects can be
> > used as subsequent input to an additional script, or can be converted to a
> > DataFrame once the user is ready to continue interacting with Spark.  As
> > Arvind mentioned, this just allows the DML `Matrix` type to be effectively
> > exposed at the API level as well.  Additionally, we plan to unify this
> > `Matrix` type with the lazy matrix types we are creating in the Python and
> > Scala DSLs, thus allowing `Matrix` to be the equivalent of matrices in
> > DML.  The similar argument exists for `Frame` as well.
> >
> > I think that limiting the exposure of internal structures to users could be
> > useful, but removing `Matrix` & `Frame` and instead having a user deal
> > directly with compilation chains would be a step backward

Re: Build failed in Jenkins: SystemML-DailyTest #495

2016-09-09 Thread Deron Eriksson
Earlier I updated MLContextProxy's getActiveMLContext to throw an
MLContextException rather than return null if there wasn't an active
MLContext (this is related to Spark Shell support). Unfortunately I didn't
update SparkExecutionContext's initSparkContext to handle the new
exception. I've made a change and I'll run the test suite and verify the
tests work and then commit a hotfix if that fixes it.
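
The failure mode described above can be sketched as follows. This is a minimal illustration, not the actual SystemML code: only the names MLContextProxy, getActiveMLContext, MLContextException, and initSparkContext come from this message, while the class bodies, the setActiveMLContext helper, the sparkContextLabel field, and the string labels are hypothetical stand-ins. The sketch also assumes the exception is unchecked, which is why a caller that was written for a null return compiles unchanged and only fails at runtime.

```java
class SparkExecutionContextSketch {
    // Stand-in for context initialization: returns a label instead of a real SparkContext.
    static String initSparkContext() {
        try {
            // Reuse the context of the active MLContext (e.g. one created in the Spark Shell).
            return MLContextProxy.getActiveMLContext().sparkContextLabel;
        } catch (MLContextException e) {
            // No MLContext is active (e.g. a plain DMLScript run): fall back to a new context.
            // Before the change, this case was signalled by a null return value instead.
            return "standalone-spark-context";
        }
    }

    public static void main(String[] args) {
        MLContextProxy.setActiveMLContext(null);
        System.out.println(initSparkContext());
        MLContextProxy.setActiveMLContext(new MLContext("shell-spark-context"));
        System.out.println(initSparkContext());
    }
}

// Assumed to be unchecked for this sketch.
class MLContextException extends RuntimeException {
    MLContextException(String message) {
        super(message);
    }
}

// Hypothetical stand-in; the real MLContext wraps an actual SparkContext.
class MLContext {
    final String sparkContextLabel;

    MLContext(String sparkContextLabel) {
        this.sparkContextLabel = sparkContextLabel;
    }
}

class MLContextProxy {
    private static MLContext active;

    // Hypothetical setter, used here only to drive the example.
    static void setActiveMLContext(MLContext ml) {
        active = ml;
    }

    // Throws instead of returning null when no MLContext is active.
    static MLContext getActiveMLContext() {
        if (active == null) {
            throw new MLContextException(
                "No MLContext object is currently active. Have you created one?");
        }
        return active;
    }
}
```

Without the catch block, the first call in main would propagate the exception, which matches the kind of test failure reported in the Jenkins output below.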

Deron


On Fri, Sep 9, 2016 at 4:01 PM, <jenk...@spark.tc> wrote:

> See <https://sparktc.ibmcloud.com/jenkins/job/SystemML-DailyTest/495/changes>
>
> Changes:
>
> [Deron Eriksson] [SYSTEMML-902] Improve FrameObject toString output
>
> [Deron Eriksson] [SYSTEMML-900] Rename DataFrame ID column
>
> [Deron Eriksson] [SYSTEMML-901] Improve unavailable MLContext message
>
> --
> [...truncated 152218 lines...]
> >org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
> >org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
> >org.junit.runners.ParentRunner.run(ParentRunner.java:309)
> >org.apache.maven.surefire.junitcore.pc.ParallelComputerBuilder$PC$2.
> run(ParallelComputerBuilder.java:491)
> >org.junit.runner.JUnitCore.run(JUnitCore.java:160)
> >org.junit.runner.JUnitCore.run(JUnitCore.java:138)
> >org.apache.maven.surefire.junitcore.JUnitCoreWrapper.createRequestAndRun(
> JUnitCoreWrapper.java:113)
> >org.apache.maven.surefire.junitcore.JUnitCoreWrapper.
> executeEager(JUnitCoreWrapper.java:85)
> >org.apache.maven.surefire.junitcore.JUnitCoreWrapper.
> execute(JUnitCoreWrapper.java:54)
> >org.apache.maven.surefire.junitcore.JUnitCoreProvider.
> invoke(JUnitCoreProvider.java:134)
> >org.apache.maven.surefire.booter.ForkedBooter.
> invokeProviderInSameClassLoader(ForkedBooter.java:200)
> >org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(
> ForkedBooter.java:153)
> >org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
>   AppendMatrixTest.testMapAppendOutBlock2SparseSP
> :115->commonAppendTest:259->AutomatedTestBase.runTest:1258 failed to run
> script ./src/test/scripts/functions/append/AppendMatrixTest.dml
> exception: org.apache.sysml.api.DMLException: No MLContext object is
> currently active. Have you created one? Hint: in Scala, 'val ml = new
> MLContext(sc)'
> message: No MLContext object is currently active. Have you created one?
> Hint: in Scala, 'val ml = new MLContext(sc)'
> stack trace:
> >org.apache.sysml.api.DMLScript.executeScript(DMLScript.java:378)
> >org.apache.sysml.api.DMLScript.main(DMLScript.java:199)
> >org.apache.sysml.test.integration.AutomatedTestBase.
> runTest(AutomatedTestBase.java:1238)
> >org.apache.sysml.test.integration.functions.append.AppendMatrixTest.
> commonAppendTest(AppendMatrixTest.java:259)
> >org.apache.sysml.test.integration.functions.append.AppendMatrixTest.
> testMapAppendOutBlock2SparseSP(AppendMatrixTest.java:115)
> >sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >sun.reflect.NativeMethodAccessorImpl.invoke(
> NativeMethodAccessorImpl.java:62)
> >sun.reflect.DelegatingMethodAccessorImpl.invoke(
> DelegatingMethodAccessorImpl.java:43)
> >java.lang.reflect.Method.invoke(Method.java:497)
> >org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(
> FrameworkMethod.java:47)
> >org.junit.internal.runners.model.ReflectiveCallable.run(
> ReflectiveCallable.java:12)
> >org.junit.runners.model.FrameworkMethod.invokeExplosively(
> FrameworkMethod.java:44)
> >org.junit.internal.runners.statements.InvokeMethod.
> evaluate(InvokeMethod.java:17)
> >org.junit.internal.runners.statements.RunBefores.
> evaluate(RunBefores.java:26)
> >org.junit.internal.runners.statements.RunAfters.evaluate(
> RunAfters.java:27)
> >org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
> >org.junit.runners.BlockJUnit4ClassRunner.runChild(
> BlockJUnit4ClassRunner.java:70)
> >org.junit.runners.BlockJUnit4ClassRunner.runChild(
> BlockJUnit4ClassRunner.java:50)
> >org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
> >org.apache.maven.surefire.junitcore.pc.Scheduler$1.run(
> Scheduler.java:318)
> >org.apache.maven.surefire.junitcore.pc.InvokerStrategy.
> schedule(InvokerStrategy.java:41)
> >org.apache.maven.surefire.junitcore.pc.Scheduler.
> schedule(Scheduler.java:274)
> >org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
> >org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
> >org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
> >org.junit.runners.ParentRunner.run(ParentRunner.java:309)
> >org.junit.runners.Suite.runC

Re: [DISCUSS] Apache SystemML Release 1.0.0

2016-08-25 Thread Deron Eriksson
+1

Deron

On Thu, Aug 25, 2016 at 4:56 PM, Luciano Resende <luckbr1...@gmail.com>
wrote:

> On Thu, Aug 25, 2016 at 4:01 PM, Deron Eriksson <deroneriks...@gmail.com>
> wrote:
>
> > Luciano,
> >
> > Yes, I like the idea of the next release being SystemML 1.0. Given the
> > significance of the version, it would be a good idea to not rush the
> > release so that we can make this a truly great release.
> >
> > Deron
> >
> >
> >
> I am not suggesting that we rush a release, just that we would call it 1.0
> rather than 0.11. The time frame is based on when we are ready.
>
> --
> Luciano Resende
> http://twitter.com/lresende1975
> http://lresende.blogspot.com/
>


Re: [DISCUSS] Apache SystemML Release 1.0.0

2016-08-25 Thread Deron Eriksson
Luciano,

Yes, I like the idea of the next release being SystemML 1.0. Given the
significance of the version, it would be a good idea to not rush the
release so that we can make this a truly great release.

Deron


On Thu, Aug 25, 2016 at 3:45 PM, Luciano Resende 
wrote:

> On Thu, Aug 25, 2016 at 10:59 AM, Matthias Boehm 
> wrote:
>
> > I'm still not fully convinced that we need to drop Spark 1.x support,
> > instead of supporting both 1.x and 2.x. I would appreciate if we could
> > first conclude the discussion around migrating to Spark 2.0.
> >
> > Furthermore, I think that creating a dependency to Spark versioning would
> > unnecessarily complicate our own release process. I would rather use major
> > releases as an opportunity to cleanup APIs and drop certain language
> > features. And this is unlikely to coincide with Spark's releases. From my
> > perspective it would be even more confusing for a user to release a major
> > version for a relatively minor change as support for a new Spark version.
> >
> > Regards,
> > Matthias
> >
> >
> I will leave the discussion about Spark 1.x and 2.x for the original
> thread.
>
> Trying to bring the topic back to the original subject, are we in consensus
> that we could call the next release SystemML 1.0?
>
>
>
>
> --
> Luciano Resende
> http://twitter.com/lresende1975
> http://lresende.blogspot.com/
>


Re: [DISCUSS] SystemML with Spark 2.0 support and roadmap

2016-08-23 Thread Deron Eriksson
To simplify release candidate validation, I would like to propose that the
distribution profile only builds the following 7 (out of the current
included 10) artifacts:

systemml-0.11.0-incubating-SNAPSHOT-javadoc.jar
systemml-0.11.0-incubating-SNAPSHOT-sources.jar
systemml-0.11.0-incubating-SNAPSHOT-src.tar.gz
systemml-0.11.0-incubating-SNAPSHOT-src.zip
systemml-0.11.0-incubating-SNAPSHOT-standalone.tar.gz (rename w/o
"-standalone")
systemml-0.11.0-incubating-SNAPSHOT-standalone.zip (rename w/o
"-standalone")
systemml-0.11.0-incubating-SNAPSHOT.jar

The following could still be built using maven profiles but would not be in
the distribution profile:

systemml-0.11.0-incubating-SNAPSHOT-standalone.jar
systemml-0.11.0-incubating-SNAPSHOT.tar.gz (also rename)
systemml-0.11.0-incubating-SNAPSHOT.zip (also rename)

This would decrease the number of our artifacts by 30% which means that we
can validate the release faster, and the release candidate will also be
more likely to pass external validation/voting.

Deron


On Thu, Aug 18, 2016 at 12:05 AM, Berthold Reinwald <reinw...@us.ibm.com>
wrote:

> This makes sense. Couple of comments.
>
> - wrt SystemML on Spark 1.x, SystemML 0.11 target date should be the
> latest in Oct. Ideally it should be earlier - maybe September - depending
> on community demand as it will contain bug fixes for features introduced
> in 0.10. The master branch will stay on Spark 1.x til then.  New features
> can be integrated but as they are still partial, they will be marked as
> experimental.
>
> - wrt SystemML on Spark 2.0, as the timeline is fairly short, instead of
> bi-monthly I'd suggest more frequent sync with the branch on Spark 2.0
>
> - Come early October, we should switch the master to Spark 2.0, and
> immediately create a release for 2.0.
>
> - wrt Roadmap 0.11 items, except for Frame support, MLContext, and maybe
> the DL lib, all the other features should be marked experimental.
>
> Regards,
> Berthold Reinwald
> IBM Almaden Research Center
> office: (408) 927 2208; T/L: 457 2208
> e-mail: reinw...@us.ibm.com
>
>
>
> From:   dusenberr...@gmail.com
> To: dev@systemml.incubator.apache.org
> Date:   08/17/2016 04:52 PM
> Subject:Re: [DISCUSS] SystemML with Spark 2.0 support and roadmap
>
>
>
> +1
>
> This sounds like a good plan to allow us to continue supporting the Spark
> 1.x line in the short term, with a plan for moving to Spark 2.x support
> soon.
>
> -Mike
>
> --
>
> Mike Dusenberry
> GitHub: github.com/dusenberrymw
> LinkedIn: linkedin.com/in/mikedusenberry
>
> Sent from my iPhone.
>
>
> > On Aug 17, 2016, at 2:59 PM, Deron Eriksson <deroneriks...@gmail.com>
> wrote:
> >
> > +1
> > Continuing to support Spark 1.4/1.6 for now while setting a cutover date
> > for 2.0 sounds like a great idea. This allows for the creation of a really
> > solid release for 1.x, which greatly benefits SystemML users using Spark
> > 1.x. It also gives these users a general date that they can use to plan
> > migration to Spark 2.0 when that becomes the SystemML standard so that they
> > can benefit from the latest improvements to the project.
> >
> > Deron
> >
> >
> >> On Wed, Aug 17, 2016 at 2:32 PM, Acs S <ac...@yahoo.com.invalid> wrote:
> >>
> >> Seems, mail is not retaining format. I am attaching same text through PDF
> >> file.
> >> If there is any other better option please let me know.
> >>
> >>
> >> -Arvind
> >>
> >>
> >>
> >>
> >>
> >> - Forwarded Message -
> >> *From:* Acs S <ac...@yahoo.com.INVALID>
> >> *To:* Dev <dev@systemml.incubator.apache.org>
> >> *Sent:* Wednesday, August 17, 2016 2:18 PM
> >> *Subject:* [DISCUSS] SystemML with Spark 2.0 support and roadmap
> >>
> >>
> >> Spark 2.0 has released, we need to support SystemML on Spark 2.0 to be
> >> uptodate with latest version of Spark. This brings us a challenge to
> >> support our consumers until they move to Spark 2.0.Based on some
> >> brainstorming, I can propose following options to keep SystemML being
> >> supported on latest Spark version quickly.
> >>
> >> Supporting SystemML on Spark 1.x We can continue to support SystemML on
> >> Spark 1.x code base for short period of time by adding fixes and
> features
> >> on main branch.  We will release SystemML with support to Spark 1.x
> next
> >> version (0.11) around beginning of Oct 2016 (Lets target for Oct 1st
> 2016)
> >> Supporting SystemML on Spark 2.0 (Preview co

Re: Fw: [DISCUSS] SystemML with Spark 2.0 support and roadmap

2016-08-17 Thread Deron Eriksson
+1
Continuing to support Spark 1.4/1.6 for now while setting a cutover date
for 2.0 sounds like a great idea. This allows for the creation of a really
solid release for 1.x, which greatly benefits SystemML users using Spark
1.x. It also gives these users a general date that they can use to plan
migration to Spark 2.0 when that becomes the SystemML standard so that they
can benefit from the latest improvements to the project.

Deron


On Wed, Aug 17, 2016 at 2:32 PM, Acs S  wrote:

> Seems, mail is not retaining format. I am attaching same text through PDF
> file.
> If there is any other better option please let me know.
>
>
> -Arvind
>
>
>
>
>
> - Forwarded Message -
> *From:* Acs S 
> *To:* Dev 
> *Sent:* Wednesday, August 17, 2016 2:18 PM
> *Subject:* [DISCUSS] SystemML with Spark 2.0 support and roadmap
>
>
> Spark 2.0 has been released, and we need to support SystemML on Spark 2.0
> to stay up to date with the latest version of Spark. This brings us the
> challenge of supporting our consumers until they move to Spark 2.0. Based on
> some brainstorming, I can propose the following options to get SystemML
> supported on the latest Spark version quickly.
>
> Supporting SystemML on Spark 1.x
> We can continue to support SystemML on the Spark 1.x code base for a short
> period of time by adding fixes and features on the main branch. We will
> release the next version of SystemML (0.11) with support for Spark 1.x
> around the beginning of Oct 2016 (let's target Oct 1st 2016).
>
> Supporting SystemML on Spark 2.0 (Preview code)
> For exploiters of Spark 2.0, we can make SystemML on Spark 2.0 available
> immediately, based on a branch created on top of the latest master branch
> code. Glenn has some prototype code to transform SystemML code to be
> compatible with Spark 2.0; he can merge his code into the new branch
> targeted to support SystemML on Spark 2.0. This would be "Preview" version
> code, and we can update it on a frequent basis (on a bi-monthly basis).
>
> Supporting SystemML on Spark 2.0
> We will have full support of SystemML on Spark 2.0 before the end of 2016.
> We will formalize the release date by the end of Sept 2016. At the same time
> we will discuss whether we can move support of SystemML on Spark 1.x to
> maintenance mode (only required bug fixes will be merged from the main
> branch) or whether we need to support both SystemML on Spark 2.0 and Spark
> 1.x for some additional time.
>
> SystemML Roadmap
>
> 0.11 (on Spark 1.x) (Targeted to Oct 1st 2016)
> - Deep Learning (Library of Network layers?)
> - Frame
> - New MLContext API
> - Python DSL integration (Preview)
> - Compressed Linear Algebra (Preview)
> - Hydra R integration
> - New Algorithms (?)
>
> 0.12 (Spark 2.0) (Targeted to 4Q 2016)
> - GPU support (Local mode/Distributed mode?)
> - New Algorithms (?)
>
> Please feel free to comment on support and roadmap points.
>
>
> -Arvind
>
>


Re: Calling System ML from sparkR

2016-08-17 Thread Deron Eriksson
Hi Sourav,

Great question. Work is currently being performed by Alok Singh (see
https://issues.apache.org/jira/browse/SYSTEMML-860) regarding this topic.

Deron


On Mon, Aug 15, 2016 at 9:31 AM, Sourav Mazumder <
sourav.mazumde...@gmail.com> wrote:

> Hi,
>
> Is there any work going on to call System ML dml scripts from SparkR using
> R syntax?
>
> I understand it was possible using BigR (available in IBM Big Insights).
>
> Wondering whether something similar can be achieved from Spark R.
>
> Regards,
> Sourav
>


Re: [DISCUSS] Migration to Spark 2.0.0

2016-08-16 Thread Deron Eriksson
Hi Glenn,

I am fine with this approach. If this approach is taken, I would like to
set the documentation version in _config.yml to 0.10.x before the project
is tagged (I recently set it to 0.11).

Deron


On Thu, Aug 11, 2016 at 3:40 PM, Glenn Weidner <gweid...@us.ibm.com> wrote:

> I would like to propose an alternative to supporting Spark 2.0 and Spark
> 1.x within single stream.
>
> 1) Capture snapshot and establish label of current Apache SystemML master
> which includes new features added since 0.10.0 release.
>
> 2) After step 1 completed, enable master to move forward with support for
> Spark 2.x only.
>
> This is similar to what Fred initially proposed except step 1 would not
> involve a separate release. The 0.11 release of Apache SystemML would be
> compatible for Spark 2.0 and Scala 2.11.
>
> Thanks,
> Glenn
>
>
> From: Glenn Weidner/Silicon Valley/IBM@IBMUS
> To: dev@systemml.incubator.apache.org
> Date: 08/08/2016 03:33 PM
> Subject: Re: [DISCUSS] Migration to Spark 2.0.0
> --
>
>
>
> As a preliminary experiment in attempt to compile against both Spark 2.0.0
> and Spark 1.6.2 from same code base, I made another set of changes for
> comparison against previous proposed changes for [SYSTEMML-776].
> This experimental set can be viewed here:
>
> https://github.com/gweidner/incubator-systemml/commit/0611f0c197e4a0e816b3325093168bc5162d62c0
>
> This compiles against Spark 2.0.0 and Spark 1.6.2 except for fit/transform
> overrides in LogisticRegression.scala due to:
> SPARK-14500 Accept Dataset[] instead of DataFrame in MLlib APIs
>
> Detailed code comments and suggestions to try out can be made in the
> branch commit instead of this mail thread.
>
> Thanks,
> Glenn
>
>
> From: Deron Eriksson <deroneriks...@gmail.com>
> To: dev@systemml.incubator.apache.org
> Date: 08/05/2016 02:02 PM
> Subject: Re: [DISCUSS] Migration to Spark 2.0.0
> --
>
>
>
> I am open to the idea of supporting Spark 2 and Spark<2 concurrently if
> someone shows that it can be accomplished with minimal inconvenience.
>
> However, I would lean towards Fred's approach (Spark 1.6 release followed
> shortly by a Spark 2 release). If possible, I want to be able to focus most
> of our efforts towards the future rather than the past.
>
> Deron
>
>
> On Thu, Aug 4, 2016 at 10:59 AM, Luciano Resende <luckbr1...@gmail.com>
> wrote:
>
> > That was going to be my suggestion... In Zeppelin, we just introduced
> > support for different versions of Scala and added support for Spark 2.0
> > based on profiles and a bit of reflection...
> >
> > Do we have to do anything related to Scala versions as well?
> >
> > On Thursday, August 4, 2016, Matthias Boehm <mbo...@us.ibm.com> wrote:
> >
> > > > I would recommend starting an investigation into whether we could
> > > > support both the 1.x and 2.x lines with a single code base. It seems
> > > > feasible to refactor the code a bit, compile against 2.0 (or with
> > > > profiles), and run on either 1.6 or 2.0. For example, by creating a
> > > > wrapper that implements both Iterable and Iterator, we could overcome
> > > > the Iterator API change, as shown by our LazyIterableIterator, which
> > > > did not require any change in related functions. Btw, we did the same
> > > > for MRv1 and Yarn by ensuring that on MRv1, we don't touch Yarn
> > > > related APIs. Similarly on Spark, we already support both legacy and
> > > > >=1.6 memory management. I think this kind of platform independence
> > > > is really valuable but it obviously adds complexity.
> > >
> > > Regards,
> > > Matthias
> > >
> > >
> > > [image: Inactive hide details for Niketan Pansare---08/03/2016 05:15:21
> > > PM---I am in favor of having one more release against Spark 1.6]Niketan
> > > Pansare---08/03/2016 05:15:21 PM---I am in favor of having one more
> > release
> > > against Spark 1.6. Since default scala version for Spark 1.
> > >
> > > From: Niketan Pans

Draft for August monthly report

2016-08-02 Thread Deron Eriksson
Here is a draft I created for the August monthly report. Feedback welcome.

Deron

-
SystemML

SystemML provides declarative large-scale machine learning (ML) that aims at
flexible specification of ML algorithms and automatic generation of hybrid
runtime plans ranging from single node, in-memory computations, to
distributed computations running on Apache Hadoop MapReduce and Apache
Spark.

SystemML has been incubating since 2015-11-02.

Three most important issues to address in the move towards graduation:

  - Grow SystemML community: increase mailing list activity,
    increase adoption of SystemML for scalable machine learning, encourage
    data scientists to adopt DML and PyDML algorithm scripts, respond to
    user feedback to ensure SystemML meets the requirements of real-world
    situations, write papers, and present talks about SystemML.
  - Continue to produce releases.
  - Increase the diversity of our project's contributors and committers.

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
aware of?

  NONE.

How has the community developed since the last report?

  Our mailing list from May through July had 237 messages including a wide
  range of topics. We have gained 6 new contributors since May 1st. On
  GitHub, the project has been starred 365 times and forked 129 times. Fred
  Reiss spoke at Spark Summit on June 7 about building custom machine
  learning algorithms with SystemML. Mike Dusenberry presented about
  SystemML on May 19 at Datapalooza Denver.

How has the project developed since the last report?

  We produced our second Apache release, version 0.10.0-incubating. The
  project has had 205 commits since May 1. 187 issues have been reported on
  our JIRA site and 127 issues have been resolved. 70 pull requests have
  been created since May 1, and 63 of these have been closed.

Date of last release:

  2016-06-15 (version 0.10.0-incubating)

When were the last committers or PMC members elected?

  2016-05-07 Glenn Weidner
  2016-05-07 Faraz Makari Manshadi


Re: [DISCUSS] Migration to Spark 2.0.0

2016-08-02 Thread Deron Eriksson
I would definitely be in favor of moving to Spark 2.0 as early as possible.
This will allow SystemML to be current with cutting edge Spark. It would be
nice to focus our efforts on the latest Spark.

Deron


On Tue, Aug 2, 2016 at 12:05 PM,  wrote:

> I'm in favor of moving to Spark 2.0 now, meaning that our upcoming release
> would include both new features and 2.0 support.  0.10 has plenty of
> functionality for any existing 1.x users.
>
> -Mike
>
> --
>
> Mike Dusenberry
> GitHub: github.com/dusenberrymw
> LinkedIn: linkedin.com/in/mikedusenberry
>
> Sent from my iPhone.
>
>
> > On Aug 2, 2016, at 11:44 AM, Glenn Weidner  wrote:
> >
> >
> >
> > In the "[DISCUSS] SystemML 0.11 release" thread, native frame support and
> > API updates such as the new MLContext were identified as the main new
> > features for the release.  In addition, support for Spark 2.0.0 was
> > targeted. Note that the code changes required for Spark 2.0.0 are not
> > backward compatible with earlier Spark versions (e.g., 1.6.2), so I am
> > starting a separate mail thread for anyone to raise
> > objections/alternatives for migrating to Spark 2.0.0.
> >
> > One possible option is to do a release to include the new Apache SystemML
> > features before migrating to Spark 2.0.0.  However, it seems better to
> > have the next Apache SystemML release compatible with the latest Spark
> > version, 2.0.0.  The Apache SystemML 0.10 release from June can be used
> > with earlier versions of Spark.
> >
> > Regards,
> > Glenn
>


Re: print a value in a frame?

2016-07-05 Thread Deron Eriksson
Hi Matthias,

Perhaps I'm doing something wrong? If I try the following:

M = read($Min, data_type='frame', format='csv'); print(as.scalar(M[1,1]));

where the csv file is "one,two\nthree,four", I am seeing:

Caused by: java.lang.ClassCastException:
org.apache.sysml.runtime.controlprogram.caching.FrameObject cannot be cast
to org.apache.sysml.runtime.controlprogram.caching.MatrixObject
at
org.apache.sysml.hops.recompile.LiteralReplacement.replaceLiteralValueTypeCastRightIndexing(LiteralReplacement.java:306)


Deron
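
In case it helps anyone reading along, the general failure mode in the trace above (an object of one subtype being downcast to a sibling subtype) can be reproduced in a few lines of plain Java. The classes below are stand-ins I made up for illustration; they are not the real org.apache.sysml classes, and I am not claiming this is what LiteralReplacement does internally.

```java
// Hypothetical stand-ins; these are NOT the real org.apache.sysml classes.
abstract class DataStandIn { }
final class MatrixStandIn extends DataStandIn { }
final class FrameStandIn extends DataStandIn { }

final class CastSketch {
    // Unchecked downcast: throws ClassCastException when handed a frame.
    static MatrixStandIn assumeMatrix(DataStandIn d) {
        return (MatrixStandIn) d;
    }

    // Guarded variant: inspect the runtime type before casting.
    static String describe(DataStandIn d) {
        if (d instanceof MatrixStandIn) return "matrix";
        if (d instanceof FrameStandIn) return "frame";
        return "unknown";
    }

    public static void main(String[] args) {
        DataStandIn frame = new FrameStandIn();
        System.out.println(describe(frame));
        try {
            assumeMatrix(frame);
        } catch (ClassCastException e) {
            System.out.println("cast failed: " + e.getClass().getSimpleName());
        }
    }
}
```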



On Sun, Jul 3, 2016 at 1:11 PM, Matthias Boehm <mbo...@us.ibm.com> wrote:

> quick correction: I meant to say, option 2 because you have a frame of
> strings (option 3 is only possible if you have numeric/boolean data). Btw,
> it's fixed now - so please go ahead and give it a try. Thanks.
>
>
> Regards,
> Matthias
>
>
> From: Deron Eriksson <deroneriks...@gmail.com>
> To: dev@systemml.incubator.apache.org
> Date: 06/29/2016 01:40 PM
> Subject: Re: print a value in a frame?
> --
>
>
>
> Thanks for the quick reply. I'll use the toString() for now (for a unit
> test).
>
> Deron
>
> On Wed, Jun 29, 2016 at 1:28 PM, Matthias Boehm <mbo...@us.ibm.com> wrote:
>
> > option 3 is possible but probably needs a fix. Alternatively, you can use
> > print(toString(M)) which is implemented similar to the matrix toString().
> >
> > Regards,
> > Matthias
> >
> >
> > From: Deron Eriksson <deroneriks...@gmail.com>
> > To: dev@systemml.incubator.apache.org
> > Date: 06/29/2016 01:23 PM
> > Subject: print a value in a frame?
> > --
>
> >
> >
> >
> > How do I print a value in a frame?
> >
> > Suppose I have the following 2x2 csv file:
> > one,two
> > three,four
> >
> > I read it in with:
> > M = read($Min, data_type='frame', format='csv');
> >
> > (1)
> > If I try:
> >   print(M[1,1]);
> > I get:
> >   ERROR: null -- line 1, column 0 -- print statement can only print scalars
> >
> > (2)
> > If I try:
> >   print(as.scalar(M[1,1]));
> > I get:
> >   ERROR: null -- line 2, column 7 -- Expecting matrix parameter for
> > function CAST_AS_SCALAR
> >
> > (3)
> > If I try:
> >   print(as.scalar(as.matrix(M[1,1])));
> > I get:
> >   file:/.../src/test/scripts/org/apache/sysml/api/mlcontext/one-two-three-four.csv
> >   not a SequenceFile
> >
> > Thanks,
> > Deron
> >
> >
> >
>
>
>


Re: Contribute to SYSTEMML

2016-07-05 Thread Deron Eriksson
Hello Garvit,

You might also find "Contributing to SystemML" to be useful:
http://apache.github.io/incubator-systemml/contributing-to-systemml.html

Deron


On Tue, Jul 5, 2016 at 2:26 PM, Nakul Jindal  wrote:

> Hi Garvit,
>
> A good place to get started is to look at the SystemML JIRA site:
> https://issues.apache.org/jira/browse/SYSTEMML
>
> In addition to the documentation at:
> https://apache.github.io/incubator-systemml
>
> there is documentation in the docs/ directory:
> https://github.com/apache/incubator-systemml/tree/master/docs
>
>
> If you haven’t already done so, I’d encourage you to run SystemML in
> standalone mode.
> To run a simple matrix multiply, put this code in a file:
>
> X = rand(rows=20, cols=100, min=0, max=4, pdf="uniform", sparsity=0.3)
> Y = rand(rows=100, cols=20, min=0, max=4, pdf="uniform", sparsity=0.3)
> S = X %*% Y
> print (toString(S))
>
> Call the file “matmul.dml” and run it in standalone mode. Play with
> the number of rows, columns, sparsity, min, max.
> You can look up the various built-in functions in the language
> reference:
> https://apache.github.io/incubator-systemml/dml-language-reference.html
>
>
> Then set up your dev environment (Eclipse or IntelliJ):
> https://apache.github.io/incubator-systemml/developer-tools-systemml.html
>
> Also try setting up Spark locally on your machine and running SystemML in
> Spark mode:
> http://spark.apache.org/docs/latest/spark-standalone.html
>
>
> You can get editor support in Vim and Atom:
> Vim : https://github.com/nakul02/vim-dml
> Atom : https://atom.io/packages/language-dml
> (Installing in Atom:
> http://flight-manual.atom.io/using-atom/sections/atom-packages/)
>
>
> -Nakul
>
>
>
> > On Jul 3, 2016, at 7:35 AM, Garvit Bansal 
> wrote:
> >
> > Hi All,
> >
> > My name is Garvit and I would like to contribute to SystemML. I find this
> > project interesting and want to contribute to it. Can somebody help me
> > get started, for example with bug fixes or something similar that would
> > help me get into the flow?
> >
> > Garvit
> >
> > --
> > *Garvit Bansal,*
> > Software Developer at Flipkart
> > B.Tech in Computer Science and Engineering
> > The LNM Institute of Information Technology
> > Jaipur, Rajasthan-302001
> > *Mobile: *9886384276
> > *Website: *www.garvitbansal.xyz
> >
> > 
>
>


Re: print a value in a frame?

2016-06-29 Thread Deron Eriksson
Thanks for the quick reply. I'll use the toString() for now (for a unit
test).

Deron

On Wed, Jun 29, 2016 at 1:28 PM, Matthias Boehm <mbo...@us.ibm.com> wrote:

> option 3 is possible but probably needs a fix. Alternatively, you can use
> print(toString(M)) which is implemented similar to the matrix toString().
>
> Regards,
> Matthias
>
>
> From: Deron Eriksson <deroneriks...@gmail.com>
> To: dev@systemml.incubator.apache.org
> Date: 06/29/2016 01:23 PM
> Subject: print a value in a frame?
> --
>
>
>
> How do I print a value in a frame?
>
> Suppose I have the following 2x2 csv file:
> one,two
> three,four
>
> I read it in with:
> M = read($Min, data_type='frame', format='csv');
>
> (1)
> If I try:
>   print(M[1,1]);
> I get:
>   ERROR: null -- line 1, column 0 -- print statement can only print scalars
>
> (2)
> If I try:
>   print(as.scalar(M[1,1]));
> I get:
>   ERROR: null -- line 2, column 7 -- Expecting matrix parameter for
> function CAST_AS_SCALAR
>
> (3)
> If I try:
>   print(as.scalar(as.matrix(M[1,1])));
> I get:
>   file:/.../src/test/scripts/org/apache/sysml/api/mlcontext/one-two-three-four.csv
>   not a SequenceFile
>
> Thanks,
> Deron
>
>
>


print a value in a frame?

2016-06-29 Thread Deron Eriksson
How do I print a value in a frame?

Suppose I have the following 2x2 csv file:
one,two
three,four

I read it in with:
M = read($Min, data_type='frame', format='csv');

(1)
If I try:
   print(M[1,1]);
I get:
   ERROR: null -- line 1, column 0 -- print statement can only print scalars

(2)
If I try:
   print(as.scalar(M[1,1]));
I get:
   ERROR: null -- line 2, column 7 -- Expecting matrix parameter for
function CAST_AS_SCALAR

(3)
If I try:
   print(as.scalar(as.matrix(M[1,1])));
I get:
   file:/.../src/test/scripts/org/apache/sysml/api/mlcontext/one-two-three-four.csv
   not a SequenceFile

Thanks,
Deron


Re: Build failed in Jenkins: SystemML-DailyTest #338

2016-06-24 Thread Deron Eriksson
It might be nice to create a new Jenkins job that we can manually modify
and start to test specific test failures like this in the Jenkins
environment. We could go into 'configure' and set the test to execute (mvn
clean test -Dtest=MyTest, or something like that).

Deron
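
A side note on reading the stack traces quoted below: when work is run through an executor, the exception thrown inside the task surfaces from Future.get() wrapped in an ExecutionException, so the interesting part is the "Caused by" section (here an IndexOutOfBoundsException from ArrayList.get). The following is a minimal, self-contained sketch of that behaviour. The class and method names are mine and purely illustrative; it does not reproduce the actual FrameReaderTextCSVParallel logic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative only; not the SystemML reader code.
final class ParallelReadFailureSketch {
    // Runs one task on a small pool and returns the class name of the
    // underlying cause if the task fails, or "ok" if it succeeds.
    static String rootCauseOf(Callable<Object> task) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Future<Object> future = pool.submit(task);
            future.get(); // rethrows task failures wrapped in ExecutionException
            return "ok";
        } catch (ExecutionException e) {
            return e.getCause().getClass().getName();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> columns = new ArrayList<>();
        columns.add("only-one-column");
        // Asking for index 1 of a one-element list fails inside the task.
        System.out.println(rootCauseOf(() -> columns.get(1)));
    }
}
```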


On Fri, Jun 24, 2016 at 12:00 PM, Glenn Weidner  wrote:

> No problem - glad to help. There may be a related item that I have not been
> able to resolve: 2 test failures were observed after on-demand build 181
> completed:
>
> *09:56:22* Running
> org.apache.sysml.test.integration.functions.transform.TransformFrameEncodeDecodeTest
> *09:56:23* Tests run: 4, Failures: 0, Errors: 2, Skipped: 0, Time
> elapsed: 21.572 sec <<< FAILURE! - in
> org.apache.sysml.test.integration.functions.transform.TransformFrameEncodeDecodeTest
> *09:56:23* 
> testHomesDummycodeSparkCSV(org.apache.sysml.test.integration.functions.transform.TransformFrameEncodeDecodeTest)
>  Time elapsed: 11.657 sec  <<< ERROR!
> *09:56:23* java.lang.RuntimeException: java.io.IOException: Failed
> parallel read of text csv input.
> *09:56:23*  at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> *09:56:23*  at java.util.concurrent.FutureTask.get(FutureTask.java:192)
> *09:56:23*  at
> org.apache.sysml.runtime.io.FrameReaderTextCSVParallel.readCSVFrameFromHDFS(FrameReaderTextCSVParallel.java:108)
> *09:56:23*  at
> org.apache.sysml.runtime.io.FrameReaderTextCSV.readFrameFromHDFS(FrameReaderTextCSV.java:96)
> *09:56:23*  at
> org.apache.sysml.runtime.io.FrameReader.readFrameFromHDFS(FrameReader.java:83)
> *09:56:23*  at
> org.apache.sysml.test.integration.functions.transform.TransformFrameEncodeDecodeTest.runTransformTest(TransformFrameEncodeDecodeTest.java:132)
> *09:56:23*  at
> org.apache.sysml.test.integration.functions.transform.TransformFrameEncodeDecodeTest.testHomesDummycodeSparkCSV(TransformFrameEncodeDecodeTest.java:79)
> *09:56:23* Caused by: java.lang.IndexOutOfBoundsException: Index: 1,
> Size: 1
> *09:56:23*  at java.util.ArrayList.rangeCheck(ArrayList.java:653)
> *09:56:23*  at java.util.ArrayList.get(ArrayList.java:429)
> *09:56:23*  at
> org.apache.sysml.runtime.io.FrameReaderTextCSV.readCSVFrameFromInputSplit(FrameReaderTextCSV.java:185)
> *09:56:23*  at
> org.apache.sysml.runtime.io.FrameReaderTextCSVParallel$ReadRowsTask.call(FrameReaderTextCSVParallel.java:226)
> *09:56:23*  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> *09:56:23*  at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> *09:56:23*  at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> *09:56:23*  at java.lang.Thread.run(Thread.java:745)
> *09:56:23*
> *09:56:23* 
> testHomesRecodeSparkCSV(org.apache.sysml.test.integration.functions.transform.TransformFrameEncodeDecodeTest)
>  Time elapsed: 4.671 sec  <<< ERROR!
> *09:56:23* java.lang.RuntimeException: java.io.IOException: Failed
> parallel read of text csv input.
> *09:56:23*  at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> *09:56:23*  at java.util.concurrent.FutureTask.get(FutureTask.java:192)
> *09:56:23*  at
> org.apache.sysml.runtime.io.FrameReaderTextCSVParallel.readCSVFrameFromHDFS(FrameReaderTextCSVParallel.java:108)
> *09:56:23*  at
> org.apache.sysml.runtime.io.FrameReaderTextCSV.readFrameFromHDFS(FrameReaderTextCSV.java:96)
> *09:56:23*  at
> org.apache.sysml.runtime.io.FrameReader.readFrameFromHDFS(FrameReader.java:83)
> *09:56:23*  at
> org.apache.sysml.test.integration.functions.transform.TransformFrameEncodeDecodeTest.runTransformTest(TransformFrameEncodeDecodeTest.java:132)
> *09:56:23*  at
> org.apache.sysml.test.integration.functions.transform.TransformFrameEncodeDecodeTest.testHomesRecodeSparkCSV(TransformFrameEncodeDecodeTest.java:69)
> *09:56:23* Caused by: java.lang.IndexOutOfBoundsException: Index: 1,
> Size: 1
> *09:56:23*  at java.util.ArrayList.rangeCheck(ArrayList.java:653)
> *09:56:23*  at java.util.ArrayList.get(ArrayList.java:429)
> *09:56:23*  at
> org.apache.sysml.runtime.io.FrameReaderTextCSV.readCSVFrameFromInputSplit(FrameReaderTextCSV.java:185)
> *09:56:23*  at
> org.apache.sysml.runtime.io.FrameReaderTextCSVParallel$ReadRowsTask.call(FrameReaderTextCSVParallel.java:226)
> *09:56:23*  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> *09:56:23*  at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> *09:56:23*  at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> *09:56:23*  at java.lang.Thread.run(Thread.java:745)
>
>
> I can't reproduce the failure, as the 150 transform tests all passed when
> run in my local development environment, including when run in parallel from
> a command shell or under Eclipse as a suite.
>
> Running
> org.apache.sysml.test.integration.functions.transform.TransformFrameEncodeDecodeTest
> Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 12.164 sec
> - in
> 

Release notes for 0.10.0

2016-06-13 Thread Deron Eriksson
Hi,

We need to put together a page of release notes for the 0.10.0 release
(similar to http://systemml.apache.org/0.9.0-incubating/release_notes.html).
Could SystemML committers and contributors please respond to this thread
with what they feel are the major updates/improvements in this release?

Thanks!
Deron


Re: [VOTE] Apache SystemML 0.10.0-incubating (RC2)

2016-06-02 Thread Deron Eriksson
Thank you Tatsuya, I created
https://issues.apache.org/jira/browse/SYSTEMML-746 for this issue.

Deron
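
To make the layout described in the Quick Start excerpt below concrete (rows are the predicted label, columns are the actual label), here is a small sketch of how such a confusion matrix could be tallied. It is an assumption-laden illustration in plain Java, not what l2-svm-predict.dml actually does; the method name and the 1-based integer labels are my own choices.

```java
// Illustrative only; not the logic of l2-svm-predict.dml.
final class ConfusionMatrixSketch {
    // counts[p - 1][a - 1] = number of rows predicted as label p whose actual label is a.
    // Labels are assumed to be 1-based integers in [1, numLabels].
    static int[][] confusion(int[] predicted, int[] actual, int numLabels) {
        if (predicted.length != actual.length) {
            throw new IllegalArgumentException("predicted and actual must have equal length");
        }
        int[][] counts = new int[numLabels][numLabels];
        for (int i = 0; i < predicted.length; i++) {
            counts[predicted[i] - 1][actual[i] - 1]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] predicted = {1, 1, 2, 2, 1};
        int[] actual    = {1, 2, 2, 1, 1};
        int[][] m = confusion(predicted, actual, 2);
        // Print one CSV row per predicted label.
        for (int[] row : m) {
            System.out.println(row[0] + "," + row[1]);
        }
    }
}
```

With two labels, entry [0][1] corresponds to t2 in the guide's notation (predicted label 1 where the actual label was 2).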


On Wed, Jun 1, 2016 at 11:56 PM, Tatsuya Nishiyama <
nishiyama.tats...@lab.ntt.co.jp> wrote:

> I forgot to mention this in my previous voting mail. The Quick Start Guide
> may need to be fixed before the official release, because
> l2-svm-predict.dml, which the document refers to, has been modified and the
> file (l2-svm-confusion.csv) generated by the script is now different.
>
> In particular, the Quick Start Guide currently reads as follows:
> ```
> ...
> The generated file l2-svm-confusion.csv should contain the following
> confusion matrix of this form:
>
> |t1 t2|
> |t3 t4|
> The model correctly predicted label 1 t1 times
> The model incorrectly predicted label 1 as opposed to label 2 t2 times
> The model incorrectly predicted label 2 as opposed to label 1 t3 times
> The model correctly predicted label 2 t4 times.
> If the confusion matrix looks like this .
>
> 107.0,38.0
> 0.0,2.0
> ...
> ```
>
> In the description above, the confusion matrix is a 2x2 matrix. However,
> l2-svm-predict.dml now generates a 3x3 matrix as the confusion matrix.
>
> -Original Message-
> From: dusenberr...@gmail.com [mailto:dusenberr...@gmail.com]
> Sent: Thursday, June 02, 2016 5:08 AM
> To: dev@systemml.incubator.apache.org
> Subject: Re: [VOTE] Apache SystemML 0.10.0-incubating (RC2)
>
> +1
>
> Tested the main JAR with a PySpark Jupyter notebook.
>
> --
>
> Mike Dusenberry
> GitHub: github.com/dusenberrymw
> LinkedIn: linkedin.com/in/mikedusenberry
>
> Sent from my iPhone.
>
>
> > On Jun 1, 2016, at 12:16 PM, Deron Eriksson <deroneriks...@gmail.com>
> wrote:
> >
> > +1, but please note following findings:
> >
> > 1. Is the *source-release.zip artifact unnecessary, since we have
> > src.tar.gz and src.zip artifacts? Also, it contains the Hadoop binaries.
> > So, it can't be used as the "source release" artifact.
> > 2. No standalone uberjar is present (I am happy with this since no one
> > to my knowledge is using it and the LICENSE/NOTICE may need updating.
> > I would like to remove this artifact forever.)
> > 3. No in-memory jar is present (I am happy with this too since this
> > artifact is not very lightweight as it was probably initially meant to
> > be.)
> >
> > Deron
> >
> >
> >
> >
> >
> > On Wed, Jun 1, 2016 at 10:01 AM, Frederick R Reiss
> > <frre...@us.ibm.com>
> > wrote:
> >
> >> +1
> >>
> >> Sent from my iPhone using IBM Verse
> >>
> >> On Jun 1, 2016, 9:31:36 AM, reinw...@us.ibm.com wrote:
> >>
> >> From: reinw...@us.ibm.com
> >> To: dev@systemml.incubator.apache.org
> >> Cc:
> >> Date: Jun 1, 2016 9:31:36 AM
> >> Subject: Re: [VOTE] Apache SystemML 0.10.0-incubating (RC2)
> >>
> >>
> >>   +1
> >>  Regards,
> >>  Berthold Reinwald
> >>  IBM Almaden Research Center
> >>  office: (408) 927 2208; T/L: 457 2208
> >>  e-mail: reinw...@us.ibm.com
> >>  From:   Shirish Tatikonda
> >>  To: dev@systemml.incubator.apache.org
> >>  Date:   06/01/2016 12:47 AM
> >>  Subject:Re: [VOTE] Apache SystemML 0.10.0-incubating (RC2)
> >>  +1
> >>>  On Jun 1, 2016 12:40 AM, "Matthias Boehm"  wrote:
> >>> +1, but if there is a third rc, let us please create a branch or cut
> >> the
> >>> release as of today to ensure no new features are leaking in.
> >>>
> >>> Regards,
> >>> Matthias
> >>>
> >>> From: Luciano Resende
> >>> To: dev@systemml.incubator.apache.org
> >>> Date: 05/31/2016 10:05 PM
> >>> Subject: [VOTE] Apache SystemML 0.10.0-incubating (RC2)
> >>> --
> >>>
> >>>
> >>>
> >>> Please vote on releasing the following candidate as Apache SystemML
> >>  version
> >>> 0.10.0-incubating !
> >>>
> >>> The vote is open for at least 72 hours and will close on Saturda

Re: Executing DMLScript in Eclipse on Windows

2016-05-27 Thread Deron Eriksson
Perfect, that was exactly what I was looking for.

Deron
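
For anyone else hitting this: a quick way to confirm that the VM arguments quoted below are actually reaching the JVM is to print the property and check for the binary. The "null\bin\winutils.exe" in the original error would be consistent with the home directory being unset and concatenated as the text "null", though I have not verified that against the Hadoop source. The sketch below is my own and uses only the JDK; the class and method names are hypothetical.

```java
import java.io.File;

// Illustrative diagnostic only; not part of SystemML or Hadoop.
final class WinutilsCheckSketch {
    // Mirrors the shape of the path in the error message: <home>\bin\winutils.exe.
    // A null home is concatenated as the text "null".
    static String winutilsPath(String hadoopHome) {
        return hadoopHome + "\\bin\\winutils.exe";
    }

    public static void main(String[] args) {
        // null unless -Dhadoop.home.dir=... was passed as a VM argument
        String home = System.getProperty("hadoop.home.dir");
        String path = winutilsPath(home);
        System.out.println("hadoop.home.dir = " + home);
        System.out.println(path + (new File(path).isFile() ? " found" : " not found"));
    }
}
```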


On Fri, May 27, 2016 at 1:08 PM, Matthias Boehm <mbo...@us.ibm.com> wrote:

> just put the following parameters into the VM arguments of your run
> configuration:
>
> -Dhadoop.home.dir=\src\test\config\hadoop_bin_windows
> -Djava.library.path=\src\test\config\hadoop_bin_windows\bin
>
> Regards,
> Matthias
>
>
> From: Deron Eriksson <deroneriks...@gmail.com>
> To: dev@systemml.incubator.apache.org
> Date: 05/27/2016 12:14 PM
> Subject: Executing DMLScript in Eclipse on Windows
> --
>
>
>
> Hi,
>
> On OS X during development, I usually run SystemML using an Eclipse debug
> configuration that uses DMLScript as the main class.
>
> I tried doing the same thing on Windows, and when I do this, I see the
> following winutils.exe exception:
> 16/05/27 12:01:54 ERROR util.Shell: Failed to locate the winutils binary in
> the hadoop binary path
> java.io.IOException: Could not locate executable null\bin\winutils.exe in
> the Hadoop binaries.
>at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:318)
>
> If I build the project and run the standalone package (which has the
> lib/hadoop/bin/ directory), I do not see this exception.
>
> Does anyone have any advice about the best way to run DMLScript in Eclipse
> on Windows without the exception? BTW, I usually use the
> maven-eclipse-plugin to create my classpath.
>
> Thanks,
> Deron
>
>
>


Re: [VOTE] Apache SystemML 0.10.0-incubating (RC1)

2016-05-26 Thread Deron Eriksson
+1

Deron


On Wed, May 25, 2016 at 4:28 PM,  wrote:

> Great, I created SYSTEMML-710 to track this. Would be good as a simple
> starter task as well.
>
> --
>
> Mike Dusenberry
> GitHub: github.com/dusenberrymw
> LinkedIn: linkedin.com/in/mikedusenberry
>
> Sent from my iPhone.
>
>
> > On May 25, 2016, at 1:16 PM, Luciano Resende 
> wrote:
> >
> >> On Wed, May 25, 2016 at 12:47 PM,  wrote:
> >>
> >> +1
> >>
> >> I ran scripts in Jupyter and Zeppelin using both the Scala and Python
> >> MLContext APIs in order to test our notebook integration.
> >>
> >> One thing to note is that our Python API file, `SystemML.py`, is not
> >> included with the main distribution. We can work around this, and it
> >> should not block this release, but we'll want to add it for the next
> >> release.
> > Please raise a jira for this.
> >
> >
> >
> > --
> > Luciano Resende
> > http://twitter.com/lresende1975
> > http://lresende.blogspot.com/
>


Re: [VOTE] Apache SystemML 0.10.0-incubating (RC1)

2016-05-25 Thread Deron Eriksson
Hi Luciano,

10 of the 11 artifacts look good to me for LICENSE and NOTICE (as far as I
can tell they reflect the artifact contents). However the standalone
uberjar (systemml-0.10.0-incubating-standalone.jar) does not have the
correct LICENSE and NOTICE (one of my commits in the last week must have
affected this, since it worked last week such as at commit e2cb4db...).

Can we remove this artifact from the release candidate, or is this a
showstopper? I have not heard of anyone using this uberjar.

Deron


On Wed, May 25, 2016 at 1:16 PM, Luciano Resende 
wrote:

> On Wed, May 25, 2016 at 12:47 PM,  wrote:
>
> > +1
> >
> > I ran scripts in Jupyter and Zeppelin using both the Scala and Python
> > MLContext APIs in order to test our notebook integration.
> >
> > One thing to note is that our Python API file, `SystemML.py`, is not
> > included with the main distribution. We can work around this, and it
> > should not block this release, but we'll want to add it for the next
> > release.
> >
> >
> Please raise a jira for this.
>
>
>
> --
> Luciano Resende
> http://twitter.com/lresende1975
> http://lresende.blogspot.com/
>


Re: [VOTE] Apache SystemML 0.10.0-incubating (RC1)

2016-05-25 Thread Deron Eriksson
I may have found an issue with the standalone (uber) jar LICENSE and
NOTICE. Investigating.

Deron


On Wed, May 25, 2016 at 12:47 PM,  wrote:

> +1
>
> I ran scripts in Jupyter and Zeppelin using both the Scala and Python
> MLContext APIs in order to test our notebook integration.
>
> One thing to note is that our Python API file, `SystemML.py`, is not
> included with the main distribution. We can work around this, and it should
> not block this release, but we'll want to add it for the next release.
>
> --
>
> Mike Dusenberry
> GitHub: github.com/dusenberrymw
> LinkedIn: linkedin.com/in/mikedusenberry
>
> Sent from my iPhone.
>
>
> > On May 24, 2016, at 12:47 AM, Matthias Boehm  wrote:
> >
> > +1
> >
> > In detail, I ran our performance testsuite on Spark 1.6.1 for data sizes
> > {80MB, 800MB, 8GB, 80GB}, sparse/dense, intercept 0/1/2, and the algorithm
> > classes binomial (Mlogreg, L2SVM, MSVM), multinomial (Mlogreg, MSVM, Naive
> > Bayes), regression (LinregCG, LinregDS, GLM poisson-log, GLM gamma-log, GLM
> > binomial-probit), clustering (Kmeans), and statistics (Univariate,
> > Bivariate). The good news is that there are no compiler/runtime issues and
> > performance is as expected.
> >
> > One thing, however, is that we quite often get exceptions like the one
> > below in the logs - these are output at log level WARN and apparently
> > caused by a race condition in the stop call on the Spark context (
> > https://issues.apache.org/jira/browse/SPARK-12967). Unfortunately, it's
> > only fixed in 2.0 and there is nothing we can do about it.
> >
> > java.lang.IllegalStateException: RpcEnv already stopped.
> > at
> org.apache.spark.rpc.netty.Dispatcher.postMessage(Dispatcher.scala:159)
> > at org.apache.spark.rpc.netty.Dispatcher.postToAll(Dispatcher.scala:109)
> >
> > Regards,
> > Matthias
> >
> >
> > Luciano Resende ---05/20/2016 10:45:17 PM---Please vote on releasing the
> following candidate as Apache SystemML version 0.10.0-incubating !
> >
> > From: Luciano Resende 
> > To: dev@systemml.incubator.apache.org
> > Date: 05/20/2016 10:45 PM
> > Subject: [VOTE] Apache SystemML 0.10.0-incubating (RC1)
> >
> >
> >
> >
> > Please vote on releasing the following candidate as Apache SystemML
> version
> > 0.10.0-incubating !
> >
> > The vote is open for at least 72 hours and will close on Saturday,
> > Wednesday 25 and passes if a majority of at least 3 +1 PMC votes are
> cast.
> >
> > [ ] +1 Release this package as Apache SystemML 0.10.0-incubating
> > [ ] -1 Do not release this package because ...
> >
> > To learn more about Apache SystemML, please see
> http://systemml.apache.org/
> >
> > The tag to be voted on is v0.10.0-incubating-rc1
> > (ddf0e0941afe5d9c2cc7c574a6983aadd98c1fc3)
> >
> >
> https://github.com/apache/incubator-systemml/tree/ddf0e0941afe5d9c2cc7c574a6983aadd98c1fc3
> >
> > The release artifacts can be found at :
> >
> >
> https://dist.apache.org/repos/dist/dev/incubator/systemml/0.10.0-incubating-rc1/
> >
> > The maven release artifacts, including signatures, digests, etc. can be
> > found at:
> >
> >
> https://repository.apache.org/content/repositories/orgapachesystemml-1005/
> >
> >
> > =
> > == Apache Incubator release policy ==
> > =
> > Please find below the guide to release management during incubation:
> > http://incubator.apache.org/guides/releasemanagement.html
> >
> > ===
> > == How can I help test this release? ==
> > ===
> > If you are a SystemML user, you can help us test this release by taking
> an
> > existing Algorithm or workload and running on this release candidate,
> then
> > reporting any regressions.
> >
> > 
> > == What justifies a -1 vote for this release? ==
> > 
> > -1 votes should only occur for significant stop-ship bugs or legal
> related
> > issues (e.g. wrong license, missing header files, etc). Minor bugs or
> > regressions should not block this release.
> >
> >
> > --
> > Luciano Resende
> > http://twitter.com/lresende1975
> > http://lresende.blogspot.com/
> >
> >
>


Re: Location for release validation checklist?

2016-05-25 Thread Deron Eriksson
Hi Luciano,

A very basic checklist has been created at
https://issues.apache.org/jira/browse/SYSTEMML-708

Thanks,
Deron


On Wed, May 25, 2016 at 11:54 AM, Luciano Resende <luckbr1...@gmail.com>
wrote:

> Great, so, once you push your document, I will update with the build
> release portion.
>
> On Wed, May 25, 2016 at 11:03 AM, Deron Eriksson <deroneriks...@gmail.com>
> wrote:
>
> > Hi,
> >
> > Checklist on a jira sounds good to me. Sorry if I misunderstood the
> > previous comment.
> >
> > Deron
> >
> >
> > On Wed, May 25, 2016 at 10:07 AM, Luciano Resende <luckbr1...@gmail.com>
> > wrote:
> >
> > > On Wed, May 25, 2016 at 9:06 AM, <dusenberr...@gmail.com> wrote:
> > >
> > > > Another possibility would be to create a new JIRA issue for each
> > release
> > > > (candidate) and track what had been tested there. If we do that, then
> > we
> > > > could include just the generic instructions in a markdown file in our
> > > repo.
> > > >
> > > >
> > > Yes, this was my suggestion, put the checklist on a jira for each
> > release.
> > >
> > >
> > > --
> > > Luciano Resende
> > > http://twitter.com/lresende1975
> > > http://lresende.blogspot.com/
> > >
> >
>
>
>
> --
> Luciano Resende
> http://twitter.com/lresende1975
> http://lresende.blogspot.com/
>


Re: missing release candidate checksums?

2016-05-25 Thread Deron Eriksson
and maybe:
? systemml-0.10.0-incubating.pom.md5

Also, the previous release had sha1 checksums. Do we need those too or is
that overkill?

Deron


On Wed, May 25, 2016 at 10:16 AM, Luciano Resende <luckbr1...@gmail.com>
wrote:

> On Tue, May 24, 2016 at 5:20 PM, Deron Eriksson <deroneriks...@gmail.com>
> wrote:
>
> > Hi,
> >
> > I noticed that not all the artifacts at
> >
> >
> https://dist.apache.org/repos/dist/dev/incubator/systemml/0.10.0-incubating-rc1/
> > have md5 checksums.
> >
> > Also, the previous release (see
> >
> >
> https://repo1.maven.org/maven2/org/apache/systemml/systemml/0.9.0-incubating/
> > )
> > featured sha1 checksums but the current release candidate doesn't have
> sha1
> > checksums.
> >
> > Deron
> >
>
>
> Are these the only ones missing ?
>
> ?   systemml-0.10.0-incubating-inmemory.jar.md5
> ?   systemml-0.10.0-incubating-javadoc.jar.md5
> ?   systemml-0.10.0-incubating-sources.jar.md5
> ?   systemml-0.10.0-incubating-standalone.jar.md5
> ?   systemml-0.10.0-incubating.jar.md5
>
>
> --
> Luciano Resende
> http://twitter.com/lresende1975
> http://lresende.blogspot.com/
>


missing release candidate checksums?

2016-05-24 Thread Deron Eriksson
Hi,

I noticed that not all the artifacts at
https://dist.apache.org/repos/dist/dev/incubator/systemml/0.10.0-incubating-rc1/
have md5 checksums.

Also, the previous release (see
https://repo1.maven.org/maven2/org/apache/systemml/systemml/0.9.0-incubating/)
featured sha1 checksums but the current release candidate doesn't have sha1
checksums.
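
One way to spot the gaps, as a minimal sketch: assuming the release candidate directory has been copied to a local folder, the hypothetical helper `list_missing_checksums` below prints the name of every checksum file that is absent. The function name and the set of skipped suffixes (`.md5`, `.sha1`, `.asc`) are assumptions for illustration, not part of any release tooling.

```shell
# Print the checksum file names that are missing for artifacts in a directory.
# $1: directory holding the artifacts; $2: checksum extension (defaults to md5).
list_missing_checksums() {
  dir="$1"
  ext="${2:-md5}"
  for f in "$dir"/*; do
    [ -f "$f" ] || continue
    # Checksum and signature files are not artifacts themselves.
    case "$f" in
      *.md5|*.sha1|*.asc) continue ;;
    esac
    [ -f "$f.$ext" ] || printf '%s\n' "${f##*/}.$ext"
  done
}
```

Calling `list_missing_checksums ./0.10.0-incubating-rc1 sha1` would then answer the sha1 question in the same way, again assuming a local copy under that path.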

Deron


Re: branch for 0.10?

2016-05-23 Thread Deron Eriksson
I wanted to do some building and testing of 0.10. Should I be doing that
against the master head?

Deron


On Mon, May 23, 2016 at 10:37 AM, Luciano Resende <luckbr1...@gmail.com>
wrote:

> Do you have anything specific to fix on the 0.10 RC? Otherwise I was going to
> create the branch when we need it.
>
>
> On Mon, May 23, 2016 at 10:31 AM, Deron Eriksson <deroneriks...@gmail.com>
> wrote:
>
> > Should there be a branch-0.10 on GitHub? The current pom.xml on master is
> > referring to the 0.11.0-SNAPSHOT.
> >
> > Deron
> >
>
>
>
> --
> Luciano Resende
> http://twitter.com/lresende1975
> http://lresende.blogspot.com/
>


Re: Fw: 'hello world' tests of the main distributions

2016-05-21 Thread Deron Eriksson
Hi Berthold,

I agree that those examples are simply a way to test the integrity of the
various build artifacts at only the most basic level. I wanted to post that
information in case it could help save anyone a few minutes when verifying
our latest release candidates.

The SystemML.jar appears in target/ as part of the build process (it is
identical to the other main jar), since it is included with that name in a
couple of the build artifacts. However, it's not included with the release
candidate files (
https://dist.apache.org/repos/dist/dev/incubator/systemml/0.10.0-incubating-rc1/),
so it can probably stay. It would be fine to delete it at the end of the
build process, if someone wants to tackle it.

Deron


On Sat, May 21, 2016 at 1:04 AM, Berthold Reinwald <reinw...@us.ibm.com>
wrote:

> resending due to 'text/html' issue.
>
>
> Regards,
> Berthold Reinwald
> IBM Almaden Research Center
> office: (408) 927 2208; T/L: 457 2208
> e-mail: reinw...@us.ibm.com
> - Forwarded by Berthold Reinwald/Almaden/IBM on 05/21/2016 12:59 AM -
>
> From: Berthold Reinwald/Almaden/IBM
> To: dev@systemml.incubator.apache.org
> Date: 05/21/2016 12:39 AM
> Subject: Re: 'hello world' tests of the main distributions
>
>
>
> Thanks, Deron.
>
> We should include this in the build process.
> This is useful to test the integrity of the jar files, but nothing beyond
> that.
> Aren't systemml-0.10.0-incubating-SNAPSHOT.jar and SystemML.jar identical?
> If so, we should drop one of these artifacts.
>
> Regards,
> Berthold Reinwald
> IBM Almaden Research Center
> office: (408) 927 2208; T/L: 457 2208
> e-mail: reinw...@us.ibm.com
>
>
> -Deron Eriksson <deroneriks...@gmail.com> wrote: -
> To: dev@systemml.incubator.apache.org
> From: Deron Eriksson <deroneriks...@gmail.com>
> Date: 05/20/2016 03:33PM
> Subject: 'hello world' tests of the main distributions
>
> Hi,
>
> We have a test suite of 5000+ tests but I don't think we have a standard
> way of testing the distribution artifacts once they are built. I just did
> some 'hello world' tests of the various distribution artifacts to be sure
> that I could run a 'hello world' DML script using the various
> distributions. In case it's helpful to others, here are the various things
> I did (on OS X).
>
> # build distribution artifacts
> mvn clean package -P distribution
>
> cd target
>
> # verify jar works
> java -cp ./lib/*:systemml-0.10.0-incubating-SNAPSHOT.jar
> org.apache.sysml.api.DMLScript -s "print('hello world');"
>
> # verify SystemML.jar works
> java -cp ./lib/*:SystemML.jar org.apache.sysml.api.DMLScript -s
> "print('hello world');"
>
> # verify standalone jar works
> java -jar systemml-0.10.0-incubating-SNAPSHOT-standalone.jar -s
> "print('hello world');"
>
> # verify src works
> tar -xvzf systemml-0.10.0-incubating-SNAPSHOT-src.tar.gz
> cd systemml-0.10.0-incubating-SNAPSHOT-src
> mvn clean package -P distribution
> cd target/
> java -cp ./lib/*:systemml-0.10.0-incubating-SNAPSHOT.jar
> org.apache.sysml.api.DMLScript -s "print('hello world');"
> java -cp ./lib/*:SystemML.jar org.apache.sysml.api.DMLScript -s
> "print('hello world');"
> java -jar systemml-0.10.0-incubating-SNAPSHOT-standalone.jar -s
> "print('hello world');"
> cd ..
> cd ..
>
> # verify in-memory jar works
> echo "import org.apache.sysml.api.jmlc.*;public class JMLCEx {public
> static
> void main(String[] args) throws Exception {Connection conn = new
> Connection();PreparedScript script = conn.prepareScript(\"print('hello
> world');\", new String[]{}, new String[]{},
> false);script.executeScript();}}" > JMLCEx.java
> javac -cp systemml-0.10.0-incubating-SNAPSHOT-inmemory.jar JMLCEx.java
> java -cp .:systemml-0.10.0-incubating-SNAPSHOT-inmemory.jar JMLCEx
>
> # verify standalone tar.gz works
> tar -xvzf systemml-0.10.0-incubating-SNAPSHOT-standalone.tar.gz
> cd systemml-0.10.0-incubating-SNAPSHOT-standalone
> echo "print('hello world');" > hello.dml
> ./runStandaloneSystemML.sh hello.dml
> cd ..
>
> # verify distrib tar.gz works
> tar -xvzf systemml-0.10.0-incubating-SNAPSHOT.tar.gz
> cd systemml-0.10.0-incubating-SNAPSHOT
> java -cp ../lib/*:SystemML.jar org.apache.sysml.api.DMLScript -s
> "print('hello world');"
>
> export SPARK_HOME=/Users/deroneriksson/spark-1.5.1-bin-hadoop2.6
> $SPARK_HOME/bin/spark-submit SystemML.jar -s "print('hello world');" -exec
> hybrid_spark
>
> hadoop jar SystemML.jar -s "print('hello world');"
> cd ..
>
> Deron
>
>
>


'hello world' tests of the main distributions

2016-05-20 Thread Deron Eriksson
Hi,

We have a test suite of 5000+ tests but I don't think we have a standard
way of testing the distribution artifacts once they are built. I just did
some 'hello world' tests of the various distribution artifacts to be sure
that I could run a 'hello world' DML script using the various
distributions. In case it's helpful to others, here are the various things
I did (on OS X).

# build distribution artifacts
mvn clean package -P distribution

cd target

# verify jar works
java -cp ./lib/*:systemml-0.10.0-incubating-SNAPSHOT.jar \
  org.apache.sysml.api.DMLScript -s "print('hello world');"

# verify SystemML.jar works
java -cp ./lib/*:SystemML.jar org.apache.sysml.api.DMLScript -s \
  "print('hello world');"

# verify standalone jar works
java -jar systemml-0.10.0-incubating-SNAPSHOT-standalone.jar -s \
  "print('hello world');"

# verify src works
tar -xvzf systemml-0.10.0-incubating-SNAPSHOT-src.tar.gz
cd systemml-0.10.0-incubating-SNAPSHOT-src
mvn clean package -P distribution
cd target/
java -cp ./lib/*:systemml-0.10.0-incubating-SNAPSHOT.jar \
  org.apache.sysml.api.DMLScript -s "print('hello world');"
java -cp ./lib/*:SystemML.jar org.apache.sysml.api.DMLScript -s \
  "print('hello world');"
java -jar systemml-0.10.0-incubating-SNAPSHOT-standalone.jar -s \
  "print('hello world');"
cd ..
cd ..

# verify in-memory jar works
cat > JMLCEx.java <<'EOF'
import org.apache.sysml.api.jmlc.*;
public class JMLCEx {
  public static void main(String[] args) throws Exception {
    Connection conn = new Connection();
    PreparedScript script = conn.prepareScript("print('hello world');", new String[]{}, new String[]{}, false);
    script.executeScript();
  }
}
EOF
javac -cp systemml-0.10.0-incubating-SNAPSHOT-inmemory.jar JMLCEx.java
java -cp .:systemml-0.10.0-incubating-SNAPSHOT-inmemory.jar JMLCEx

# verify standalone tar.gz works
tar -xvzf systemml-0.10.0-incubating-SNAPSHOT-standalone.tar.gz
cd systemml-0.10.0-incubating-SNAPSHOT-standalone
echo "print('hello world');" > hello.dml
./runStandaloneSystemML.sh hello.dml
cd ..

# verify distrib tar.gz works
tar -xvzf systemml-0.10.0-incubating-SNAPSHOT.tar.gz
cd systemml-0.10.0-incubating-SNAPSHOT
java -cp ../lib/*:SystemML.jar org.apache.sysml.api.DMLScript -s \
  "print('hello world');"

export SPARK_HOME=/Users/deroneriksson/spark-1.5.1-bin-hadoop2.6
$SPARK_HOME/bin/spark-submit SystemML.jar -s "print('hello world');" -exec \
  hybrid_spark

hadoop jar SystemML.jar -s "print('hello world');"
cd ..
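
The checks above could be wrapped so that each one reports a pass or a fail instead of relying on reading the output by eye. The following is a minimal sketch only; `check_output` is a hypothetical helper name, and it assumes that a successful run prints the expected text somewhere on stdout or stderr.

```shell
# Run a command and report whether its combined output contains the expected text.
# $1: label for the report; $2: text to look for; remaining arguments: the command.
check_output() {
  label="$1"
  expected="$2"
  shift 2
  if "$@" 2>&1 | grep -qF -- "$expected"; then
    printf 'PASS: %s\n' "$label"
  else
    printf 'FAIL: %s\n' "$label"
    return 1
  fi
}
```

For example, the standalone jar check would become `check_output "standalone jar" "hello world" java -jar systemml-0.10.0-incubating-SNAPSHOT-standalone.jar -s "print('hello world');"`.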

Deron


Re: Starting a SystemML 0.10 release?

2016-05-20 Thread Deron Eriksson
Hi Luciano,

The fix is tested and merged. I believe we should be good to go now to cut
an RC.
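
For anyone who wants to confirm this on their own build, here is a minimal sketch of a manifest check. It assumes `unzip` is available; `manifest_main_class` is a hypothetical helper name, and it does not handle manifest values that wrap onto a continuation line.

```shell
# Print the Main-Class value from JAR manifest text read on stdin.
# Manifests may use CRLF line endings, so carriage returns are stripped first.
manifest_main_class() {
  tr -d '\r' | sed -n 's/^Main-Class:[[:space:]]*//p'
}
```

Usage would be `unzip -p SystemML.jar META-INF/MANIFEST.MF | manifest_main_class`; empty output means no main class is set, which is the condition behind the spark-submit error quoted below. In that case, passing `--class org.apache.sysml.api.DMLScript` to spark-submit, as the error message suggests, should serve as a workaround.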

Deron


On Fri, May 20, 2016 at 3:13 PM, Deron Eriksson <deroneriks...@gmail.com>
wrote:

> Thank you Luciano. I have a fix. I will test with your latest update to
> the Apache parent pom and then merge.
>
> Deron
>
>
> On Fri, May 20, 2016 at 2:58 PM, Luciano Resende <luckbr1...@gmail.com>
> wrote:
>
>> Ok, thanks for letting me know... I was about to cut a RC1, so I will wait
>> a bit more...
>>
>> On Fri, May 20, 2016 at 2:54 PM, Deron Eriksson <deroneriks...@gmail.com>
>> wrote:
>>
>> > BTW, I am investigating. I believe the issue is caused because of my
>> update
>> > to add an assembly (for the proper LICENSE and NOTICE) for the main jar.
>> >
>> > Deron
>> >
>> >
>> > On Fri, May 20, 2016 at 2:24 PM, Deron Eriksson <
>> deroneriks...@gmail.com>
>> > wrote:
>> >
>> > > Hi,
>> > >
>> > > Luciano, before starting the RC, I think I found an issue. The current
>> > > SystemML.jar apparently doesn't specify a main class in the
>> MANIFEST.MF
>> > > (which it did in the previous 0.9.0-incubating release).  Therefore
>> > > something like:
>> > >$SPARK_HOME/bin/spark-submit SystemML.jar -s "print('hello
>> world');"
>> > > -exec hybrid_spark
>> > > will generate the following error rather than executing:
>> > >   Error: No main class set in JAR; please specify one with --class
>> > >
>> > > Deron
>> > >
>> > >
>> > > On Thu, May 19, 2016 at 3:22 PM, Deron Eriksson <
>> deroneriks...@gmail.com
>> > >
>> > > wrote:
>> > >
>> > >> Hi,
>> > >>
>> > >> PR162 and PR167 have been merged. Thank you Glenn for all the help
>> > >> testing PR167.
>> > >>
>> > >> Deron
>> > >>
>> > >>
>> > >> On Thu, May 19, 2016 at 12:43 AM, Matthias Boehm <mbo...@us.ibm.com>
>> > >> wrote:
>> > >>
>> > >>> sounds good to me - in addition to PR167, I'd also like to get PR162
>> > >>> into this release. Furthermore, it would be good to run our full
>> > >>> performance testsuite (at least up to 80GB) but this could be done
>> on
>> > the
>> > >>> RC too. Thanks guys for taking care of the release again.
>> > >>>
>> > >>> Regards,
>> > >>> Matthias
>> > >>>
>> > >>>
>> > >>> From: Luciano Resende <luckbr1...@gmail.com>
>> > >>> To: dev@systemml.incubator.apache.org
>> > >>> Date: 05/18/2016 06:06 PM
>> > >>> Subject: Re: Starting a SystemML 0.10 release?
>> > >>> --
>> > >>>
>> > >>>
>> > >>>
>> > >>> On Wed, May 18, 2016 at 5:49 PM, Deron Eriksson <
>> > deroneriks...@gmail.com
>> > >>> >
>> > >>> wrote:
>> > >>>
>> > >>> > Hi,
>> > >>> >
>> > >>> > I've looked over all the release packages and the NOTICE and
>> LICENSES
>> > >>> are
>> > >>> > looking much better. I believe we have addressed all of the issues
>> > >>> brought
>> > >>> > up during the 0.9 release and have fixed many additional issues.
>> > >>> >
>> > >>>
>> > >>> Great, thanks for helping here.
>> > >>>
>> > >>>
>> > >>> >
>> > >>> > Are we about ready for our next release, 0.10? I believe it would
>> be
>> > >>> nice
>> > >>> > for PR167 (https://github.com/apache/incubator-systemml/pull/167)
>> to
>> > >>> be
>> > >>> > included since it updates the dml script packaging. Should any
>> other
>> > >>> > updates be included? Does anyone have any additional concerns?
>> > >>> >
>> > >>> >
>> > >>> Please let me know when this is in then.
>> > >>>
>> > >>>
>> > >>> > Luciano, would you be available as RM for this SystemML release?
>> > >>> >
>> > >>>
>> > >>> Sure.
>> > >>>
>> > >>>
>> > >>> >
>> > >>> > Deron
>> > >>> >
>> > >>>
>> > >>>
>> > >>>
>> > >>> --
>> > >>> Luciano Resende
>> > >>> http://twitter.com/lresende1975
>> > >>> http://lresende.blogspot.com/
>> > >>>
>> > >>>
>> > >>>
>> > >>
>> > >
>> >
>>
>>
>>
>> --
>> Luciano Resende
>> http://twitter.com/lresende1975
>> http://lresende.blogspot.com/
>>
>
>


Re: Starting a SystemML 0.10 release?

2016-05-20 Thread Deron Eriksson
Thank you Luciano. I have a fix. I will test with your latest update to the
Apache parent pom and then merge.

Deron


On Fri, May 20, 2016 at 2:58 PM, Luciano Resende <luckbr1...@gmail.com>
wrote:

> Ok, thanks for letting me know... I was about to cut a RC1, so I will wait
> a bit more...
>
> On Fri, May 20, 2016 at 2:54 PM, Deron Eriksson <deroneriks...@gmail.com>
> wrote:
>
> > BTW, I am investigating. I believe the issue is caused because of my
> update
> > to add an assembly (for the proper LICENSE and NOTICE) for the main jar.
> >
> > Deron
> >
> >
> > On Fri, May 20, 2016 at 2:24 PM, Deron Eriksson <deroneriks...@gmail.com
> >
> > wrote:
> >
> > > Hi,
> > >
> > > Luciano, before starting the RC, I think I found an issue. The current
> > > SystemML.jar apparently doesn't specify a main class in the MANIFEST.MF
> > > (which it did in the previous 0.9.0-incubating release).  Therefore
> > > something like:
> > >$SPARK_HOME/bin/spark-submit SystemML.jar -s "print('hello world');"
> > > -exec hybrid_spark
> > > will generate the following error rather than executing:
> > >   Error: No main class set in JAR; please specify one with --class
> > >
> > > Deron
> > >
> > >
> > > On Thu, May 19, 2016 at 3:22 PM, Deron Eriksson <
> deroneriks...@gmail.com
> > >
> > > wrote:
> > >
> > >> Hi,
> > >>
> > >> PR162 and PR167 have been merged. Thank you Glenn for all the help
> > >> testing PR167.
> > >>
> > >> Deron
> > >>
> > >>
> > >> On Thu, May 19, 2016 at 12:43 AM, Matthias Boehm <mbo...@us.ibm.com>
> > >> wrote:
> > >>
> > >>> sounds good to me - in addition to PR167, I'd also like to get PR162
> > >>> into this release. Furthermore, it would be good to run our full
> > >>> performance testsuite (at least up to 80GB) but this could be done on
> > the
> > >>> RC too. Thanks guys for taking care of the release again.
> > >>>
> > >>> Regards,
> > >>> Matthias
> > >>>
> > >>>
> > >>> From: Luciano Resende <luckbr1...@gmail.com>
> > >>> To: dev@systemml.incubator.apache.org
> > >>> Date: 05/18/2016 06:06 PM
> > >>> Subject: Re: Starting a SystemML 0.10 release?
> > >>> --
> > >>>
> > >>>
> > >>>
> > >>> On Wed, May 18, 2016 at 5:49 PM, Deron Eriksson <
> > deroneriks...@gmail.com
> > >>> >
> > >>> wrote:
> > >>>
> > >>> > Hi,
> > >>> >
> > >>> > I've looked over all the release packages and the NOTICE and
> LICENSES
> > >>> are
> > >>> > looking much better. I believe we have addressed all of the issues
> > >>> brought
> > >>> > up during the 0.9 release and have fixed many additional issues.
> > >>> >
> > >>>
> > >>> Great, thanks for helping here.
> > >>>
> > >>>
> > >>> >
> > >>> > Are we about ready for our next release, 0.10? I believe it would
> be
> > >>> nice
> > >>> > for PR167 (https://github.com/apache/incubator-systemml/pull/167)
> to
> > >>> be
> > >>> > included since it updates the dml script packaging. Should any
> other
> > >>> > updates be included? Does anyone have any additional concerns?
> > >>> >
> > >>> >
> > >>> Please let me know when this is in then.
> > >>>
> > >>>
> > >>> > Luciano, would you be available as RM for this SystemML release?
> > >>> >
> > >>>
> > >>> Sure.
> > >>>
> > >>>
> > >>> >
> > >>> > Deron
> > >>> >
> > >>>
> > >>>
> > >>>
> > >>> --
> > >>> Luciano Resende
> > >>> http://twitter.com/lresende1975
> > >>> http://lresende.blogspot.com/
> > >>>
> > >>>
> > >>>
> > >>
> > >
> >
>
>
>
> --
> Luciano Resende
> http://twitter.com/lresende1975
> http://lresende.blogspot.com/
>


Re: Starting a SystemML 0.10 release?

2016-05-19 Thread Deron Eriksson
Hi,

PR162 and PR167 have been merged. Thank you Glenn for all the help testing
PR167.

Deron


On Thu, May 19, 2016 at 12:43 AM, Matthias Boehm <mbo...@us.ibm.com> wrote:

> sounds good to me - in addition to PR167, I'd also like to get PR162 into
> this release. Furthermore, it would be good to run our full performance
> testsuite (at least up to 80GB) but this could be done on the RC too.
> Thanks guys for taking care of the release again.
>
> Regards,
> Matthias
>
>
> From: Luciano Resende <luckbr1...@gmail.com>
> To: dev@systemml.incubator.apache.org
> Date: 05/18/2016 06:06 PM
> Subject: Re: Starting a SystemML 0.10 release?
> --
>
>
>
> On Wed, May 18, 2016 at 5:49 PM, Deron Eriksson <deroneriks...@gmail.com>
> wrote:
>
> > Hi,
> >
> > I've looked over all the release packages and the NOTICE and LICENSES are
> > looking much better. I believe we have addressed all of the issues
> brought
> > up during the 0.9 release and have fixed many additional issues.
> >
>
> Great, thanks for helping here.
>
>
> >
> > Are we about ready for our next release, 0.10? I believe it would be nice
> > for PR167 (https://github.com/apache/incubator-systemml/pull/167) to be
> > included since it updates the dml script packaging. Should any other
> > updates be included? Does anyone have any additional concerns?
> >
> >
> Please let me know when this is in then.
>
>
> > Luciano, would you be available as RM for this SystemML release?
> >
>
> Sure.
>
>
> >
> > Deron
> >
>
>
>
> --
> Luciano Resende
> http://twitter.com/lresende1975
> http://lresende.blogspot.com/
>
>
>


Re: Starting a SystemML 0.10.0 release ?

2016-05-07 Thread Deron Eriksson
Hi,

I believe the js and css issues have been addressed (see
src/assembly/source/LICENSE) for the src packages. The standalone license
also features the js/css licenses but I believe this is not necessary
because the documentation is currently not packaged into any artifact
except the src distributions.

The antlr-generated src/main/java/org/apache/sysml/parser/*.java files now
have Apache headers so they should be fine.

The Zeppelin tar.gz example has been replaced by a json file, so that has
also been addressed.

I believe there are some remaining license/notice/disclaimer issues for the
build artifacts that we generate. Please see SYSTEMML-659, -660, -661,
-662, -663, -664, -665, and -667. Perhaps a license expert (such as
Luciano) can look at these and close any that are not issues? Note that our
large number of distribution artifacts (11!) makes it quite difficult to
track these down.

Perhaps someone can look into Justin's points [4] and [8] and verify that
everything is correct?

Deron

On Fri, May 6, 2016 at 3:39 PM,  wrote:

> This sounds good to me, and those of us at IBM agreed during a recent
> meeting that we are okay with this approach if the community is.  Deron
> noted that there are a few licensing issues left over from last time, a few
> of which relate to specific existing algorithms that may have been pulled
> from a textbook.  He sent out an email to the dev list a few days ago
> ("Fwd: [Vote] Apache SystemML 0.9.0."), but no responses yet.
>
> Deron also opened several JIRAs with the remaining tasks.  We don't want
> to block any other work though with creating a release.
>
> --
>
> Mike Dusenberry
> GitHub: github.com/dusenberrymw
> LinkedIn: linkedin.com/in/mikedusenberry
>
> Sent from my iPhone.
>
>
> > On May 6, 2016, at 1:52 PM, Luciano Resende 
> wrote:
> >
> > I will be presenting SystemML next week at ApacheCon and would be great
> if
> > we could have a release available or in progress.
> >
> > I know there was some initial discussion around a 0.9.1 release, but how
> > about starting a 0.10.0 release out of trunk ?
> >
> > If nobody objects to it, I can volunteer to be the RM for the release.
> >
> >
> > Thoughts ?
> >
> > --
> > Luciano Resende
> > http://twitter.com/lresende1975
> > http://lresende.blogspot.com/
>


Re: remove castAsScalar?

2016-04-22 Thread Deron Eriksson
In that case, perhaps I could create JIRAs to:
1) replace all castAsScalar's in the project with as.scalar's
2) if castAsScalar is used in a DML file, issue a log warning such as
'castAsScalar has been deprecated, please replace with as.scalar'
3) update docs to say castAsScalar has been deprecated.

That way, we maintain backwards compatibility with older DML outside the
project while replacing the castAsScalar's in the project.
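
Step 1 could be started with something like the following. This is a minimal sketch, assuming that every use is written as a direct call of the form `castAsScalar(...)`; both function names are hypothetical.

```shell
# Rewrite castAsScalar( calls to as.scalar( in DML source read from stdin.
migrate_cast_as_scalar() {
  sed 's/castAsScalar[[:space:]]*(/as.scalar(/g'
}

# List DML files under a directory that still mention castAsScalar.
find_cast_as_scalar() {
  grep -rl --include='*.dml' 'castAsScalar' "$1"
}
```

Running `find_cast_as_scalar scripts` first gives the list of files to review, and each file can then be piped through `migrate_cast_as_scalar` and diffed before committing.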

Deron



On Thu, Apr 21, 2016 at 5:42 PM, Matthias Boehm <mbo...@us.ibm.com> wrote:

> Let's be careful not to unnecessarily break backwards compatibility. How
> about we collect all instances of language builtin functions that we want
> to remove and clean them up with our 1.0 release later this year? There are
> other instances like ppred that do not exist in R and meanwhile redundant
> in DML (but still heavily used).
>
> Regards,
> Matthias
>
> From: Deron Eriksson <deroneriks...@gmail.com>
> To: dev@systemml.incubator.apache.org
> Date: 04/21/2016 05:33 PM
> Subject: remove castAsScalar?
> --
>
>
>
> Hi,
>
> In the ongoing discussion concerning printing a matrix (at
> https://github.com/apache/incubator-systemml/pull/120), I noticed that
> castAsScalar was introduced to the language as a mistake. It has been
> replaced by as.scalar but castAsScalar has been kept around until now for
> historical reasons. Since it is redundant and we are an open source
> project, can we now go ahead and remove it, since having two ways to
> accomplish the same thing (as.scalar and castAsScalar) can be confusing to
> new users?
>
> Deron
>
>
>


remove castAsScalar?

2016-04-21 Thread Deron Eriksson
Hi,

In the ongoing discussion concerning printing a matrix (at
https://github.com/apache/incubator-systemml/pull/120), I noticed that
castAsScalar was introduced to the language as a mistake. It has been
replaced by as.scalar but castAsScalar has been kept around until now for
historical reasons. Since it is redundant and we are an open source
project, can we now go ahead and remove it, since having two ways to
accomplish the same thing (as.scalar and castAsScalar) can be confusing to
new users?

Deron


Re: Fw: Updating documentation for notebook

2016-04-11 Thread Deron Eriksson
Hi Niketan,

I think a separate section for Notebooks is a great idea since, as you
point out, they are hidden under the MLContext section. Also, I really like
the idea of making it as easy as possible for a new user to try out
SystemML in a Notebook. Very good points.

Tutorials for all the algorithms using real-world data would be fantastic.
I would also like to see single-line algorithm invocations (possibly
with generated data) that could be copy/pasted that work with no
modifications needed by the user. This would probably mean either including
small sets of example data in the project, or allowing the reading of data
from URLs.

It would be nice to take something like these 5 commands:
---
$ wget \
  https://raw.githubusercontent.com/apache/incubator-systemml/master/scripts/datagen/genRandData4Univariate.dml
$ $SPARK_HOME/bin/spark-submit $SYSTEMML_HOME/SystemML.jar -f \
  genRandData4Univariate.dml -exec hybrid_spark -args 100 100 10 1 2 3 4 \
  uni.mtx
$ echo '1' > uni-types.csv
$ echo '{"rows": 1, "cols": 1, "format": "csv"}' > uni-types.csv.mtd
$ $SPARK_HOME/bin/spark-submit $SYSTEMML_HOME/SystemML.jar -f \
  $SYSTEMML_HOME/algorithms/Univar-Stats.dml -exec hybrid_spark -nvargs \
  X=uni.mtx TYPES=uni-types.csv STATS=uni-stats.txt
---
and reduce this to 1 command (in the documentation) that the user can
copy/paste and the algorithm runs without any additional work needed by the
user:
---
$ $SPARK_HOME/bin/spark-submit $SYSTEMML_HOME/SystemML.jar -f \
  $SYSTEMML_HOME/algorithms/Univar-Stats.dml -exec hybrid_spark -nvargs \
  X=http://www.example.com/uni.mtx TYPES=http://www.example.com/uni-types.csv \
  STATS=uni-stats.txt
---
If we had this for each of the main algorithms, this would give the users
working examples to start with, which is easier than trying to figure out
this kind of thing by reading the comments in the DML algorithm files.
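
Until something like that exists, the two `echo` steps in the five-command version could at least be folded into one helper. This is a minimal sketch, assuming the types file is a single-row CSV with one value per feature; `write_types_csv` is a hypothetical name, and the metadata fields are the ones used in the commands above.

```shell
# Write a single-row CSV of feature types plus its JSON .mtd metadata file.
# $1: output CSV path; remaining arguments: one type value per feature.
write_types_csv() {
  out="$1"
  shift
  cols=$#
  # Join the values with commas; the subshell keeps the IFS change local.
  ( IFS=,; printf '%s\n' "$*" ) > "$out"
  printf '{"rows": 1, "cols": %d, "format": "csv"}\n' "$cols" > "$out.mtd"
}
```

With this, `write_types_csv uni-types.csv 1` produces the same two files as the `echo` commands.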

Deron


On Fri, Apr 8, 2016 at 4:51 PM, Niketan Pansare  wrote:

> Hi all,
>
> As per Luciano's suggestion, I have created a PR with bluemix/datascientist
> tutorial and have flagged it with "Please DONOT push this PR until the
> discussion on dev mailing list is complete." :)
>
> Also, I apologize for the incorrect indentation in my last email. Here is
> another attempt:
> - How do you want to try SystemML?
>   + Notebook on cloud
>     * Bluemix
>       + Zeppelin
>         - Using Python Kernel
>           + Learn how to write DML program (something along the lines of
>             http://apache.github.io/incubator-systemml/beginners-guide-to-dml-and-pydml.html)
>           + Try out pre-packaged algorithms on real-world dataset
>             * Linear Regression
>             * GLM
>             * ALS
>             * ...
>           + Learn how to pass RDD/DataFrame to SystemML
>           + Learn how to use SystemML as MLPipeline estimator/transformer
>           + Learn how to use SystemML with existing Python packages
>         - Using Scala Kernel
>           + ... similar to Python kernel
>         - Using DML Kernel
>           + Learn how to write DML program
>       + Jupyter
>         - Using Python Kernel
>         - Using Scala Kernel
>         - Using DML Kernel
>     * Data scientist's work bench
>     * Databricks cloud
>     * ...
>   + Notebook on laptop/cluster
>     * Zeppelin
>     * Jupyter
>   + Laptop
>     * Run SystemML as Standalone jar:
>       http://apache.github.io/incubator-systemml/quick-start-guide.html
>     * Embed SystemML into other Java program:
>       http://apache.github.io/incubator-systemml/jmlc.html
>     * Debug a DML script:
>       http://apache.github.io/incubator-systemml/debugger-guide.html
>     * Spark local mode
>   + Spark Cluster
>     * Batch invocation
>     * Using Spark REPL
>       + Learn how to pass RDD/DataFrame to SystemML
>       + Learn how to use SystemML as MLPipeline estimator/transformer
>     * Using PySpark REPL
>       + Learn how to pass RDD/DataFrame to SystemML
>       + Learn how to use SystemML as MLPipeline estimator/transformer
>   + Hadoop Cluster
>   + Spark Cluster on EC2
>
> Thanks,
>
> Niketan Pansare
> IBM Almaden Research Center
> E-mail: npansar At us.ibm.com
> http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar
>
> - Forwarded by Niketan Pansare/Almaden/IBM on 04/08/2016 04:48 PM -
>
> From: Niketan Pansare/Almaden/IBM
> To: dev
> Date: 04/08/2016 01:11 PM
> Subject: Fw: Updating documentation for notebook
> Hi all,
>
> Here are a few suggestions to get things started:
> 1. Have a "Quick Start" (or "Get Started") button beside "Get SystemML"
> on http://systemml.apache.org/.
>
> 2. Then the user can go through the following questionnaire/bulleted list, which
> points people to the appropriate link:
> - How do you want to try SystemML?
> + Notebook on cloud
> * Bluemix
> + Zeppelin
> - Using Python Kernel
> + Learn how to write DML program (something along the 

Re: Change commons-math3 to compile scope?

2016-04-06 Thread Deron Eriksson
Adding it to a troubleshooting guide sounds like a reasonable approach.

Deron

On Wed, Apr 6, 2016 at 2:44 PM, Matthias Boehm <mbo...@us.ibm.com> wrote:

> well, we don't want to get into having multiple commons math versions in
> the classpath and newer hadoop distributions have it by default. So I would
> rather add it to a troubleshooting guide. Alternatively, we could have two
> different 'distribution' profiles for releases.
>
> Regards,
> Matthias
>
> From: Deron Eriksson <deroneriks...@gmail.com>
> To: dev@systemml.incubator.apache.org
> Date: 04/06/2016 02:40 PM
> Subject: Change commons-math3 to compile scope?
> --
>
>
>
> WRT SYSTEMML-489 (https://issues.apache.org/jira/browse/SYSTEMML-489),
> support for older Hadoop clusters was assisted by changing the
> commons-math3 pom.xml scope from "provided" to "compile". Can we update the
> project to reflect this, or are there any reasons not to?
>
> Deron
>
>
>


Remove any of these classes?

2016-04-04 Thread Deron Eriksson
Hi,

If I search for classes/enums that aren't referenced by other classes in
SystemML, I get the following partial list:

org.apache.sysml.hops.Hop.ExtBuiltInOp
org.apache.sysml.parser.Expression.AggOp
org.apache.sysml.parser.Expression.ExtBuiltinFunctionOp
org.apache.sysml.parser.Expression.ReorgOp
org.apache.sysml.runtime.controlprogram.parfor.opt.MemoTable
org.apache.sysml.runtime.functionobjects.MaxIndex
org.apache.sysml.runtime.functionobjects.MinIndex
org.apache.sysml.runtime.instructions.cp.FileObject
org.apache.sysml.runtime.instructions.spark.data.CountLinesInfo
org.apache.sysml.runtime.instructions.spark.functions.ConvertColumnRDDToBinaryBlock
org.apache.sysml.runtime.instructions.spark.functions.ConvertMLLibBlocksToBinaryBlocks
org.apache.sysml.runtime.instructions.spark.functions.ConvertTextLineToBinaryCellFunction
org.apache.sysml.runtime.instructions.spark.functions.ConvertTextToString
org.apache.sysml.runtime.instructions.spark.functions.FindMatrixBlockFromMatrixIndexes
org.apache.sysml.runtime.instructions.spark.functions.GetMLLibBlocks
org.apache.sysml.runtime.instructions.spark.functions.LastCellInMatrixBlock
org.apache.sysml.runtime.instructions.spark.functions.MatrixVectorBinaryOpFunction
org.apache.sysml.runtime.io.FrameReaderFactory
org.apache.sysml.runtime.io.FrameWriterFactory
org.apache.sysml.runtime.io.WriterMatrixMarketParallel
org.apache.sysml.runtime.matrix.data.PoissonRandomMatrixGenerator
org.apache.sysml.runtime.matrix.data.TaggedInt
org.apache.sysml.runtime.matrix.data.TaggedPartialBlock
org.apache.sysml.runtime.matrix.data.WeightedPairToSortInputConverter
org.apache.sysml.runtime.matrix.data.RuntimeDataFormat
org.apache.sysml.runtime.matrix.mapred.CachedMap
org.apache.sysml.runtime.matrix.mapred.MMCJMRCombiner
org.apache.sysml.runtime.matrix.mapred.MMCJMRReducer
org.apache.sysml.runtime.matrix.sort.CompactDoubleIntInputFormat
org.apache.sysml.runtime.util.BinaryBlockInputFormat
org.apache.sysml.runtime.util.RandN
org.apache.sysml.utils.AppException
org.apache.sysml.utils.Timer
org.apache.sysml.yarn.ropt.ResourceOptimizerCPMigration

Can any of these be deleted?

Deron


"Scalable Machine Learning with Apache SystemML" talk tonight

2016-03-09 Thread Deron Eriksson
Berthold Reinwald is giving a talk tonight (Wednesday, March 9, 2016) at
6:30pm at the IBM Spark Technology Center in San Francisco about "Scalable
Machine Learning with Apache SystemML."

Information about the talk can be found here:
http://www.meetup.com/SF-Spark-and-Friends/events/229165430/

If you would like to attend the meetup, please join the meetup by 3pm today.

Deron


DMLRuntimeException

2016-02-29 Thread Deron Eriksson
Hi,

Can we change DMLRuntimeException to extend RuntimeException rather than
DMLException?

1) The javadocs say DMLRuntimeException is equivalent to RuntimeException.
RuntimeException is an unchecked exception.
2) However, DMLRuntimeException extends DMLException which extends
Exception, which is a checked exception.

So, this means that currently DMLRuntimeException in this example needs a
throws clause on the method (or the throw needs to be wrapped in a
try/catch).

public void example() throws DMLRuntimeException {
    throw new DMLRuntimeException("Example");
}

If it's a RuntimeException, it should really be:

public void example() {
    throw new DMLRuntimeException("Example");
}
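
To make the difference concrete, here is a minimal, self-contained sketch. The classes CheckedExample and UncheckedExample are hypothetical stand-ins (they are not SystemML classes); they only mirror the two inheritance choices discussed above, extending Exception versus extending RuntimeException.

```java
public class ExceptionKinds {
    // Hypothetical stand-in for an exception that (indirectly) extends Exception.
    static class CheckedExample extends Exception {
        CheckedExample(String msg) { super(msg); }
    }

    // Hypothetical stand-in for an exception that extends RuntimeException.
    static class UncheckedExample extends RuntimeException {
        UncheckedExample(String msg) { super(msg); }
    }

    // Checked: the compiler rejects this method without the throws clause
    // (or a surrounding try/catch).
    static void throwsChecked() throws CheckedExample {
        throw new CheckedExample("checked");
    }

    // Unchecked: no throws clause is required.
    static void throwsUnchecked() {
        throw new UncheckedExample("unchecked");
    }

    // Calls both methods and collects the messages of the caught exceptions.
    static String describe() {
        StringBuilder sb = new StringBuilder();
        try {
            throwsChecked();
        } catch (CheckedExample e) {
            sb.append(e.getMessage());
        }
        try {
            throwsUnchecked();
        } catch (UncheckedExample e) {
            sb.append(",").append(e.getMessage());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Removing "throws CheckedExample" from throwsChecked() is a compile error, while throwsUnchecked() compiles without any declaration.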

Deron


"sparse" metadata attribute default value for writing csv

2016-02-16 Thread Deron Eriksson
Hi,

Right now the DML Language Ref states that the default value for the
"sparse" metadata attribute (for the write function) is true.
However, DEFAULT_DELIM_SPARSE in DataExpression is false.

Which value is 'correct'? I assume the docs should be updated to reflect
the code?

Deron


Re: Matrix Market format with metadata file

2016-02-15 Thread Deron Eriksson
Very good eye! I used "m = matrix("1 2 3 0 0 0 7 8 9 0 0 0", rows=4,
cols=3)" to generate the mm file, so the 4th row did indeed contain all
zeros.
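
As a rough illustration of why the file lists 6 non-zeros for a 4x3 matrix, the sketch below fills a grid row by row from the same value string and emits 1-based "i j v" lines for the non-zero cells. The class and method names are made up for this example, and the row-by-row fill is an assumption inferred from the mm file quoted in this thread.

```java
import java.util.ArrayList;
import java.util.List;

public class DenseToCoordinate {
    // Fills a rows x cols grid row by row (assumed order) and returns
    // "i j v" lines, with 1-based indexes, for the non-zero cells only.
    static List<String> toCells(String values, int rows, int cols) {
        String[] tokens = values.trim().split("\\s+");
        if (tokens.length != rows * cols) {
            throw new IllegalArgumentException("expected " + (rows * cols) + " values");
        }
        List<String> cells = new ArrayList<>();
        for (int k = 0; k < tokens.length; k++) {
            double v = Double.parseDouble(tokens[k]);
            if (v != 0.0) {
                int i = k / cols + 1; // 1-based row index
                int j = k % cols + 1; // 1-based column index
                cells.add(i + " " + j + " " + v);
            }
        }
        return cells;
    }

    public static void main(String[] args) {
        for (String cell : toCells("1 2 3 0 0 0 7 8 9 0 0 0", 4, 3)) {
            System.out.println(cell);
        }
    }
}
```

Rows 2 and 4 contribute no lines, so the size line "4 3 6" is consistent with cell lines that only reach row 3.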


On Mon, Feb 15, 2016 at 4:50 PM, Shirish Tatikonda <
shirish.tatiko...@gmail.com> wrote:

> Btw (just to be precise), in your example "mm" file, the metadata is "4
> 3 6" but the non-zero values that follow only go up to row number 3. So,
> either it was a typo or the 4th row contains all zeros.
>
>
>
> On Mon, Feb 15, 2016 at 4:26 PM, Shirish Tatikonda <
> shirish.tatiko...@gmail.com> wrote:
>
> > Both "mm" and "text" formats are identical except for a couple of
> > differences:
> >
> > 1) for "mm": the matrix metadata is included in the first two lines; and
> > for "text": the metadata is present in the associated .mtd file
> > 2) "mm" data must be in a single file (i.e., no *part* files) where
> > "text" data can span multiple *part* files (like any other file on HDFS).
> >
> > The support for "mm" is created mainly for the purpose of
> > importing/exporting data in the format that R likes.
> >
> > Shirish
> >
> > On Mon, Feb 15, 2016 at 4:17 PM, Deron Eriksson <deroneriks...@gmail.com>
> > wrote:
> >
> >> Hi,
> >>
> >> I have a question regarding text vs. mm. Isn't the mm coordinate format
> >> identical to the text format, except that the mm data file also includes
> >> the metadata line for rows, cols, and nnzs? If so, shouldn't they scale
> >> the same, since the text rows (i,j,v) correspond to the mm rows?
> >>
> >> If we have the following MM:
> >> %%MatrixMarket matrix coordinate real general
> >> 4 3 6
> >> 1 1 1.0
> >> 1 2 2.0
> >> 1 3 3.0
> >> 3 1 7.0
> >> 3 2 8.0
> >> 3 3 9.0
> >>
> >> The corresponding text format (with accompanying metadata file) is:
> >> 1 1 1.0
> >> 1 2 2.0
> >> 1 3 3.0
> >> 3 1 7.0
> >> 3 2 8.0
> >> 3 3 9.0
> >>
> >> So aren't these formats essentially the same?
> >>
> >> Deron
> >>
> >>
> >> On Mon, Feb 15, 2016 at 3:56 PM, Matthias Boehm <mbo...@us.ibm.com>
> >> wrote:
> >>
> >> > The meta data file is still useful in order to get the format. In case
> >> > of matrix market, errors will be raised if included meta data is
> >> > inconsistent. So no, we should not disallow specifying the meta data.
> >> > In general, we recommend using text (textcell) instead of mm (matrix
> >> > market) anyway, for scalability reasons.
> >> >
> >> > Regards,
> >> > Matthias
> >> >
> >> > From: Deron Eriksson <deroneriks...@gmail.com>
> >> > To: dev@systemml.incubator.apache.org
> >> > Date: 02/15/2016 03:45 PM
> >> > Subject: Matrix Market format with metadata file
> >> > --
> >> >
> >> >
> >> >
> >> > Hi,
> >> >
> >> > The Matrix Market coordinate format contains # rows, # columns, and #
> >> > non-zero values as metadata near the top of a matrix data file.
> >> >
> >> > If I write a matrix in mm format using SystemML, no metadata file is
> >> > created since the metadata is stored within the data file.
> >> >
> >> > However, when reading a matrix with mm format, I can supply a metadata
> >> > file, even though metadata exists in the matrix data file. Is there any
> >> > reason for this, or should this be disallowed since the metadata file is
> >> > redundant and can cause confusion, since metadata values can then be
> >> > specified in two places, which then brings up the question, "which
> >> > metadata value should be used"?
> >> >
> >> > Deron
> >> >
> >> >
> >> >
> >>
> >
> >
>


Re: Matrix Market format with metadata file

2016-02-15 Thread Deron Eriksson
Thank you, Shirish. That makes sense. I'll update the docs to include this
information.

Deron


On Mon, Feb 15, 2016 at 4:26 PM, Shirish Tatikonda <
shirish.tatiko...@gmail.com> wrote:

> Both "mm" and "text" formats are identical except for a couple of
> differences:
>
> 1) for "mm": the matrix metadata is included in the first two lines; and
> for "text": the metadata is present in the associated .mtd file
> 2) "mm" data must be in a single file (i.e., no *part* files) where "text"
> data can span multiple *part* files (like any other file on HDFS).
>
> The support for "mm" is created mainly for the purpose of
> importing/exporting data in the format that R likes.
>
> Shirish
>
> On Mon, Feb 15, 2016 at 4:17 PM, Deron Eriksson <deroneriks...@gmail.com>
> wrote:
>
> > Hi,
> >
> > I have a question regarding text vs. mm. Isn't the mm coordinate format
> > identical to the text format, except that the mm data file also includes
> > the metadata line for rows, cols, and nnzs? If so, shouldn't they scale
> > the same, since the text rows (i,j,v) correspond to the mm rows?
> >
> > If we have the following MM:
> > %%MatrixMarket matrix coordinate real general
> > 4 3 6
> > 1 1 1.0
> > 1 2 2.0
> > 1 3 3.0
> > 3 1 7.0
> > 3 2 8.0
> > 3 3 9.0
> >
> > The corresponding text format (with accompanying metadata file) is:
> > 1 1 1.0
> > 1 2 2.0
> > 1 3 3.0
> > 3 1 7.0
> > 3 2 8.0
> > 3 3 9.0
> >
> > So aren't these formats essentially the same?
> >
> > Deron
> >
> >
> > On Mon, Feb 15, 2016 at 3:56 PM, Matthias Boehm <mbo...@us.ibm.com>
> > wrote:
> >
> > > The meta data file is still useful in order to get the format. In case
> > > of matrix market, errors will be raised if included meta data is
> > > inconsistent. So no, we should not disallow specifying the meta data.
> > > In general, we recommend using text (textcell) instead of mm (matrix
> > > market) anyway, for scalability reasons.
> > >
> > > Regards,
> > > Matthias
> > >
> > > From: Deron Eriksson <deroneriks...@gmail.com>
> > > To: dev@systemml.incubator.apache.org
> > > Date: 02/15/2016 03:45 PM
> > > Subject: Matrix Market format with metadata file
> > > --
> > >
> > >
> > >
> > > Hi,
> > >
> > > The Matrix Market coordinate format contains # rows, # columns, and #
> > > non-zero values as metadata near the top of a matrix data file.
> > >
> > > If I write a matrix in mm format using SystemML, no metadata file is
> > > created since the metadata is stored within the data file.
> > >
> > > However, when reading a matrix with mm format, I can supply a metadata
> > > file, even though metadata exists in the matrix data file. Is there any
> > > reason for this, or should this be disallowed since the metadata file is
> > > redundant and can cause confusion, since metadata values can then be
> > > specified in two places, which then brings up the question, "which
> > > metadata value should be used"?
> > >
> > > Deron
> > >
> > >
> > >
> >
>


Re: Matrix Market format with metadata file

2016-02-15 Thread Deron Eriksson
Hi,

I have a question regarding text vs. mm. Isn't the mm coordinate format
identical to the text format, except that the mm data file also includes
the metadata line for rows, cols, and nnzs? If so, shouldn't they scale the
same, since the text rows (i,j,v) correspond to the mm rows?

If we have the following MM:
%%MatrixMarket matrix coordinate real general
4 3 6
1 1 1.0
1 2 2.0
1 3 3.0
3 1 7.0
3 2 8.0
3 3 9.0

The corresponding text format (with accompanying metadata file) is:
1 1 1.0
1 2 2.0
1 3 3.0
3 1 7.0
3 2 8.0
3 3 9.0

So aren't these formats essentially the same?
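
The comparison above can be sketched in code. This is only an illustration of the idea, not SystemML's reader: the class and method names are hypothetical, and it assumes a well-formed coordinate file in which lines starting with "%" are the banner or comments and the first remaining line is the size line.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MatrixMarketSketch {
    // The MM example from this message.
    static List<String> sample() {
        return Arrays.asList(
            "%%MatrixMarket matrix coordinate real general",
            "4 3 6",
            "1 1 1.0", "1 2 2.0", "1 3 3.0",
            "3 1 7.0", "3 2 8.0", "3 3 9.0");
    }

    // Returns {rows, cols, nnz} from the first line that is not a banner/comment.
    static long[] parseSizeLine(List<String> lines) {
        for (String line : lines) {
            String t = line.trim();
            if (t.isEmpty() || t.startsWith("%")) continue;
            String[] p = t.split("\\s+");
            return new long[] {
                Long.parseLong(p[0]), Long.parseLong(p[1]), Long.parseLong(p[2]) };
        }
        throw new IllegalArgumentException("no size line found");
    }

    // Returns the cell lines after the size line; these are the same
    // "i j v" rows that the text format stores.
    static List<String> cellLines(List<String> lines) {
        List<String> cells = new ArrayList<>();
        boolean sizeSeen = false;
        for (String line : lines) {
            String t = line.trim();
            if (t.isEmpty() || t.startsWith("%")) continue;
            if (!sizeSeen) { sizeSeen = true; continue; } // skip the size line
            cells.add(t);
        }
        return cells;
    }

    public static void main(String[] args) {
        long[] size = parseSizeLine(sample());
        System.out.println(size[0] + " x " + size[1] + ", nnz=" + size[2]);
        cellLines(sample()).forEach(System.out::println);
    }
}
```

In this sketch the only difference between the two representations is where the three size values live: in the data file for mm, or in a separate metadata file for text.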

Deron


On Mon, Feb 15, 2016 at 3:56 PM, Matthias Boehm <mbo...@us.ibm.com> wrote:

> The meta data file is still useful in order to get the format. In case of
> matrix market, errors will be raised if included meta data is inconsistent.
> So no, we should not disallow specifying the meta data. In general, we
> recommend using text (textcell) instead of mm (matrix market) anyway, for
> scalability reasons.
>
> Regards,
> Matthias
>
> From: Deron Eriksson <deroneriks...@gmail.com>
> To: dev@systemml.incubator.apache.org
> Date: 02/15/2016 03:45 PM
> Subject: Matrix Market format with metadata file
> --
>
>
>
> Hi,
>
> The Matrix Market coordinate format contains # rows, # columns, and #
> non-zero values as metadata near the top of a matrix data file.
>
> If I write a matrix in mm format using SystemML, no metadata file is
> created since the metadata is stored within the data file.
>
> However, when reading a matrix with mm format, I can supply a metadata
> file, even though metadata exists in the matrix data file. Is there any
> reason for this, or should this be disallowed since the metadata file is
> redundant and can cause confusion, since metadata values can then be
> specified in two places, which then brings up the question, "which metadata
> value should be used"?
>
> Deron
>
>
>


Matrix Market format with metadata file

2016-02-15 Thread Deron Eriksson
Hi,

The Matrix Market coordinate format contains # rows, # columns, and #
non-zero values as metadata near the top of a matrix data file.

If I write a matrix in mm format using SystemML, no metadata file is
created since the metadata is stored within the data file.

However, when reading a matrix with mm format, I can supply a metadata
file, even though metadata exists in the matrix data file. Is there any
reason for this, or should this be disallowed since the metadata file is
redundant and can cause confusion, since metadata values can then be
specified in two places, which then brings up the question, "which metadata
value should be used"?

Deron

