Re: Runtime package refactoring

2015-12-05 Thread Luciano Resende
On Sat, Dec 5, 2015 at 3:17 PM, Matthias Boehm wrote:

> Yes, these changes are all local to 'org.apache.sysml.runtime'. Other
> than binary format incompatibility, there are no side effects for MR
> or Spark. These changes are primarily a cleanup of a historically grown
> package structure and a preparation step. For now, there will still be just
> one assembly. Down the road, however, this allows us to create a separate
> artifact of the core runtime library (which is already used by all three
> CP/MR/Spark runtime backends) for external usage too.
>
>
> Regards,
> Matthias
>

Thanks for the clarification.

And when implementing, please follow the steps below to make sure
we don't lose file history.

- Perform the refactor on your own fork (not on apache git)
- Move the files as one git commit
- Do all the file content changes as a second git commit (imports, docs,
javadocs, etc.)
- Run a full build to make sure there are no breakages
- Let the team review to make sure we are not losing history on the files
or something similar.
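
The steps above can be sketched in a throwaway repository. This is only an illustration under assumptions: the directory layout, the file name MatrixBlock.java, and the commit messages are hypothetical stand-ins, not the actual SystemML sources.

```shell
#!/bin/sh
# Sketch: a two-commit refactoring (move first, edit second) in a scratch repo.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Example Dev"

# Hypothetical file standing in for a runtime class.
mkdir -p runtime/matrix
printf 'package runtime.matrix;\nclass MatrixBlock {}\n' > runtime/matrix/MatrixBlock.java
git add .
git commit -q -m "Initial layout"

# Commit 1: move the file only, with no content changes.
mkdir -p runtime/core/matrix
git mv runtime/matrix/MatrixBlock.java runtime/core/matrix/MatrixBlock.java
git commit -q -m "Move matrix classes to core (no content changes)"

# Commit 2: content changes only (here, the package declaration).
sed 's/package runtime.matrix;/package runtime.core.matrix;/' \
    runtime/core/matrix/MatrixBlock.java > MatrixBlock.tmp
mv MatrixBlock.tmp runtime/core/matrix/MatrixBlock.java
git commit -q -am "Update package declarations after move"

# Count the commits git can trace for the file across the rename.
history_count=$(git log --follow --oneline -- runtime/core/matrix/MatrixBlock.java | wc -l | tr -d ' ')
echo "commits in file history: $history_count"
```

Because the first commit leaves the content byte-identical, git can match the old and new paths exactly, and the later edit does not interfere with rename detection.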

Thanks

-- 
Luciano Resende
http://people.apache.org/~lresende
http://twitter.com/lresende1975
http://lresende.blogspot.com/


SystemML github mirror is behind one commit

2015-12-05 Thread Luciano Resende
Looks like the SystemML GitHub mirror is missing one commit compared to
the Apache SystemML git repository.

git log --oneline master..apache/master
41d9d2c Remove copyrights from license, add license where needed
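
A self-contained sketch of the same comparison, for anyone who wants to reproduce the check. The two scratch repositories named upstream and mirror are hypothetical stand-ins for the Apache repository and the GitHub mirror; only the remote name apache matches the command above.

```shell
#!/bin/sh
# Sketch: count commits a lagging clone is missing relative to another remote.
set -e
work=$(mktemp -d)
cd "$work"

# "upstream" plays the role of the Apache repository.
git init -q upstream
cd upstream
git config user.email "dev@example.com"
git config user.name "Example Dev"
echo one > file.txt
git add file.txt
git commit -q -m "First commit"
cd ..

# "mirror" is cloned before upstream receives a further commit.
git clone -q upstream mirror
cd upstream
echo two >> file.txt
git commit -q -am "Second commit"

# From the mirror, fetch upstream under the remote name "apache" and compare.
cd ../mirror
git remote add apache ../upstream
git fetch -q apache
branch=$(git rev-parse --abbrev-ref HEAD)
missing=$(git rev-list --count "$branch..apache/$branch")
echo "commits missing from mirror: $missing"
git log --oneline "$branch..apache/$branch"
```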

What's the best way to fix this issue? Should I create a JIRA?

Thank you.

-- 
Luciano Resende
http://people.apache.org/~lresende
http://twitter.com/lresende1975
http://lresende.blogspot.com/


Re: Runtime package refactoring

2015-12-05 Thread Luciano Resende
On Fri, Dec 4, 2015 at 5:16 PM, Matthias Boehm wrote:

>
>
> Hi all,
>
> just a quick heads-up, I'd like to do a refactoring of our runtime package.
> The goals are (1) to separate out all mr-related classes (cleanup), and (2)
> to prepare our core matrix block runtime for packaging as an individual jar
> which would make it consumable as a small-footprint library. I intend to
> make this change mid next week.
>
> Similar to the refactoring from 'com.ibm.bi.dml' to 'org.apache.sysml',
> this change would break binary compatibility with existing datasets in
> binary format because the class names are persistent in the sequence file
> headers. A workaround is to use an old jar to convert your data from the
> old binary format to text, and a new jar to convert the text representation
> to the new binary format.
>
> Here is the proposed package structure:
>
> org.apache.sysml.runtime
> --controlprogram [...]
> --core
>     matrix
>     funobj
>     operators
> --instructions [...]
> --io
> --mapred
>     data
>     hadoopfix
>     jobs
>     tasks
>     sort
> --parfor [...]
> --transform
> --util
>

I am assuming these changes are all under org.apache.sysml.runtime.


>
> Given this structure we could simply package 'core'/'util' and perhaps 'io'
> into a separate jar.
>
>
A few questions:

- What would be the side effects on integration with the different
runtimes (MR/Spark)?
- Is this just a local build modularization issue, and are we still
planning to generate ONE distribution assembly?


>
> Regards,
> Matthias
>

Also, as we experienced multiple issues with the previous package
refactoring, I would recommend the following:

- Perform the refactor on your own fork (not on apache git)
- Move the files as one git commit
- Do all the file content changes as a second git commit (imports, docs,
javadocs, etc.)
- Run a full build to make sure there are no breakages
- Let the team review to make sure we are not losing history on the files
or something similar.
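
For the review step, one possible check is whether git reports the move commit as pure renames. The sketch below uses a scratch repository; the paths and the file name SortJob.java are hypothetical and chosen only for illustration.

```shell
#!/bin/sh
# Sketch: confirm that a move-only commit is detected as a 100% rename.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "reviewer@example.com"
git config user.name "Example Reviewer"

# Hypothetical file standing in for an MR-related class.
mkdir -p runtime/matrix
printf 'class SortJob {}\n' > runtime/matrix/SortJob.java
git add .
git commit -q -m "Initial layout"

# Move-only commit, as recommended in the steps above.
mkdir -p runtime/mapred/sort
git mv runtime/matrix/SortJob.java runtime/mapred/sort/SortJob.java
git commit -q -m "Move sort classes to mapred (no content changes)"

# With rename detection (-M), the first column shows R plus the similarity score.
status=$(git diff -M --name-status HEAD~1 HEAD | cut -f1)
echo "rename status: $status"
```

A status such as D followed by A, or a low similarity score, would suggest that content edits were mixed into the move commit.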

Thank you

-- 
Luciano Resende
http://people.apache.org/~lresende
http://twitter.com/lresende1975
http://lresende.blogspot.com/