I only looked at replacing Preconditions with asserts and found a bunch of other stuff from the Google common (Guava) package, so I held off.
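For reference, the kind of swap I mean looks roughly like this. This is a hypothetical sketch, not actual Mahout code (`factorial` is made up for illustration); one caveat is that plain asserts are skipped unless the JVM runs with `-ea`, so callers relying on the old `IllegalArgumentException` would see a behavior change:

```java
// Hypothetical sketch (not actual Mahout code) of the Preconditions-to-assert swap.
public class PreconditionsSwap {

  static long factorial(long n) {
    // Before (Guava):
    //   Preconditions.checkArgument(n >= 0, "n must be non-negative: %s", n);
    // After (plain Java). Asserts are only evaluated when the JVM runs with -ea,
    // unlike checkArgument, which always throws IllegalArgumentException:
    assert n >= 0 : "n must be non-negative: " + n;
    long result = 1;
    for (long i = 2; i <= n; i++) {
      result *= i;
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(factorial(5)); // prints 120
  }
}
```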
On Tuesday, May 19, 2015, Suneel Marthi <[email protected]> wrote:

> I had tried minimizing the Guava dependency to a large extent in the run up
> to 0.10.0. It's not as trivial as it seems: there are parts of the code
> (Collocations, lucene2seq, Lucene TokenStream processing and tokenization
> code) that are heavily reliant on AbstractIterator, and there are sections
> of the code that assign a HashSet to a List (again, one has to use Guava for
> that if one wants to avoid writing boilerplate for doing the same).
>
> Moreover, things that return something like Iterable<?> and need to be
> converted into a regular collection can easily be done using Guava without
> writing our own boilerplate again.
>
> Are we replacing all Preconditions by straight asserts now?
>
>
> On Tue, May 19, 2015 at 11:21 AM, Pat Ferrel <[email protected]> wrote:
>
> > We need to move to Spark 1.3 ASAP and set the stage for beyond 1.3. The
> > primary reason is that the big distros are there already or will be very
> > soon. Many people using Mahout will have the environment they must use
> > dictated by support orgs in their companies, so our current position of
> > running only on Spark 1.1.1 means many potential users are out of luck.
> >
> > Here are the problems I know of in moving Mahout ahead on Spark:
> > 1) Guava in any backend code (executor closures) relies on being
> > serialized with JavaSerializer, which is broken and hasn't been fixed in
> > 1.2+. There is a workaround, which involves moving a Guava jar to all
> > Spark workers, but that is unacceptable in many cases. Guava in the
> > Spark-1.2 PR has been removed from Scala code and will be pushed to the
> > master probably this week. That leaves a bunch of uses of Guava in java
> > math and hdfs. Andrew has (I think) removed the Preconditions and
> > replaced them with asserts, but there remain some uses of Map and
> > AbstractIterator from Guava. Not sure how many of these remain, but if
> > anyone can help please check here:
> > https://issues.apache.org/jira/browse/MAHOUT-1708
> > 2) The Mahout shell relies on APIs not available in Spark 1.3.
> > 3) The API for writing to sequence files now requires implicit values
> > that are not available in the current code. I think Andy did a temp fix
> > to write to object files, but this is probably not what we want to
> > release.
> >
> > I for one would dearly love to see Mahout 0.10.1 support Spark 1.3+, and
> > soon. This is a call for help in cleaning these things up. Even with no
> > new features, the above things would make Mahout much more usable in
> > current environments.
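For what it's worth, the two Guava pain points Suneel mentions can be written in plain Java. This is a hypothetical sketch (`evensBelow` and `toList` are made-up names, not Mahout code): a hand-rolled `java.util.Iterator` standing in for `AbstractIterator`, and the copy loop that `Lists.newArrayList(Iterable)` otherwise hides:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical sketch of two Guava-removal patterns:
// (1) a plain java.util.Iterator in place of Guava's AbstractIterator, and
// (2) copying an Iterable<?> into a List without Lists.newArrayList(iterable).
public class GuavaRemovalSketch {

  // (1) A hand-rolled iterator over the even numbers below a limit,
  // standing in for code that previously extended AbstractIterator.
  static Iterator<Integer> evensBelow(final int limit) {
    return new Iterator<Integer>() {
      private int next = 0;

      @Override
      public boolean hasNext() {
        return next < limit;
      }

      @Override
      public Integer next() {
        if (!hasNext()) {
          throw new NoSuchElementException();
        }
        int value = next;
        next += 2;
        return value;
      }
    };
  }

  // (2) The boilerplate that Guava's Lists.newArrayList(Iterable) hides:
  // a simple copy loop into an ArrayList.
  static <T> List<T> toList(Iterable<T> iterable) {
    List<T> result = new ArrayList<T>();
    for (T item : iterable) {
      result.add(item);
    }
    return result;
  }

  public static void main(String[] args) {
    Iterable<Integer> evens = new Iterable<Integer>() {
      @Override
      public Iterator<Integer> iterator() {
        return evensBelow(7);
      }
    };
    System.out.println(toList(evens)); // prints [0, 2, 4, 6]
  }
}
```

The tradeoff is a few extra lines per call site, but no shaded jar or serialization surprises in executor closures.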
