Re: Contributing to MLlib: Proposal for Clustering Algorithms

2014-07-19 Thread Jeremy Freeman
Hi RJ, that sounds like a great idea. I'd be happy to look over what you put
together.

-- Jeremy





Re: small (yet major) change going in: broadcasting RDD to reduce task size

2014-07-19 Thread Reynold Xin
Thanks :)

FYI the pull request has been merged and will be part of Spark 1.1.0.



On Thu, Jul 17, 2014 at 11:09 AM, Nicholas Chammas
<nicholas.cham...@gmail.com> wrote:

> On Thu, Jul 17, 2014 at 1:23 AM, Stephen Haberman
> <stephen.haber...@gmail.com> wrote:
>
>> I'd be ecstatic if more major changes were this well/succinctly
>> explained
>
> Ditto on that. The summary of user impact was very nice. It would be good
> to repeat that on the user list or release notes when this change goes out.
>
> Nick



Master compilation with sbt

2014-07-19 Thread Debasish Das
Hi,

Is sbt still used to compile master? I was able to build for
2.3.0-cdh5.0.2 with Maven by following the instructions on the website:

http://spark.apache.org/docs/latest/building-with-maven.html

But when I try to use sbt for local testing, I get some weird errors.
Is sbt still used by developers? I am using JDK 7.

org.xml.sax.SAXParseException; lineNumber: 4; columnNumber: 57; Element
type "settings" must be followed by either attribute specifications, ">" or
"/>".
    at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:198)
    at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:177)
    at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:441)
    at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:368)
    at com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(XMLScanner.java:1436)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.seekCloseOfStartTag(XMLDocumentFragmentScannerImpl.java:1394)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanStartElement(XMLDocumentFragmentScannerImpl.java:1327)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$ContentDriver.scanRootElementHook(XMLDocumentScannerImpl.java:1292)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:3122)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(XMLDocumentScannerImpl.java:880)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:606)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:510)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:848)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:777)
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1213)
    at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:649)
    at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl.parse(SAXParserImpl.java:333)
    at scala.xml.factory.XMLLoader$class.loadXML(XMLLoader.scala:40)
    at scala.xml.XML$.loadXML(XML.scala:57)
    at scala.xml.factory.XMLLoader$class.load(XMLLoader.scala:52)
    at scala.xml.XML$.load(XML.scala:57)
    at com.typesafe.sbt.pom.MavenHelper$$anonfun$settingsXml$1.apply(MavenHelper.scala:225)
    at com.typesafe.sbt.pom.MavenHelper$$anonfun$settingsXml$1.apply(MavenHelper.scala:224)
    at sbt.Using.apply(Using.scala:25)
    at com.typesafe.sbt.pom.MavenHelper$.settingsXml(MavenHelper.scala:224)
    at com.typesafe.sbt.pom.MavenHelper$.settingsXmlServers(MavenHelper.scala:245)
    at com.typesafe.sbt.pom.MavenHelper$.createSbtCredentialsFromSettingsXml(MavenHelper.scala:291)
    at com.typesafe.sbt.pom.MavenHelper$$anonfun$pullSettingsFromPom$10.apply(MavenHelper.scala:83)
    at com.typesafe.sbt.pom.MavenHelper$$anonfun$pullSettingsFromPom$10.apply(MavenHelper.scala:83)
    at sbt.Scoped$RichInitialize$$anonfun$map$1$$anonfun$apply$3.apply(Structure.scala:177)
    at sbt.std.Transform$$anon$3$$anonfun$apply$2.apply(System.scala:45)
    at sbt.std.Transform$$anon$3$$anonfun$apply$2.apply(System.scala:45)
    at sbt.std.Transform$$anon$4.work(System.scala:64)
    at sbt.Execute$$anonfun$submit$1$$anonfun$apply$1.apply(Execute.scala:237)
    at sbt.Execute$$anonfun$submit$1$$anonfun$apply$1.apply(Execute.scala:237)
    at sbt.ErrorHandling$.wideConvert(ErrorHandling.scala:18)
    at sbt.Execute.work(Execute.scala:244)
    at sbt.Execute$$anonfun$submit$1.apply(Execute.scala:237)
    at sbt.Execute$$anonfun$submit$1.apply(Execute.scala:237)
    at sbt.ConcurrentRestrictions$$anon$4$$anonfun$1.apply(ConcurrentRestrictions.scala:160)
    at sbt.CompletionService$$anon$2.call(CompletionService.scala:30)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

Thanks.

Deb
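
The frames in this trace run through MavenHelper.settingsXml and
createSbtCredentialsFromSettingsXml, which suggests the parse failure
happens while sbt reads a Maven settings file rather than in Spark's own
sources. Below is a minimal sketch for checking whether such a file is
well-formed XML. The path ~/.m2/settings.xml is an assumption (the thread
does not say which file was being read), and check_xml is a hypothetical
helper name, not part of sbt or Maven.

```python
import os
import xml.etree.ElementTree as ET


def check_xml(path):
    """Return None if the file parses as XML, else (line, column, message)."""
    try:
        ET.parse(path)
    except ET.ParseError as err:
        # position is the (line, column) pair reported by the parser
        line, column = err.position
        return line, column, str(err)
    return None


if __name__ == "__main__":
    # Assumed location; the thread does not name the file sbt was reading.
    settings = os.path.expanduser("~/.m2/settings.xml")
    if not os.path.exists(settings):
        print("no settings file at", settings)
    else:
        problem = check_xml(settings)
        if problem is None:
            print("well-formed")
        else:
            print("parse error at line %d, column %d: %s" % problem)
```

If the check reports an error near the position given in the exception
(line 4 here), the opening settings tag in that file would be the first
place to look.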


Re: Master compilation with sbt

2014-07-19 Thread Mark Hamstra
> project mllib
.
.
.
> clean
.
.
.
> compile
.
.
.
> test

...all works fine for me @2a732110d46712c535b75dd4f5a73761b6463aa8


On Sat, Jul 19, 2014 at 11:10 AM, Debasish Das <debasish.da...@gmail.com>
wrote:

> I am at the reservoir sampling commit:
>
> commit 586e716e47305cd7c2c3ff35c0e828b63ef2f6a8
> Author: Reynold Xin <r...@apache.org>
> Date:   Fri Jul 18 12:41:50 2014 -0700
>
> sbt/sbt -Dhttp.nonProxyHosts=132.197.10.21
>
> > project mllib
>
> [info] Set current project to spark-mllib (in build
> file:/Users/v606014/spark-master/)
>
> > compile
>
> [trace] Stack trace suppressed: run last mllib/*:credentials for the full
> output.
>
> [trace] Stack trace suppressed: run last core/*:credentials for the full
> output.
>
> [error] (mllib/*:credentials) org.xml.sax.SAXParseException; lineNumber: 4;
> columnNumber: 57; Element type "settings" must be followed by either
> attribute specifications, ">" or "/>".
>
> [error] (core/*:credentials) org.xml.sax.SAXParseException; lineNumber: 4;
> columnNumber: 57; Element type "settings" must be followed by either
> attribute specifications, ">" or "/>".
>
> [error] Total time: 0 s, completed Jul 19, 2014 6:09:24 PM

Re: Master compilation with sbt

2014-07-19 Thread Chester Chen
Works for me as well:


git branch
  branch-0.9
  branch-1.0
* master

Chesters-MacBook-Pro:spark chester$ git pull --rebase
remote: Counting objects: 578, done.
remote: Compressing objects: 100% (369/369), done.
remote: Total 578 (delta 122), reused 418 (delta 71)
Receiving objects: 100% (578/578), 432.42 KiB | 354.00 KiB/s, done.
Resolving deltas: 100% (122/122), done.
From https://github.com/apache/spark
   9c24974..2a73211  master     -> origin/master
   8e5604b..c93f4a0  branch-0.9 -> origin/branch-0.9
   0b0b895..7611840  branch-1.0 -> origin/branch-1.0
From https://github.com/apache/spark
 * [new tag]         v0.9.2-rc1 -> v0.9.2-rc1
First, rewinding head to replay your work on top of it...
Fast-forwarded master to 2a732110d46712c535b75dd4f5a73761b6463aa8.

Chesters-MacBook-Pro:spark chester$ sbt/sbt package

[info] Done packaging.
[success] Total time: 146 s, completed Jul 19, 2014 1:08:52 PM






Pull requests will be automatically linked to JIRA when submitted

2014-07-19 Thread Patrick Wendell
Just a small note, today I committed a tool that will automatically
mirror pull requests to JIRA issues, so contributors will no longer
have to manually post a pull request on the JIRA when they make one.

It will create a link on the JIRA and also make a comment to trigger
an e-mail to people watching.

This should make some things easier, such as avoiding accidental
duplicate effort on the same JIRA.

- Patrick
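
The message does not say how the tool decides which JIRA issue a pull
request belongs to. One plausible approach, sketched here purely as an
assumption and not as a description of the committed tool, is to look for
an issue key of the form SPARK-1234 in the pull request title. The function
name jira_keys and the example titles are hypothetical.

```python
import re

# Assumed key format: the project name "SPARK", a hyphen, then digits.
JIRA_KEY = re.compile(r"\b(SPARK-\d+)\b")


def jira_keys(pr_title):
    """Return the distinct SPARK-NNNN keys in a title, in order of appearance."""
    seen = []
    for key in JIRA_KEY.findall(pr_title):
        if key not in seen:
            seen.append(key)
    return seen


if __name__ == "__main__":
    # Made-up titles for illustration only.
    print(jira_keys("[SPARK-1234] Example change"))
    print(jira_keys("Fix typo in docs"))
```

A title with no recognizable key yields an empty list, in which case a
mirroring tool would have nothing to link.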