Re: Building Spark in Eclipse Kepler

2014-08-07 Thread Sean Owen
(Don't use gen-idea, just open it directly as a Maven project in IntelliJ.) On Thu, Aug 7, 2014 at 4:53 AM, Ron Gonzalez zlgonza...@yahoo.com.invalid wrote: So I downloaded community edition of IntelliJ, and ran sbt/sbt gen-idea. I then imported the pom.xml file. I'm still getting all sorts of

Re: Documentation confusing or incorrect for decision trees?

2014-08-07 Thread Sean Owen
It's definitely just a typo. The ordered categories are A, C, B so the other split can't be A | B, C. Just open a PR. On Thu, Aug 7, 2014 at 2:11 AM, Matt Forbes m...@tellapart.com wrote: I found the section on ordering categorical features really interesting, but the A, B, C example seemed
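To make the point about ordered categories concrete, here is a minimal sketch (not MLlib's code; the function name `ordered_splits` is hypothetical). It assumes the categories have already been placed in order, and it only enumerates the cut points along that ordering. How the ordering itself is computed is outside the scope of this sketch.

```python
def ordered_splits(ordered_categories):
    """Return the candidate binary splits for categories that are already ordered.

    With k ordered categories, only the k - 1 cut points along the ordering
    are considered, rather than every possible subset of categories.
    """
    splits = []
    for cut in range(1, len(ordered_categories)):
        left = ordered_categories[:cut]    # everything before the cut point
        right = ordered_categories[cut:]   # everything from the cut point on
        splits.append((left, right))
    return splits


# Ordering taken from the reply above: A, C, B.
for left, right in ordered_splits(["A", "C", "B"]):
    print(", ".join(left), "|", ", ".join(right))
# prints:
# A | C, B
# A, C | B
```

With the ordering A, C, B, the split A | B, C never appears, which is why it has to be a typo.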

[SNAPSHOT] Snapshot1 of Spark 1.1.0 has been posted

2014-08-07 Thread Patrick Wendell
Hi All, I've packaged and published a snapshot release of Spark 1.1 for testing. This is being distributed to the community for QA and preview purposes. It is not yet an official RC for voting. Going forward, we'll do preview releases like this for testing ahead of official votes. The tag of

Re: [SNAPSHOT] Snapshot1 of Spark 1.1.0 has been posted

2014-08-07 Thread Patrick Wendell
Minor correction: the encoded URL in the staging repo link was wrong. The correct repo is: https://repository.apache.org/content/repositories/orgapachespark-1025/ On Wed, Aug 6, 2014 at 11:23 PM, Patrick Wendell pwend...@gmail.com wrote: Hi All, I've packaged and published a snapshot release

Re: Building Spark in Eclipse Kepler

2014-08-07 Thread Madhu
Ron, I was able to build core in Eclipse following these steps: https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark#ContributingtoSpark-Eclipse I was working only on core, so I know that works in Eclipse Juno. I haven't tried yarn or other Eclipse releases. Are you able to

Re: Unit test best practice for Spark-derived projects

2014-08-07 Thread Madhu
How long does it take to get a Spark context? I found that if you don't have a network connection (most likely because of a reverse DNS lookup), it can take up to 30 seconds to start up locally. I think a hosts file entry is sufficient. -- Madhu https://www.linkedin.com/in/msiddalingaiah
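One rough way to check whether name resolution is the slow step is to time a lookup of the machine's own hostname outside of Spark. The sketch below is only a diagnostic aid under an assumption: it times a forward lookup of the local hostname, whereas the thread never pins down exactly which lookup is slow. The function name `time_local_hostname_lookup` and its `resolve` parameter are hypothetical.

```python
import socket
import time


def time_local_hostname_lookup(resolve=socket.gethostbyname):
    """Time a lookup of this machine's own hostname.

    Returns (hostname, elapsed_seconds, address), where address is None
    if the hostname could not be resolved.
    """
    hostname = socket.gethostname()
    start = time.perf_counter()
    try:
        address = resolve(hostname)
    except OSError:
        # socket.gaierror is a subclass of OSError; treat it as "unresolved".
        address = None
    elapsed = time.perf_counter() - start
    return hostname, elapsed, address


if __name__ == "__main__":
    name, seconds, addr = time_local_hostname_lookup()
    print(f"{name} -> {addr} in {seconds:.3f}s")
```

If the lookup takes seconds rather than milliseconds, or fails, that is consistent with the hypothesis, and a hosts file entry for the printed hostname would be the thing to try.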

Re: Unit test best practice for Spark-derived projects

2014-08-07 Thread Dmitriy Lyubimov
Thanks. Let me check this hypothesis (I have a DHCP connection on a private net, so I'm not sure whether there's a reverse lookup). On Thu, Aug 7, 2014 at 10:29 AM, Madhu ma...@madhu.com wrote: How long does it take to get a spark context? I found that if you don't have a network connection

Re: replacement for SPARK_JAVA_OPTS

2014-08-07 Thread Cody Koeninger
Just wanted to check in on this, see if I should file a bug report regarding the mesos argument propagation. On Thu, Jul 31, 2014 at 8:35 AM, Cody Koeninger c...@koeninger.org wrote: 1. I've tried with and without escaping equals sign, it doesn't affect the results. 2. Yeah, exporting

Re: replacement for SPARK_JAVA_OPTS

2014-08-07 Thread Marcelo Vanzin
Andrew has been working on a fix: https://github.com/apache/spark/pull/1770 On Thu, Aug 7, 2014 at 2:35 PM, Cody Koeninger c...@koeninger.org wrote: Just wanted to check in on this, see if I should file a bug report regarding the mesos argument propagation. On Thu, Jul 31, 2014 at 8:35 AM,

Re: Unit test best practice for Spark-derived projects

2014-08-07 Thread Patrick Wendell
In the past I've found if I do a jstack when running some tests, it sits forever inside of a hostname resolution step or something. I never narrowed it down, though. - Patrick On Thu, Aug 7, 2014 at 10:45 AM, Dmitriy Lyubimov dlie...@gmail.com wrote: Thanks. let me check this hypothesis (i

Re: Building Spark in Eclipse Kepler

2014-08-07 Thread Ron Gonzalez
So I opened it as a maven project (I opened it using the top-level pom.xml file), but rebuilding the project ends up in all sorts of errors about unresolved dependencies. Thanks, Ron On Wednesday, August 6, 2014 11:15 PM, Sean Owen so...@cloudera.com wrote: (Don't use gen-idea, just open

Re: replacement for SPARK_JAVA_OPTS

2014-08-07 Thread Gary Malouf
Can this be cherry-picked for 1.1 if everything works out? In my opinion, it could be qualified as a bug fix. On Thu, Aug 7, 2014 at 5:47 PM, Marcelo Vanzin van...@cloudera.com wrote: Andrew has been working on a fix: https://github.com/apache/spark/pull/1770 On Thu, Aug 7, 2014 at 2:35

Re: replacement for SPARK_JAVA_OPTS

2014-08-07 Thread Andrew Or
Thanks Marcelo, I have moved the changes to a new PR to describe the problems more clearly: https://github.com/apache/spark/pull/1845 @Gary Yeah, the goal is to get this into 1.1 as a bug fix. 2014-08-07 17:30 GMT-07:00 Gary Malouf malouf.g...@gmail.com: Can this be cherry-picked for 1.1 if

Re: Fine-Grained Scheduler on Yarn

2014-08-07 Thread Jun Feng Liu
Does anyone know the answer? Best Regards Jun Feng Liu IBM China Systems Technology Laboratory in Beijing Phone: 86-10-82452683 E-mail: liuj...@cn.ibm.com BLD 28,ZGC Software Park No.8 Rd.Dong Bei Wang West, Dist.Haidian Beijing 100193 China Jun Feng Liu/China/IBM 2014/08/07

Re: replacement for SPARK_JAVA_OPTS

2014-08-07 Thread Andrew Or
@Cody I took a quick glance at the Mesos code and it appears that we currently do not even pass extra java options to executors except in coarse grained mode, and even in this mode we do not pass them to executors correctly. I have filed a related JIRA here:

Re: replacement for SPARK_JAVA_OPTS

2014-08-07 Thread Patrick Wendell
Andrew - I think your JIRA may duplicate existing work: https://github.com/apache/spark/pull/1513 On Thu, Aug 7, 2014 at 7:55 PM, Andrew Or and...@databricks.com wrote: @Cody I took a quick glance at the Mesos code and it appears that we currently do not even pass extra java options to

Re: Fine-Grained Scheduler on Yarn

2014-08-07 Thread Patrick Wendell
The current YARN is equivalent to what is called fine grained mode in Mesos. The scheduling of tasks happens totally inside of the Spark driver. On Thu, Aug 7, 2014 at 7:50 PM, Jun Feng Liu liuj...@cn.ibm.com wrote: Any one know the answer? Best Regards *Jun Feng Liu* IBM China Systems

Re: Fine-Grained Scheduler on Yarn

2014-08-07 Thread Patrick Wendell
Hey, sorry about that - what I said was the opposite of what is true. The current YARN mode is equivalent to coarse-grained mode in Mesos. There is no fine-grained scheduling on YARN at the moment. I'm not sure YARN supports scheduling in units other than containers. Fine-grained scheduling requires

Re: replacement for SPARK_JAVA_OPTS

2014-08-07 Thread Andrew Or
Ah, great to know this is already being fixed. Thanks Patrick, I have marked my JIRA as a duplicate. 2014-08-07 21:42 GMT-07:00 Patrick Wendell pwend...@gmail.com: Andrew - I think your JIRA may duplicate existing work: https://github.com/apache/spark/pull/1513 On Thu, Aug 7, 2014 at 7:55

Re: Fine-Grained Scheduler on Yarn

2014-08-07 Thread Jun Feng Liu
Thanks for the reply on this. Is it possible to adjust resources based on the number of containers? E.g., to allocate more containers when the driver needs more resources, and to return some resources by deleting containers when some of the containers already have enough cores/memory? Best Regards Jun Feng Liu IBM China