Re: Compile error with XML elements

2014-08-29 Thread Patrick Wendell
In some cases IntelliJ's Scala compiler can't compile valid Scala
source files. Hopefully they fix (or have fixed) this in a newer
version.
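For reference, here is a minimal, self-contained example of the kind of XML
literal involved (names are illustrative, not taken from HistoryPage.scala);
scalac 2.10 compiles it fine, while affected IntelliJ Scala plugin versions
flag the $scope error on the literal:

    object XmlLiteralDemo {
      // A Scala XML literal, like the ones used by Spark's web UI pages.
      val content =
        <div class="row-fluid">
          <span>history</span>
        </div>

      def main(args: Array[String]): Unit = println(content)
    }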

- Patrick

On Fri, Aug 29, 2014 at 11:38 AM, Yi Tian  wrote:
> Hi, Devl!
>
> I got the same problem.
>
> You can try upgrading your Scala plugin to 0.41.2.
>
> It works on my Mac.
>
> On Aug 12, 2014, at 15:19, Devl Devel  wrote:
>
>> When compiling the master checkout of Spark, the IntelliJ compile fails
>> with:
>>
>>Error:(45, 8) not found: value $scope
>>  
>>   ^
>> which is caused by HTML elements in classes like HistoryPage.scala:
>>
>>val content =
>>  
>>...
>>
>> How can I compile these classes that have html node elements in them?
>>
>> Thanks in advance.
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
> For additional commands, e-mail: dev-h...@spark.apache.org
>

-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org



Need to check approach for continuing development on Spark

2014-08-29 Thread smalpani
Hi,

We are developing a Spring-based app that uses Cassandra and calls the
DataStax APIs from Java to query it. An internal library is responsible for
calling Cassandra and other data sources such as RDS. From Spark, we call
several client APIs provided by that client jar to perform operations on the
data, such as:
1. Reading data from S3 and inserting it into Cassandra: we hand objects to
the API, which then stores them in Cassandra internally.
2. Fetching the data back from Cassandra through the API as objects,
processing it to generate metrics, and saving those metrics to Cassandra,
again through the APIs only.
3. Internally, through those same APIs, computing aggregates, separating the
data into bands, and so on.

The whole project is driven by Spring. Please let me know whether our
approach is sound.
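For concreteness, here is a rough sketch of steps 1 and 2 above, using the
open-source spark-cassandra-connector as a stand-in for the internal client
jar (an assumption for illustration only; the keyspace, table, column, and
path names are also made up):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.SparkContext._
    import com.datastax.spark.connector._

    object MetricsPipeline {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("metrics-pipeline"))

        // 1. Read raw CSV records from S3 and store them in Cassandra.
        val raw = sc.textFile("s3n://my-bucket/input/")
          .map(_.split(","))
          .map(fields => (fields(0), fields(1).toDouble))
        raw.saveToCassandra("my_ks", "raw_events", SomeColumns("id", "value"))

        // 2. Read the data back, aggregate it into metrics, and save those.
        val metrics = sc.cassandraTable("my_ks", "raw_events")
          .map(row => (row.getString("id"), row.getDouble("value")))
          .reduceByKey(_ + _)
        metrics.saveToCassandra("my_ks", "metrics", SomeColumns("id", "total"))
      }
    }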




--
View this message in context: 
http://apache-spark-developers-list.1001551.n3.nabble.com/Need-to-check-approach-for-continuing-development-on-Spark-tp8142.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org



Re: [VOTE] Release Apache Spark 1.1.0 (RC2)

2014-08-29 Thread Jeremy Freeman
+1. Validated several custom analysis pipelines on a private cluster in
standalone mode. Tested new PySpark support for arbitrary Hadoop input
formats; works great!

-- Jeremy



--
View this message in context: 
http://apache-spark-developers-list.1001551.n3.nabble.com/VOTE-Release-Apache-Spark-1-1-0-RC2-tp8107p8143.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org



Re: [VOTE] Release Apache Spark 1.1.0 (RC2)

2014-08-29 Thread Cheng Lian
Just noticed one thing: although --with-hive is deprecated in favor of -Phive,
make-distribution.sh still relies on $SPARK_HIVE (which was controlled by
--with-hive) to determine whether to include the datanucleus jar files. This
means we have to do something like SPARK_HIVE=true ./make-distribution.sh
... to enable Hive support; otherwise the datanucleus jars are not included
in lib/.

This issue is similar to SPARK-3234: both SPARK_HADOOP_VERSION and
SPARK_HIVE are controlled by deprecated command line options.


On Fri, Aug 29, 2014 at 11:18 AM, Patrick Wendell 
wrote:

> Oh darn - I missed this update. GRR, unfortunately I think this means
> I'll need to cut a new RC. Thanks for catching this Nick.
>
> On Fri, Aug 29, 2014 at 10:18 AM, Nicholas Chammas
>  wrote:
> > [Let me know if I should be posting these comments in a different
> thread.]
> >
> > Should the default Spark version in spark-ec2 be updated for this
> release?
> >
> > Nick
> >
> >
> >
> > On Fri, Aug 29, 2014 at 12:55 PM, Patrick Wendell 
> > wrote:
> >>
> >> Hey Nicholas,
> >>
> >> Thanks for this, we can merge in doc changes outside of the actual
> >> release timeline, so we'll make sure to loop those changes in before
> >> we publish the final 1.1 docs.
> >>
> >> - Patrick
> >>
> >> On Fri, Aug 29, 2014 at 9:24 AM, Nicholas Chammas
> >>  wrote:
> >> > There were several formatting and typographical errors in the SQL docs
> >> > that
> >> > I've fixed in this PR. Dunno if we want to roll that into the release.
> >> >
> >> >
> >> > On Fri, Aug 29, 2014 at 12:17 PM, Patrick Wendell  >
> >> > wrote:
> >> >>
> >> >> Okay I'll plan to add cdh4 binary as well for the final release!
> >> >>
> >> >> ---
> >> >> sent from my phone
> >> >> On Aug 29, 2014 8:26 AM, "Ye Xianjin"  wrote:
> >> >>
> >> >> > We just used CDH 4.7 for our production cluster. And I believe we
> >> >> > won't
> >> >> > use CDH 5 in the next year.
> >> >> >
> >> >> > Sent from my iPhone
> >> >> >
> >> >> > > On Aug 29, 2014, at 14:39, Matei Zaharia 
> >> >> > > wrote:
> >> >> > >
> >> >> > > Personally I'd actually consider putting CDH4 back if there are
> >> >> > > still
> >> >> > users on it. It's always better to be inclusive, and the
> convenience
> >> >> > of
> >> >> > a
> >> >> > one-click download is high. Do we have a sense on what % of CDH
> users
> >> >> > still
> >> >> > use CDH4?
> >> >> > >
> >> >> > > Matei
> >> >> > >
> >> >> > > On August 28, 2014 at 11:31:13 PM, Sean Owen (so...@cloudera.com
> )
> >> >> > > wrote:
> >> >> > >
> >> >> > > (Copying my reply since I don't know if it goes to the mailing
> >> >> > > list)
> >> >> > >
> >> >> > > Great, thanks for explaining the reasoning. You're saying these
> >> >> > > aren't
> >> >> > > going into the final release? I think that moots any issue
> >> >> > > surrounding
> >> >> > > distributing them then.
> >> >> > >
> >> >> > > This is all I know of from the ASF:
> >> >> > > https://community.apache.org/projectIndependence.html I don't
> read
> >> >> > > it
> >> >> > > as expressly forbidding this kind of thing although you can see
> how
> >> >> > > it
> >> >> > > bumps up against the spirit. There's not a bright line -- what
> >> >> > > about
> >> >> > > Tomcat providing binaries compiled for Windows for example? does
> >> >> > > that
> >> >> > > favor an OS vendor?
> >> >> > >
> >> >> > > From this technical ASF perspective only the releases matter --
> do
> >> >> > > what you want with snapshots and RCs. The only issue there is
> maybe
> >> >> > > releasing something different than was in the RC; is that at all
> >> >> > > confusing? Just needs a note.
> >> >> > >
> >> >> > > I think this theoretical issue doesn't exist if these binaries
> >> >> > > aren't
> >> >> > > released, so I see no reason to not proceed.
> >> >> > >
> >> >> > > The rest is a different question about whether you want to spend
> >> >> > > time
> >> >> > > maintaining this profile and candidate. The vendor already
> manages
> >> >> > > their build I think and -- and I don't know -- may even prefer
> not
> >> >> > > to
> >> >> > > have a different special build floating around. There's also the
> >> >> > > theoretical argument that this turns off other vendors from
> >> >> > > adopting
> >> >> > > Spark if it's perceived to be too connected to other vendors. I'd
> >> >> > > like
> >> >> > > to maximize Spark's distribution and there's some argument you do
> >> >> > > this
> >> >> > > by not making vendor profiles. But as I say a different question
> to
> >> >> > > just think about over time...
> >> >> > >
> >> >> > > (oh and PS for my part I think it's a good thing that CDH4
> binaries
> >> >> > > were removed. I wasn't arguing for resurrecting them)
> >> >> > >
> >> >> > >> On Fri, Aug 29, 2014 at 7:26 AM, Patrick Wendell
> >> >> > >> 
> >> >> > wrote:
> >> >> > >> Hey Sean,
> >> >> > >>
> >> >> > >> The reason there are no longer CDH-specific builds is that all
> >> >> > >> newer
> >> >> > >> ver

Re: [VOTE] Release Apache Spark 1.1.0 (RC2)

2014-08-29 Thread Marcelo Vanzin
In our internal projects we use this bit of code in the Maven pom to
create a properties file with build information (sorry for the messy
indentation). Then we have code that reads this properties file
somewhere and provides that info. This should mean we never have to
change version numbers in Scala/Java/Python code again. :-)

Shouldn't be hard to do something like that in sbt (actually should be
much easier).


  
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-antrun-plugin</artifactId>
  <version>1.6</version>
  <executions>
    <execution>
      <id>build-info</id>
      <phase>compile</phase>
      <goals>
        <goal>run</goal>
      </goals>
      <configuration>
        <target>
          <!-- The list archive stripped the tags in this section; the exec
               below is a plausible reconstruction of how ${build.hash}
               gets set. -->
          <exec executable="git" outputproperty="build.hash"
                failifexecutionfails="false">
            <arg value="rev-parse" />
            <arg value="HEAD" />
          </exec>
          <echo file="${project.build.outputDirectory}/build-info.properties">buildRevision: ${build.hash}</echo>
        </target>
      </configuration>
    </execution>
  </executions>
  <dependencies>
    <dependency>
      <groupId>ant-contrib</groupId>
      <artifactId>ant-contrib</artifactId>
      <version>1.0b3</version>
      <exclusions>
        <exclusion>
          <groupId>ant</groupId>
          <artifactId>ant</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
  </dependencies>
</plugin>
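Then the code that reads it back at runtime is just a few lines; a sketch,
assuming the file and property names from the snippet above:

    import java.util.Properties

    object BuildInfo {
      // Load build-info.properties from the classpath; fall back gracefully.
      val revision: String = {
        val props = new Properties()
        val in = getClass.getResourceAsStream("/build-info.properties")
        if (in != null) try props.load(in) finally in.close()
        Option(props.getProperty("buildRevision")).getOrElse("unknown")
      }
    }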

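As a sketch of the sbt route mentioned above (sbt 0.13 syntax; shelling out
to git is an assumption, not from the original mail):

    // build.sbt: generate build-info.properties as a managed resource
    // at compile time, mirroring the Maven snippet above.
    resourceGenerators in Compile += Def.task {
      val out = (resourceManaged in Compile).value / "build-info.properties"
      val rev = scala.sys.process.Process("git rev-parse HEAD").!!.trim
      IO.write(out, s"buildRevision: $rev\n")
      Seq(out)
    }.taskValue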

On Fri, Aug 29, 2014 at 11:43 AM, Nicholas Chammas
 wrote:
> Sounds good. As an FYI, we had this problem with the 1.0.2 release.
> Is there perhaps some
> kind of automated check we can make to catch this for us in the future?
> Where would it go?
>
>
> On Fri, Aug 29, 2014 at 2:18 PM, Patrick Wendell  wrote:
>
>> Oh darn - I missed this update. GRR, unfortunately I think this means
>> I'll need to cut a new RC. Thanks for catching this Nick.
>>
>> On Fri, Aug 29, 2014 at 10:18 AM, Nicholas Chammas
>>  wrote:
>> > [Let me know if I should be posting these comments in a different
>> thread.]
>> >
>> > Should the default Spark version in spark-ec2 be updated for this
>> release?
>> >
>> > Nick
>> >
>> >
>> >
>> > On Fri, Aug 29, 2014 at 12:55 PM, Patrick Wendell 
>> > wrote:
>> >>
>> >> Hey Nicholas,
>> >>
>> >> Thanks for this, we can merge in doc changes outside of the actual
>> >> release timeline, so we'll make sure to loop those changes in before
>> >> we publish the final 1.1 docs.
>> >>
>> >> - Patrick
>> >>
>> >> On Fri, Aug 29, 2014 at 9:24 AM, Nicholas Chammas
>> >>  wrote:
>> >> > There were several formatting and typographical errors in the SQL docs
>> >> > that
>> >> > I've fixed in this PR. Dunno if we want to roll that into the release.
>> >> >
>> >> >
>> >> > On Fri, Aug 29, 2014 at 12:17 PM, Patrick Wendell > >
>> >> > wrote:
>> >> >>
>> >> >> Okay I'll plan to add cdh4 binary as well for the final release!
>> >> >>
>> >> >> ---
>> >> >> sent from my phone
>> >> >> On Aug 29, 2014 8:26 AM, "Ye Xianjin"  wrote:
>> >> >>
>> >> >> > We just used CDH 4.7 for our production cluster. And I believe we
>> >> >> > won't
>> >> >> > use CDH 5 in the next year.
>> >> >> >
>> >> >> > Sent from my iPhone
>> >> >> >
>> >> >> > > On Aug 29, 2014, at 14:39, Matei Zaharia 
>> >> >> > > wrote:
>> >> >> > >
>> >> >> > > Personally I'd actually consider putting CDH4 back if there are
>> >> >> > > still
>> >> >> > users on it. It's always better to be inclusive, and the
>> convenience
>> >> >> > of
>> >> >> > a
>> >> >> > one-click download is high. Do we have a sense on what % of CDH
>> users
>> >> >> > still
>> >> >> > use CDH4?
>> >> >> > >
>> >> >> > > Matei
>> >> >> > >
>> >> >> > > On August 28, 2014 at 11:31:13 PM, Sean Owen (so...@cloudera.com
>> )
>> >> >> > > wrote:
>> >> >> > >
>> >> >> > > (Copying my reply since I don't know if it goes to the mailing
>> >> >> > > list)
>> >> >> > >
>> >> >> > > Great, thanks for explaining the reasoning. You're saying these
>> >> >> > > aren't
>> >> >> > > going into the final release? I think that moots any issue
>> >> >> > > surrounding
>> >> >> > > distributing them then.
>> >> >> > >
>> >> >> > > This is all I know of from the ASF:
>> >> >> > > https://community.apache.org/projectIndependence.html I don't
>> read
>> >> >> > > it
>> >> >> > > as expressly forbidding this kind of thing although you can see
>> how
>> >> >> > > it
>> >> >> > > bumps up against the spirit. There's not a bright line -- what
>> >> >> > > about
>> >> >> > > Tomcat providing binaries compiled for Windows for example? does
>> >> >> > > that
>> >> >> > > favor an OS vendor?
>> >> >> > >
>> >> >> > > From this technical ASF perspective only the releases matter --
>> do
>> >> >> > > what you want with snapshots and RCs. The only issue there is
>> maybe
>> >> >> > > releasing something different than was in the RC; is that at all
>> >> >> > > confusing? Just needs a note.
>> >> >> > >
>> >> >> > > I think this theoretical issue doesn't exist if these binaries
>> >> >> > > aren't
>> >> >> > > released, so I see no reason to not proceed.
>> >> >> > >
>> >> >> > > The rest is a different question abou

Re: [VOTE] Release Apache Spark 1.1.0 (RC2)

2014-08-29 Thread Nicholas Chammas
Sounds good. As an FYI, we had this problem with the 1.0.2 release.
Is there perhaps some
kind of automated check we can make to catch this for us in the future?
Where would it go?


On Fri, Aug 29, 2014 at 2:18 PM, Patrick Wendell  wrote:

> Oh darn - I missed this update. GRR, unfortunately I think this means
> I'll need to cut a new RC. Thanks for catching this Nick.
>
> On Fri, Aug 29, 2014 at 10:18 AM, Nicholas Chammas
>  wrote:
> > [Let me know if I should be posting these comments in a different
> thread.]
> >
> > Should the default Spark version in spark-ec2 be updated for this
> release?
> >
> > Nick
> >
> >
> >
> > On Fri, Aug 29, 2014 at 12:55 PM, Patrick Wendell 
> > wrote:
> >>
> >> Hey Nicholas,
> >>
> >> Thanks for this, we can merge in doc changes outside of the actual
> >> release timeline, so we'll make sure to loop those changes in before
> >> we publish the final 1.1 docs.
> >>
> >> - Patrick
> >>
> >> On Fri, Aug 29, 2014 at 9:24 AM, Nicholas Chammas
> >>  wrote:
> >> > There were several formatting and typographical errors in the SQL docs
> >> > that
> >> > I've fixed in this PR. Dunno if we want to roll that into the release.
> >> >
> >> >
> >> > On Fri, Aug 29, 2014 at 12:17 PM, Patrick Wendell  >
> >> > wrote:
> >> >>
> >> >> Okay I'll plan to add cdh4 binary as well for the final release!
> >> >>
> >> >> ---
> >> >> sent from my phone
> >> >> On Aug 29, 2014 8:26 AM, "Ye Xianjin"  wrote:
> >> >>
> >> >> > We just used CDH 4.7 for our production cluster. And I believe we
> >> >> > won't
> >> >> > use CDH 5 in the next year.
> >> >> >
> >> >> > Sent from my iPhone
> >> >> >
> >> >> > > On Aug 29, 2014, at 14:39, Matei Zaharia 
> >> >> > > wrote:
> >> >> > >
> >> >> > > Personally I'd actually consider putting CDH4 back if there are
> >> >> > > still
> >> >> > users on it. It's always better to be inclusive, and the
> convenience
> >> >> > of
> >> >> > a
> >> >> > one-click download is high. Do we have a sense on what % of CDH
> users
> >> >> > still
> >> >> > use CDH4?
> >> >> > >
> >> >> > > Matei
> >> >> > >
> >> >> > > On August 28, 2014 at 11:31:13 PM, Sean Owen (so...@cloudera.com
> )
> >> >> > > wrote:
> >> >> > >
> >> >> > > (Copying my reply since I don't know if it goes to the mailing
> >> >> > > list)
> >> >> > >
> >> >> > > Great, thanks for explaining the reasoning. You're saying these
> >> >> > > aren't
> >> >> > > going into the final release? I think that moots any issue
> >> >> > > surrounding
> >> >> > > distributing them then.
> >> >> > >
> >> >> > > This is all I know of from the ASF:
> >> >> > > https://community.apache.org/projectIndependence.html I don't
> read
> >> >> > > it
> >> >> > > as expressly forbidding this kind of thing although you can see
> how
> >> >> > > it
> >> >> > > bumps up against the spirit. There's not a bright line -- what
> >> >> > > about
> >> >> > > Tomcat providing binaries compiled for Windows for example? does
> >> >> > > that
> >> >> > > favor an OS vendor?
> >> >> > >
> >> >> > > From this technical ASF perspective only the releases matter --
> do
> >> >> > > what you want with snapshots and RCs. The only issue there is
> maybe
> >> >> > > releasing something different than was in the RC; is that at all
> >> >> > > confusing? Just needs a note.
> >> >> > >
> >> >> > > I think this theoretical issue doesn't exist if these binaries
> >> >> > > aren't
> >> >> > > released, so I see no reason to not proceed.
> >> >> > >
> >> >> > > The rest is a different question about whether you want to spend
> >> >> > > time
> >> >> > > maintaining this profile and candidate. The vendor already
> manages
> >> >> > > their build I think and -- and I don't know -- may even prefer
> not
> >> >> > > to
> >> >> > > have a different special build floating around. There's also the
> >> >> > > theoretical argument that this turns off other vendors from
> >> >> > > adopting
> >> >> > > Spark if it's perceived to be too connected to other vendors. I'd
> >> >> > > like
> >> >> > > to maximize Spark's distribution and there's some argument you do
> >> >> > > this
> >> >> > > by not making vendor profiles. But as I say a different question
> to
> >> >> > > just think about over time...
> >> >> > >
> >> >> > > (oh and PS for my part I think it's a good thing that CDH4
> binaries
> >> >> > > were removed. I wasn't arguing for resurrecting them)
> >> >> > >
> >> >> > >> On Fri, Aug 29, 2014 at 7:26 AM, Patrick Wendell
> >> >> > >> 
> >> >> > wrote:
> >> >> > >> Hey Sean,
> >> >> > >>
> >> >> > >> The reason there are no longer CDH-specific builds is that all
> >> >> > >> newer
> >> >> > >> versions of CDH and HDP work with builds for the upstream Hadoop
> >> >> > >> projects. I dropped CDH4 in favor of a newer Hadoop version
> (2.4)
> >> >> > >> and
> >> >> > >> the Hadoop-without-Hive (also 2.4) build.
> >> >> > >>
> >> >> > >> For MapR - we can't officially post those artifacts on ASF web
> >> >> > >> space
> >> >

Re: Compile error with XML elements

2014-08-29 Thread Yi Tian
Hi, Devl!

I got the same problem.

You can try upgrading your Scala plugin to 0.41.2.

It works on my Mac.

On Aug 12, 2014, at 15:19, Devl Devel  wrote:

> When compiling the master checkout of Spark, the IntelliJ compile fails
> with:
> 
>Error:(45, 8) not found: value $scope
>  
>   ^
> which is caused by HTML elements in classes like HistoryPage.scala:
> 
>val content =
>  
>...
> 
> How can I compile these classes that have html node elements in them?
> 
> Thanks in advance.


-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org



Re: [VOTE] Release Apache Spark 1.1.0 (RC2)

2014-08-29 Thread Patrick Wendell
Oh darn - I missed this update. GRR, unfortunately I think this means
I'll need to cut a new RC. Thanks for catching this Nick.

On Fri, Aug 29, 2014 at 10:18 AM, Nicholas Chammas
 wrote:
> [Let me know if I should be posting these comments in a different thread.]
>
> Should the default Spark version in spark-ec2 be updated for this release?
>
> Nick
>
>
>
> On Fri, Aug 29, 2014 at 12:55 PM, Patrick Wendell 
> wrote:
>>
>> Hey Nicholas,
>>
>> Thanks for this, we can merge in doc changes outside of the actual
>> release timeline, so we'll make sure to loop those changes in before
>> we publish the final 1.1 docs.
>>
>> - Patrick
>>
>> On Fri, Aug 29, 2014 at 9:24 AM, Nicholas Chammas
>>  wrote:
>> > There were several formatting and typographical errors in the SQL docs
>> > that
>> > I've fixed in this PR. Dunno if we want to roll that into the release.
>> >
>> >
>> > On Fri, Aug 29, 2014 at 12:17 PM, Patrick Wendell 
>> > wrote:
>> >>
>> >> Okay I'll plan to add cdh4 binary as well for the final release!
>> >>
>> >> ---
>> >> sent from my phone
>> >> On Aug 29, 2014 8:26 AM, "Ye Xianjin"  wrote:
>> >>
>> >> > We just used CDH 4.7 for our production cluster. And I believe we
>> >> > won't
>> >> > use CDH 5 in the next year.
>> >> >
>> >> > Sent from my iPhone
>> >> >
>> >> > > On Aug 29, 2014, at 14:39, Matei Zaharia 
>> >> > > wrote:
>> >> > >
>> >> > > Personally I'd actually consider putting CDH4 back if there are
>> >> > > still
>> >> > users on it. It's always better to be inclusive, and the convenience
>> >> > of
>> >> > a
>> >> > one-click download is high. Do we have a sense on what % of CDH users
>> >> > still
>> >> > use CDH4?
>> >> > >
>> >> > > Matei
>> >> > >
>> >> > > On August 28, 2014 at 11:31:13 PM, Sean Owen (so...@cloudera.com)
>> >> > > wrote:
>> >> > >
>> >> > > (Copying my reply since I don't know if it goes to the mailing
>> >> > > list)
>> >> > >
>> >> > > Great, thanks for explaining the reasoning. You're saying these
>> >> > > aren't
>> >> > > going into the final release? I think that moots any issue
>> >> > > surrounding
>> >> > > distributing them then.
>> >> > >
>> >> > > This is all I know of from the ASF:
>> >> > > https://community.apache.org/projectIndependence.html I don't read
>> >> > > it
>> >> > > as expressly forbidding this kind of thing although you can see how
>> >> > > it
>> >> > > bumps up against the spirit. There's not a bright line -- what
>> >> > > about
>> >> > > Tomcat providing binaries compiled for Windows for example? does
>> >> > > that
>> >> > > favor an OS vendor?
>> >> > >
>> >> > > From this technical ASF perspective only the releases matter -- do
>> >> > > what you want with snapshots and RCs. The only issue there is maybe
>> >> > > releasing something different than was in the RC; is that at all
>> >> > > confusing? Just needs a note.
>> >> > >
>> >> > > I think this theoretical issue doesn't exist if these binaries
>> >> > > aren't
>> >> > > released, so I see no reason to not proceed.
>> >> > >
>> >> > > The rest is a different question about whether you want to spend
>> >> > > time
>> >> > > maintaining this profile and candidate. The vendor already manages
>> >> > > their build I think and -- and I don't know -- may even prefer not
>> >> > > to
>> >> > > have a different special build floating around. There's also the
>> >> > > theoretical argument that this turns off other vendors from
>> >> > > adopting
>> >> > > Spark if it's perceived to be too connected to other vendors. I'd
>> >> > > like
>> >> > > to maximize Spark's distribution and there's some argument you do
>> >> > > this
>> >> > > by not making vendor profiles. But as I say a different question to
>> >> > > just think about over time...
>> >> > >
>> >> > > (oh and PS for my part I think it's a good thing that CDH4 binaries
>> >> > > were removed. I wasn't arguing for resurrecting them)
>> >> > >
>> >> > >> On Fri, Aug 29, 2014 at 7:26 AM, Patrick Wendell
>> >> > >> 
>> >> > wrote:
>> >> > >> Hey Sean,
>> >> > >>
>> >> > >> The reason there are no longer CDH-specific builds is that all
>> >> > >> newer
>> >> > >> versions of CDH and HDP work with builds for the upstream Hadoop
>> >> > >> projects. I dropped CDH4 in favor of a newer Hadoop version (2.4)
>> >> > >> and
>> >> > >> the Hadoop-without-Hive (also 2.4) build.
>> >> > >>
>> >> > >> For MapR - we can't officially post those artifacts on ASF web
>> >> > >> space
>> >> > >> when we make the final release, we can only link to them as being
>> >> > >> hosted by MapR specifically since they use non-compatible
>> >> > >> licenses.
>> >> > >> However, I felt that providing these during a testing period was
>> >> > >> alright, with the goal of increasing test coverage. I couldn't
>> >> > >> find
>> >> > >> any policy against posting these on personal web space during RC
>> >> > >> voting. However, we can remove them if there is one.
>> >> > >>
>> >> > >> Dropping CDH4 was more because it is now pretty old, but we can
>> >> > >> add
>>

new jenkins plugin installed and ready for use

2014-08-29 Thread shane knapp
i have always found the 'Rebuild' plugin super useful:
https://wiki.jenkins-ci.org/display/JENKINS/Rebuild+Plugin

this is installed and enabled.  enjoy!

shane


Re: [VOTE] Release Apache Spark 1.1.0 (RC2)

2014-08-29 Thread Nicholas Chammas
[Let me know if I should be posting these comments in a different thread.]

Should the default Spark version in spark-ec2 be updated for this release?

Nick


On Fri, Aug 29, 2014 at 12:55 PM, Patrick Wendell 
wrote:

> Hey Nicholas,
>
> Thanks for this, we can merge in doc changes outside of the actual
> release timeline, so we'll make sure to loop those changes in before
> we publish the final 1.1 docs.
>
> - Patrick
>
> On Fri, Aug 29, 2014 at 9:24 AM, Nicholas Chammas
>  wrote:
> > There were several formatting and typographical errors in the SQL docs
> that
> > I've fixed in this PR. Dunno if we want to roll that into the release.
> >
> >
> > On Fri, Aug 29, 2014 at 12:17 PM, Patrick Wendell 
> > wrote:
> >>
> >> Okay I'll plan to add cdh4 binary as well for the final release!
> >>
> >> ---
> >> sent from my phone
> >> On Aug 29, 2014 8:26 AM, "Ye Xianjin"  wrote:
> >>
> >> > We just used CDH 4.7 for our production cluster. And I believe we
> won't
> >> > use CDH 5 in the next year.
> >> >
> >> > Sent from my iPhone
> >> >
> >> > > On Aug 29, 2014, at 14:39, Matei Zaharia 
> >> > > wrote:
> >> > >
> >> > > Personally I'd actually consider putting CDH4 back if there are
> still
> >> > users on it. It's always better to be inclusive, and the convenience
> of
> >> > a
> >> > one-click download is high. Do we have a sense on what % of CDH users
> >> > still
> >> > use CDH4?
> >> > >
> >> > > Matei
> >> > >
> >> > > On August 28, 2014 at 11:31:13 PM, Sean Owen (so...@cloudera.com)
> >> > > wrote:
> >> > >
> >> > > (Copying my reply since I don't know if it goes to the mailing list)
> >> > >
> >> > > Great, thanks for explaining the reasoning. You're saying these
> aren't
> >> > > going into the final release? I think that moots any issue
> surrounding
> >> > > distributing them then.
> >> > >
> >> > > This is all I know of from the ASF:
> >> > > https://community.apache.org/projectIndependence.html I don't read
> it
> >> > > as expressly forbidding this kind of thing although you can see how
> it
> >> > > bumps up against the spirit. There's not a bright line -- what about
> >> > > Tomcat providing binaries compiled for Windows for example? does
> that
> >> > > favor an OS vendor?
> >> > >
> >> > > From this technical ASF perspective only the releases matter -- do
> >> > > what you want with snapshots and RCs. The only issue there is maybe
> >> > > releasing something different than was in the RC; is that at all
> >> > > confusing? Just needs a note.
> >> > >
> >> > > I think this theoretical issue doesn't exist if these binaries
> aren't
> >> > > released, so I see no reason to not proceed.
> >> > >
> >> > > The rest is a different question about whether you want to spend
> time
> >> > > maintaining this profile and candidate. The vendor already manages
> >> > > their build I think and -- and I don't know -- may even prefer not
> to
> >> > > have a different special build floating around. There's also the
> >> > > theoretical argument that this turns off other vendors from adopting
> >> > > Spark if it's perceived to be too connected to other vendors. I'd
> like
> >> > > to maximize Spark's distribution and there's some argument you do
> this
> >> > > by not making vendor profiles. But as I say a different question to
> >> > > just think about over time...
> >> > >
> >> > > (oh and PS for my part I think it's a good thing that CDH4 binaries
> >> > > were removed. I wasn't arguing for resurrecting them)
> >> > >
> >> > >> On Fri, Aug 29, 2014 at 7:26 AM, Patrick Wendell <
> pwend...@gmail.com>
> >> > wrote:
> >> > >> Hey Sean,
> >> > >>
> >> > >> The reason there are no longer CDH-specific builds is that all
> newer
> >> > >> versions of CDH and HDP work with builds for the upstream Hadoop
> >> > >> projects. I dropped CDH4 in favor of a newer Hadoop version (2.4)
> and
> >> > >> the Hadoop-without-Hive (also 2.4) build.
> >> > >>
> >> > >> For MapR - we can't officially post those artifacts on ASF web
> space
> >> > >> when we make the final release, we can only link to them as being
> >> > >> hosted by MapR specifically since they use non-compatible licenses.
> >> > >> However, I felt that providing these during a testing period was
> >> > >> alright, with the goal of increasing test coverage. I couldn't find
> >> > >> any policy against posting these on personal web space during RC
> >> > >> voting. However, we can remove them if there is one.
> >> > >>
> >> > >> Dropping CDH4 was more because it is now pretty old, but we can add
> >> > >> it
> >> > >> back if people want. The binary packaging is a slightly separate
> >> > >> question from release votes, so I can always add more binary
> packages
> >> > >> whenever. And on this, my main concern is covering the most popular
> >> > >> Hadoop versions to lower the bar for users to build and test Spark.
> >> > >>
> >> > >> - Patrick
> >> > >>
> >> > >>> On T

Re: Jira tickets for starter tasks

2014-08-29 Thread Josh Rosen
Added you; you should be set!

If anyone else wants me to add them, please email me off-list so that we don’t 
end up flooding the dev list with replies. Thanks!


On August 29, 2014 at 10:03:41 AM, Ron's Yahoo! (zlgonza...@yahoo.com) wrote:

Hi Josh,  
Can you add me as well?  

Thanks,  
Ron  

On Aug 28, 2014, at 3:56 PM, Josh Rosen  wrote:  

> A JIRA admin needs to add you to the 'Contributors' role group in order to 
> allow you to assign issues to yourself. I’ve added this email address to that 
> group, so you should be set!  
>  
> - Josh  
>  
>  
> On August 28, 2014 at 3:52:57 PM, Bill Bejeck (bbej...@gmail.com) wrote:  
>  
> Hi,  
>  
> How do I get a starter task jira ticket assigned to myself? Or do I just do  
> the work and issue a pull request with the associated jira number?  
>  
> Thanks,  
> Bill  



Re: Jira tickets for starter tasks

2014-08-29 Thread Ron's Yahoo!
Hi Josh,
  Can you add me as well?

Thanks,
Ron

On Aug 28, 2014, at 3:56 PM, Josh Rosen  wrote:

> A JIRA admin needs to add you to the 'Contributors' role group in order to 
> allow you to assign issues to yourself.  I’ve added this email address to 
> that group, so you should be set!
> 
> - Josh
> 
> 
> On August 28, 2014 at 3:52:57 PM, Bill Bejeck (bbej...@gmail.com) wrote:
> 
> Hi,  
> 
> How do I get a starter task jira ticket assigned to myself? Or do I just do  
> the work and issue a pull request with the associated jira number?  
> 
> Thanks,  
> Bill  


-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org



Re: [VOTE] Release Apache Spark 1.1.0 (RC2)

2014-08-29 Thread Patrick Wendell
Hey Nicholas,

Thanks for this, we can merge in doc changes outside of the actual
release timeline, so we'll make sure to loop those changes in before
we publish the final 1.1 docs.

- Patrick

On Fri, Aug 29, 2014 at 9:24 AM, Nicholas Chammas
 wrote:
> There were several formatting and typographical errors in the SQL docs that
> I've fixed in this PR. Dunno if we want to roll that into the release.
>
>
> On Fri, Aug 29, 2014 at 12:17 PM, Patrick Wendell 
> wrote:
>>
>> Okay I'll plan to add cdh4 binary as well for the final release!
>>
>> ---
>> sent from my phone
>> On Aug 29, 2014 8:26 AM, "Ye Xianjin"  wrote:
>>
>> > We just used CDH 4.7 for our production cluster. And I believe we won't
>> > use CDH 5 in the next year.
>> >
>> > Sent from my iPhone
>> >
>> > > On Aug 29, 2014, at 14:39, Matei Zaharia 
>> > > wrote:
>> > >
>> > > Personally I'd actually consider putting CDH4 back if there are still
>> > users on it. It's always better to be inclusive, and the convenience of
>> > a
>> > one-click download is high. Do we have a sense on what % of CDH users
>> > still
>> > use CDH4?
>> > >
>> > > Matei
>> > >
>> > > On August 28, 2014 at 11:31:13 PM, Sean Owen (so...@cloudera.com)
>> > > wrote:
>> > >
>> > > (Copying my reply since I don't know if it goes to the mailing list)
>> > >
>> > > Great, thanks for explaining the reasoning. You're saying these aren't
>> > > going into the final release? I think that moots any issue surrounding
>> > > distributing them then.
>> > >
>> > > This is all I know of from the ASF:
>> > > https://community.apache.org/projectIndependence.html I don't read it
>> > > as expressly forbidding this kind of thing although you can see how it
>> > > bumps up against the spirit. There's not a bright line -- what about
>> > > Tomcat providing binaries compiled for Windows for example? does that
>> > > favor an OS vendor?
>> > >
>> > > From this technical ASF perspective only the releases matter -- do
>> > > what you want with snapshots and RCs. The only issue there is maybe
>> > > releasing something different than was in the RC; is that at all
>> > > confusing? Just needs a note.
>> > >
>> > > I think this theoretical issue doesn't exist if these binaries aren't
>> > > released, so I see no reason to not proceed.
>> > >
>> > > The rest is a different question about whether you want to spend time
>> > > maintaining this profile and candidate. The vendor already manages
>> > > their build I think and -- and I don't know -- may even prefer not to
>> > > have a different special build floating around. There's also the
>> > > theoretical argument that this turns off other vendors from adopting
>> > > Spark if it's perceived to be too connected to other vendors. I'd like
>> > > to maximize Spark's distribution and there's some argument you do this
>> > > by not making vendor profiles. But as I say a different question to
>> > > just think about over time...
>> > >
>> > > (oh and PS for my part I think it's a good thing that CDH4 binaries
>> > > were removed. I wasn't arguing for resurrecting them)
>> > >
>> > >> On Fri, Aug 29, 2014 at 7:26 AM, Patrick Wendell 
>> > wrote:
>> > >> Hey Sean,
>> > >>
>> > >> The reason there are no longer CDH-specific builds is that all newer
>> > >> versions of CDH and HDP work with builds for the upstream Hadoop
>> > >> projects. I dropped CDH4 in favor of a newer Hadoop version (2.4) and
>> > >> the Hadoop-without-Hive (also 2.4) build.
>> > >>
>> > >> For MapR - we can't officially post those artifacts on ASF web space
>> > >> when we make the final release, we can only link to them as being
>> > >> hosted by MapR specifically since they use non-compatible licenses.
>> > >> However, I felt that providing these during a testing period was
>> > >> alright, with the goal of increasing test coverage. I couldn't find
>> > >> any policy against posting these on personal web space during RC
>> > >> voting. However, we can remove them if there is one.
>> > >>
>> > >> Dropping CDH4 was more because it is now pretty old, but we can add
>> > >> it
>> > >> back if people want. The binary packaging is a slightly separate
>> > >> question from release votes, so I can always add more binary packages
>> > >> whenever. And on this, my main concern is covering the most popular
>> > >> Hadoop versions to lower the bar for users to build and test Spark.
>> > >>
>> > >> - Patrick
>> > >>
>> > >>> On Thu, Aug 28, 2014 at 11:04 PM, Sean Owen 
>> > wrote:
>> > >>> +1 I tested the source and Hadoop 2.4 release. Checksums and
>> > >>> signatures are OK. Compiles fine with Java 8 on OS X. Tests... don't
>> > >>> fail any more than usual.
>> > >>>
>> > >>> FWIW I've also been using the 1.1.0-SNAPSHOT for some time in
>> > >>> another
>> > >>> project and have encountered no problems.
>> > >>>
>> > >>>
>> > >>> I notice that the 1.1.0 release removes the CDH4-specific build, but
>> > >>> adds two MapR-specific builds. Compare with
>> > >>> https://dist.apache.org/repos/dist/release/

Re: [VOTE] Release Apache Spark 1.1.0 (RC2)

2014-08-29 Thread Nicholas Chammas
There were several formatting and typographical errors in the SQL docs that
I've fixed in this PR. Dunno if
we want to roll that into the release.


On Fri, Aug 29, 2014 at 12:17 PM, Patrick Wendell 
wrote:

> Okay I'll plan to add cdh4 binary as well for the final release!
>
> ---
> sent from my phone
> On Aug 29, 2014 8:26 AM, "Ye Xianjin"  wrote:
>
> > We just used CDH 4.7 for our production cluster. And I believe we won't
> > use CDH 5 in the next year.
> >
> > Sent from my iPhone
> >
> > > On Aug 29, 2014, at 14:39, Matei Zaharia 
> wrote:
> > >
> > > Personally I'd actually consider putting CDH4 back if there are still
> > users on it. It's always better to be inclusive, and the convenience of a
> > one-click download is high. Do we have a sense on what % of CDH users
> still
> > use CDH4?
> > >
> > > Matei
> > >
> > > On August 28, 2014 at 11:31:13 PM, Sean Owen (so...@cloudera.com)
> wrote:
> > >
> > > (Copying my reply since I don't know if it goes to the mailing list)
> > >
> > > Great, thanks for explaining the reasoning. You're saying these aren't
> > > going into the final release? I think that moots any issue surrounding
> > > distributing them then.
> > >
> > > This is all I know of from the ASF:
> > > https://community.apache.org/projectIndependence.html I don't read it
> > > as expressly forbidding this kind of thing although you can see how it
> > > bumps up against the spirit. There's not a bright line -- what about
> > > Tomcat providing binaries compiled for Windows for example? does that
> > > favor an OS vendor?
> > >
> > > From this technical ASF perspective only the releases matter -- do
> > > what you want with snapshots and RCs. The only issue there is maybe
> > > releasing something different than was in the RC; is that at all
> > > confusing? Just needs a note.
> > >
> > > I think this theoretical issue doesn't exist if these binaries aren't
> > > released, so I see no reason to not proceed.
> > >
> > > The rest is a different question about whether you want to spend time
> > > maintaining this profile and candidate. The vendor already manages
> > > their build I think and -- and I don't know -- may even prefer not to
> > > have a different special build floating around. There's also the
> > > theoretical argument that this turns off other vendors from adopting
> > > Spark if it's perceived to be too connected to other vendors. I'd like
> > > to maximize Spark's distribution and there's some argument you do this
> > > by not making vendor profiles. But as I say a different question to
> > > just think about over time...
> > >
> > > (oh and PS for my part I think it's a good thing that CDH4 binaries
> > > were removed. I wasn't arguing for resurrecting them)
> > >
> > >> On Fri, Aug 29, 2014 at 7:26 AM, Patrick Wendell 
> > wrote:
> > >> Hey Sean,
> > >>
> > >> The reason there are no longer CDH-specific builds is that all newer
> > >> versions of CDH and HDP work with builds for the upstream Hadoop
> > >> projects. I dropped CDH4 in favor of a newer Hadoop version (2.4) and
> > >> the Hadoop-without-Hive (also 2.4) build.
> > >>
> > >> For MapR - we can't officially post those artifacts on ASF web space
> > >> when we make the final release, we can only link to them as being
> > >> hosted by MapR specifically since they use non-compatible licenses.
> > >> However, I felt that providing these during a testing period was
> > >> alright, with the goal of increasing test coverage. I couldn't find
> > >> any policy against posting these on personal web space during RC
> > >> voting. However, we can remove them if there is one.
> > >>
> > >> Dropping CDH4 was more because it is now pretty old, but we can add it
> > >> back if people want. The binary packaging is a slightly separate
> > >> question from release votes, so I can always add more binary packages
> > >> whenever. And on this, my main concern is covering the most popular
> > >> Hadoop versions to lower the bar for users to build and test Spark.
> > >>
> > >> - Patrick
> > >>
> > >>> On Thu, Aug 28, 2014 at 11:04 PM, Sean Owen 
> > wrote:
> > >>> +1 I tested the source and Hadoop 2.4 release. Checksums and
> > >>> signatures are OK. Compiles fine with Java 8 on OS X. Tests... don't
> > >>> fail any more than usual.
> > >>>
> > >>> FWIW I've also been using the 1.1.0-SNAPSHOT for some time in another
> > >>> project and have encountered no problems.
> > >>>
> > >>>
> > >>> I notice that the 1.1.0 release removes the CDH4-specific build, but
> > >>> adds two MapR-specific builds. Compare with
> > >>> https://dist.apache.org/repos/dist/release/spark/spark-1.0.2/ I
> > >>> commented on the commit:
> > >>>
> >
> https://github.com/apache/spark/commit/ceb19830b88486faa87ff41e18d03ede713a73cc
> > >>>
> > >>> I'm in favor of removing all vendor-specific builds. This change
> > >>> *looks* a bit funny as there was no JIRA (?) and appears to swap one
> > >>> vendor for another. Of course there

Re: [VOTE] Release Apache Spark 1.1.0 (RC2)

2014-08-29 Thread Patrick Wendell
Okay I'll plan to add cdh4 binary as well for the final release!

---
sent from my phone
On Aug 29, 2014 8:26 AM, "Ye Xianjin"  wrote:

> We just used CDH 4.7 for our production cluster. And I believe we won't
> use CDH 5 in the next year.
>
> Sent from my iPhone
>
> > > On Aug 29, 2014, at 14:39, Matei Zaharia  wrote:
> >
> > Personally I'd actually consider putting CDH4 back if there are still
> users on it. It's always better to be inclusive, and the convenience of a
> one-click download is high. Do we have a sense on what % of CDH users still
> use CDH4?
> >
> > Matei
> >
> > On August 28, 2014 at 11:31:13 PM, Sean Owen (so...@cloudera.com) wrote:
> >
> > (Copying my reply since I don't know if it goes to the mailing list)
> >
> > Great, thanks for explaining the reasoning. You're saying these aren't
> > going into the final release? I think that moots any issue surrounding
> > distributing them then.
> >
> > This is all I know of from the ASF:
> > https://community.apache.org/projectIndependence.html I don't read it
> > as expressly forbidding this kind of thing although you can see how it
> > bumps up against the spirit. There's not a bright line -- what about
> > Tomcat providing binaries compiled for Windows for example? does that
> > favor an OS vendor?
> >
> > From this technical ASF perspective only the releases matter -- do
> > what you want with snapshots and RCs. The only issue there is maybe
> > releasing something different than was in the RC; is that at all
> > confusing? Just needs a note.
> >
> > I think this theoretical issue doesn't exist if these binaries aren't
> > released, so I see no reason to not proceed.
> >
> > The rest is a different question about whether you want to spend time
> > maintaining this profile and candidate. The vendor already manages
> > their build I think and -- and I don't know -- may even prefer not to
> > have a different special build floating around. There's also the
> > theoretical argument that this turns off other vendors from adopting
> > Spark if it's perceived to be too connected to other vendors. I'd like
> > to maximize Spark's distribution and there's some argument you do this
> > by not making vendor profiles. But as I say a different question to
> > just think about over time...
> >
> > (oh and PS for my part I think it's a good thing that CDH4 binaries
> > were removed. I wasn't arguing for resurrecting them)
> >
> >> On Fri, Aug 29, 2014 at 7:26 AM, Patrick Wendell 
> wrote:
> >> Hey Sean,
> >>
> >> The reason there are no longer CDH-specific builds is that all newer
> >> versions of CDH and HDP work with builds for the upstream Hadoop
> >> projects. I dropped CDH4 in favor of a newer Hadoop version (2.4) and
> >> the Hadoop-without-Hive (also 2.4) build.
> >>
> >> For MapR - we can't officially post those artifacts on ASF web space
> >> when we make the final release, we can only link to them as being
> >> hosted by MapR specifically since they use non-compatible licenses.
> >> However, I felt that providing these during a testing period was
> >> alright, with the goal of increasing test coverage. I couldn't find
> >> any policy against posting these on personal web space during RC
> >> voting. However, we can remove them if there is one.
> >>
> >> Dropping CDH4 was more because it is now pretty old, but we can add it
> >> back if people want. The binary packaging is a slightly separate
> >> question from release votes, so I can always add more binary packages
> >> whenever. And on this, my main concern is covering the most popular
> >> Hadoop versions to lower the bar for users to build and test Spark.
> >>
> >> - Patrick
> >>
> >>> On Thu, Aug 28, 2014 at 11:04 PM, Sean Owen 
> wrote:
> >>> +1 I tested the source and Hadoop 2.4 release. Checksums and
> >>> signatures are OK. Compiles fine with Java 8 on OS X. Tests... don't
> >>> fail any more than usual.
> >>>
> >>> FWIW I've also been using the 1.1.0-SNAPSHOT for some time in another
> >>> project and have encountered no problems.
> >>>
> >>>
> >>> I notice that the 1.1.0 release removes the CDH4-specific build, but
> >>> adds two MapR-specific builds. Compare with
> >>> https://dist.apache.org/repos/dist/release/spark/spark-1.0.2/ I
> >>> commented on the commit:
> >>>
> https://github.com/apache/spark/commit/ceb19830b88486faa87ff41e18d03ede713a73cc
> >>>
> >>> I'm in favor of removing all vendor-specific builds. This change
> >>> *looks* a bit funny as there was no JIRA (?) and appears to swap one
> >>> vendor for another. Of course there's nothing untoward going on, but
> >>> what was the reasoning? It's best avoided, and MapR already
> >>> distributes Spark just fine, no?
> >>>
> >>> This is a gray area with ASF projects. I mention it as well because it
> >>> came up with Apache Flink recently
> >>> (
> http://mail-archives.eu.apache.org/mod_mbox/incubator-flink-dev/201408.mbox/%3CCANC1h_u%3DN0YKFu3pDaEVYz5ZcQtjQnXEjQA2ReKmoS%2Bye7%3Do%3DA%40mail.gmail.com%3E
> )
> >>> Another vendor rig

Re: [VOTE] Release Apache Spark 1.1.0 (RC2)

2014-08-29 Thread Ye Xianjin
We just used CDH 4.7 for our production cluster. And I believe we won't use CDH 
5 in the next year.

Sent from my iPhone

> On Aug 29, 2014, at 14:39, Matei Zaharia  wrote:
> 
> Personally I'd actually consider putting CDH4 back if there are still users 
> on it. It's always better to be inclusive, and the convenience of a one-click 
> download is high. Do we have a sense on what % of CDH users still use CDH4?
> 
> Matei
> 
> On August 28, 2014 at 11:31:13 PM, Sean Owen (so...@cloudera.com) wrote:
> 
> (Copying my reply since I don't know if it goes to the mailing list) 
> 
> Great, thanks for explaining the reasoning. You're saying these aren't 
> going into the final release? I think that moots any issue surrounding 
> distributing them then. 
> 
> This is all I know of from the ASF: 
> https://community.apache.org/projectIndependence.html I don't read it 
> as expressly forbidding this kind of thing although you can see how it 
> bumps up against the spirit. There's not a bright line -- what about 
> Tomcat providing binaries compiled for Windows for example? does that 
> favor an OS vendor? 
> 
> From this technical ASF perspective only the releases matter -- do 
> what you want with snapshots and RCs. The only issue there is maybe 
> releasing something different than was in the RC; is that at all 
> confusing? Just needs a note. 
> 
> I think this theoretical issue doesn't exist if these binaries aren't 
> released, so I see no reason to not proceed. 
> 
> The rest is a different question about whether you want to spend time 
> maintaining this profile and candidate. The vendor already manages 
> their build I think and -- and I don't know -- may even prefer not to 
> have a different special build floating around. There's also the 
> theoretical argument that this turns off other vendors from adopting 
> Spark if it's perceived to be too connected to other vendors. I'd like 
> to maximize Spark's distribution and there's some argument you do this 
> by not making vendor profiles. But as I say a different question to 
> just think about over time... 
> 
> (oh and PS for my part I think it's a good thing that CDH4 binaries 
> were removed. I wasn't arguing for resurrecting them) 
> 
>> On Fri, Aug 29, 2014 at 7:26 AM, Patrick Wendell  wrote: 
>> Hey Sean, 
>> 
>> The reason there are no longer CDH-specific builds is that all newer 
>> versions of CDH and HDP work with builds for the upstream Hadoop 
>> projects. I dropped CDH4 in favor of a newer Hadoop version (2.4) and 
>> the Hadoop-without-Hive (also 2.4) build. 
>> 
>> For MapR - we can't officially post those artifacts on ASF web space 
>> when we make the final release, we can only link to them as being 
>> hosted by MapR specifically since they use non-compatible licenses. 
>> However, I felt that providing these during a testing period was 
>> alright, with the goal of increasing test coverage. I couldn't find 
>> any policy against posting these on personal web space during RC 
>> voting. However, we can remove them if there is one. 
>> 
>> Dropping CDH4 was more because it is now pretty old, but we can add it 
>> back if people want. The binary packaging is a slightly separate 
>> question from release votes, so I can always add more binary packages 
>> whenever. And on this, my main concern is covering the most popular 
>> Hadoop versions to lower the bar for users to build and test Spark. 
>> 
>> - Patrick 
>> 
>>> On Thu, Aug 28, 2014 at 11:04 PM, Sean Owen  wrote: 
>>> +1 I tested the source and Hadoop 2.4 release. Checksums and 
>>> signatures are OK. Compiles fine with Java 8 on OS X. Tests... don't 
>>> fail any more than usual. 
>>> 
>>> FWIW I've also been using the 1.1.0-SNAPSHOT for some time in another 
>>> project and have encountered no problems. 
>>> 
>>> 
>>> I notice that the 1.1.0 release removes the CDH4-specific build, but 
>>> adds two MapR-specific builds. Compare with 
>>> https://dist.apache.org/repos/dist/release/spark/spark-1.0.2/ I 
>>> commented on the commit: 
>>> https://github.com/apache/spark/commit/ceb19830b88486faa87ff41e18d03ede713a73cc
>>>  
>>> 
>>> I'm in favor of removing all vendor-specific builds. This change 
>>> *looks* a bit funny as there was no JIRA (?) and appears to swap one 
>>> vendor for another. Of course there's nothing untoward going on, but 
>>> what was the reasoning? It's best avoided, and MapR already 
>>> distributes Spark just fine, no? 
>>> 
>>> This is a gray area with ASF projects. I mention it as well because it 
>>> came up with Apache Flink recently 
>>> (http://mail-archives.eu.apache.org/mod_mbox/incubator-flink-dev/201408.mbox/%3CCANC1h_u%3DN0YKFu3pDaEVYz5ZcQtjQnXEjQA2ReKmoS%2Bye7%3Do%3DA%40mail.gmail.com%3E)
>>> Another vendor rightly noted this could look like favoritism. They 
>>> changed to remove vendor releases. 
>>> 
 On Fri, Aug 29, 2014 at 3:14 AM, Patrick Wendell  
 wrote: 
 Please vote on releasing the following candidate as Apache Spark version 

Re: "emergency" jenkins restart, aug 29th, 730am-9am PDT -- plus a postmortem

2014-08-29 Thread shane knapp
this is done.


On Fri, Aug 29, 2014 at 7:32 AM, shane knapp  wrote:

> reminder:   this is happening right now.  jenkins is currently in quiet
> mode, and in ~30 minutes, will be briefly going down.
>
>
> On Thu, Aug 28, 2014 at 1:03 PM, shane knapp  wrote:
>
>> as with all software upgrades, sometimes things don't always work as
>> expected.
>>
>> a recent change to stapler[1], to verbosely
>> report NotExportableExceptions[2], is spamming our jenkins log file with
>> stack traces, which is growing rather quickly (1.2G since 9am).  this has
>> been reported to the jenkins jira[3], and a fix has been pushed and will be
>> rolled out "soon"[4].
>>
>> this isn't affecting any builds, and jenkins is happily humming along.
>>
>> in the interim, so that we don't run out of disk space, i will be
>> redirecting the jenkins logs tomorrow morning to /dev/null for the long
>> weekend.
>>
>> once a real fix has been released, i will update any packages needed and
>> redirect the logging back to the log file.
>>
>> other than a short downtime, this will have no user-facing impact.
>>
>> please let me know if you have any questions/concerns.
>>
>> thanks for your patience!
>>
>> shane "the new guy"  :)
>>
>> [1] -- https://wiki.jenkins-ci.org/display/JENKINS/Architecture
>> [2] --
>> https://github.com/stapler/stapler/commit/ed2cb8b04c1514377f3a8bfbd567f050a67c6e1c
>> [3] --
>> https://issues.jenkins-ci.org/browse/JENKINS-24458?focusedCommentId=209247
>> [4] --
>> https://github.com/stapler/stapler/commit/e2b39098ca1f61a58970b8a41a3ae79053cf30e3
>>
>
>


Re: Jira tickets for starter tasks

2014-08-29 Thread Madhu
Cheng Lian-2 wrote
> You can just start the work :)

Given 100+ contributors, starting work without a JIRA issue assigned to you
could lead to duplication of effort by well-meaning people who have no idea
they are working on the same issue. This does happen, and I don't think it's
a good thing.

Just my $0.02



-
--
Madhu
https://www.linkedin.com/in/msiddalingaiah
--
View this message in context: 
http://apache-spark-developers-list.1001551.n3.nabble.com/Jira-tickets-for-starter-tasks-tp8102p8127.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org



Re: "emergency" jenkins restart, aug 29th, 730am-9am PDT -- plus a postmortem

2014-08-29 Thread shane knapp
reminder:   this is happening right now.  jenkins is currently in quiet
mode, and in ~30 minutes, will be briefly going down.


On Thu, Aug 28, 2014 at 1:03 PM, shane knapp  wrote:

> as with all software upgrades, sometimes things don't always work as
> expected.
>
> a recent change to stapler[1], to verbosely
> report NotExportableExceptions[2], is spamming our jenkins log file with
> stack traces, which is growing rather quickly (1.2G since 9am).  this has
> been reported to the jenkins jira[3], and a fix has been pushed and will be
> rolled out "soon"[4].
>
> this isn't affecting any builds, and jenkins is happily humming along.
>
> in the interim, so that we don't run out of disk space, i will be
> redirecting the jenkins logs tomorrow morning to /dev/null for the long
> weekend.
>
> once a real fix has been released, i will update any packages needed and
> redirect the logging back to the log file.
>
> other than a short downtime, this will have no user-facing impact.
>
> please let me know if you have any questions/concerns.
>
> thanks for your patience!
>
> shane "the new guy"  :)
>
> [1] -- https://wiki.jenkins-ci.org/display/JENKINS/Architecture
> [2] --
> https://github.com/stapler/stapler/commit/ed2cb8b04c1514377f3a8bfbd567f050a67c6e1c
> [3] --
> https://issues.jenkins-ci.org/browse/JENKINS-24458?focusedCommentId=209247
> [4] --
> https://github.com/stapler/stapler/commit/e2b39098ca1f61a58970b8a41a3ae79053cf30e3
>


Re: Running Spark On Yarn without Spark-Submit

2014-08-29 Thread Chester @work
Archit,
 We are using yarn-cluster mode and calling Spark via the Client class
directly from the servlet server. It works fine.
As for establishing a communication channel for further requests: that
should be possible in yarn-client mode, but not in yarn-cluster mode. In
yarn-client mode the Spark driver is outside the YARN cluster, so it can
issue more commands. In yarn-cluster mode, everything including the Spark
driver runs inside the YARN cluster, and there is no communication channel
with the client until the job finishes.

If your job is to keep the SparkContext alive and wait for other commands,
then it will wait forever.

I am actually working on some improvements to this, experimenting in our
product; I will create PRs when I feel comfortable with the solution:

1) change the Client API to let the caller know the YARN app resource
capacity before passing arguments
2) add a YarnApplicationListener to the Client
3) provide a communication channel between the application and the Spark
YARN client in the cluster.

#1 is not directly related to the communication discussed here.

#2 gives the application lifecycle callbacks (app start, end, in progress,
failure, etc.) along with the YARN resource allocations.

I changed #1 and #2 in a forked Spark; it has worked well on CDH5, and I am
testing against 2.0.5-alpha as well.

For #3 I have not changed Spark itself yet, as I am not sure of the best
approach. I put the change in the application runner that launches the Spark
YARN client in the cluster.

The runner in the YARN cluster gets the application's host and port
information from the passed configuration (args), then creates an Akka actor
using the SparkContext actor system and sends a handshake message to the
caller outside the cluster; after that you have two-way communication.
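Roughly, the handshake looks like the following sketch (Akka 2.2-era APIs;
the actor and message names and the callerPath parameter are made up for
illustration, not taken from our code):

    import akka.actor.{Actor, ActorSystem, Props}

    case object Handshake

    // Actor created by the in-cluster runner; callerPath is the Akka address
    // of the caller outside the cluster, passed in via the configuration.
    class RunnerActor(callerPath: String) extends Actor {
      // Announce ourselves to the caller as soon as we start.
      override def preStart(): Unit =
        context.actorSelection(callerPath) ! Handshake

      def receive = {
        case "stop" => context.system.shutdown() // honor stop requests
        case other  => // relay listener callbacks, errors, app messages, ...
      }
    }

    object Runner {
      def connectBack(callerPath: String): Unit = {
        val system = ActorSystem("runner")
        system.actorOf(Props(new RunnerActor(callerPath)), "runner-actor")
      }
    }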

With this approach, I can send Spark listener callbacks, error messages,
app-level messages, etc. to the app.

The runner inside the cluster can also receive requests from outside the
cluster, such as stop.

We are not sure the Akka approach is the best, so I am still experimenting
with it. So far it does what we want.

Hope this helps

Chester


Sent from my iPhone

> On Aug 29, 2014, at 2:36 AM, Archit Thakur  wrote:
> 
> including u...@spark.apache.org.
> 
> 
>> On Fri, Aug 29, 2014 at 2:03 PM, Archit Thakur  
>> wrote:
>> Hi,
>> 
>> My requirement is to run Spark on Yarn without using the script spark-submit.
>> 
>> I have a servlet and a Tomcat server. As and when a request comes, it creates 
>> a new SC and keeps it alive for further requests. I am setting my master 
>> in sparkConf
>> 
>> as sparkConf.setMaster("yarn-cluster")
>> 
>> but the request is stuck indefinitely. 
>> 
>> This works when I set
>> sparkConf.setMaster("yarn-client")
>> 
>> I am not sure why it is not launching the job in yarn-cluster mode.
>> 
>> Any thoughts?
>> 
>> Thanks and Regards,
>> Archit Thakur. 
> 


Re: [VOTE] Release Apache Spark 1.1.0 (RC2)

2014-08-29 Thread Koert Kuipers
i suspect there are more cdh4 than cdh5 clusters. most people plan to move
to cdh5 within say 6 months.


On Fri, Aug 29, 2014 at 3:57 AM, Andrew Ash  wrote:

> FWIW we use CDH4 extensively and would very much appreciate having a
> prebuilt version of Spark for it.
>
> We're doing a CDH 4.4 to 4.7 upgrade across all the clusters now and have
> plans for a 5.x transition after that.
> On Aug 28, 2014 11:57 PM, "Sean Owen"  wrote:
>
> > On Fri, Aug 29, 2014 at 7:42 AM, Patrick Wendell 
> > wrote:
> > > In terms of vendor support for this approach - In the early days
> > > Cloudera asked us to add CDH4 repository and more recently Pivotal and
> > > MapR also asked us to allow linking against their hadoop-client
> > > libraries. So we've added these based on direct requests from vendors.
> > > Given the ubiquity of the Hadoop FileSystem API, it's hard for me to
> > > imagine ruffling feathers by supporting this. But if we get feedback
> > > in that direction over time we can of course consider a different
> > > approach.
> >
> > By this, you mean that it's easy to control the Hadoop version in the
> > build and set it to some other vendor-specific release? Yes that seems
> > ideal. Making the build flexible, and adding the repository references
> > to pom.xml is part of enabling that -- to me, no question that's good.
> >
> > So you can always roll your own build for your cluster, if you need
> > to. I understand the role of the cdh4 / mapr3 / mapr4 binaries as just
> > a convenience.
> >
> > But it's a convenience for people who...
> > - are installing Spark on a cluster (i.e. not an end user)
> > - that doesn't have it in their distro already
> > - whose distro isn't compatible with a plain vanilla Hadoop distro
> >
> > That can't be many. CDH4.6+ is most of the installed CDH base and it
> > already has Spark. I thought MapR already had Spark built in. The
> > audience seems small enough, and the convenience relatively small
> > enough (is it hard to run the distribution script?) that it caused me
> > to ask whether it was worth bothering providing these, especially given
> > the possible ASF sensitivity.
> >
> > I say crack on; you get my point.
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
> > For additional commands, e-mail: dev-h...@spark.apache.org
> >
> >
>


Re: Running Spark On Yarn without Spark-Submit

2014-08-29 Thread Archit Thakur
including u...@spark.apache.org.


On Fri, Aug 29, 2014 at 2:03 PM, Archit Thakur 
wrote:

> Hi,
>
> My requirement is to run Spark on Yarn without using the script
> spark-submit.
>
> I have a servlet and a Tomcat server. As and when a request comes, it
> creates a new SC and keeps it alive for further requests. I am setting
> my master in sparkConf
>
> as sparkConf.setMaster("yarn-cluster")
>
> but the request is stuck indefinitely.
>
> This works when I set
> sparkConf.setMaster("yarn-client")
>
> I am not sure why it is not launching the job in yarn-cluster mode.
>
> Any thoughts?
>
> Thanks and Regards,
> Archit Thakur.
>
>
>
>


Running Spark On Yarn without Spark-Submit

2014-08-29 Thread Archit Thakur
Hi,

My requirement is to run Spark on Yarn without using the script
spark-submit.

I have a servlet and a Tomcat server. As and when a request comes, it creates
a new SC and keeps it alive for further requests. I am setting my
master in sparkConf

as sparkConf.setMaster("yarn-cluster")

but the request is stuck indefinitely.

This works when I set
sparkConf.setMaster("yarn-client")

I am not sure why it is not launching the job in yarn-cluster mode.

Any thoughts?

Thanks and Regards,
Archit Thakur.
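For reference, here is a minimal sketch of the working (yarn-client) variant
described above; the object and app names are made up:

    import org.apache.spark.{SparkConf, SparkContext}

    object SharedSparkContext {
      // One long-lived SparkContext that the servlet container reuses across
      // requests. In yarn-cluster mode the driver would run inside YARN,
      // away from the servlet, which is why that mode appears to hang here.
      lazy val sc: SparkContext = {
        val conf = new SparkConf()
          .setAppName("servlet-backed-spark")
          .setMaster("yarn-client")
        new SparkContext(conf)
      }
    }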


Re: [VOTE] Release Apache Spark 1.1.0 (RC2)

2014-08-29 Thread Andrew Ash
FWIW we use CDH4 extensively and would very much appreciate having a
prebuilt version of Spark for it.

We're doing a CDH 4.4 to 4.7 upgrade across all the clusters now and have
plans for a 5.x transition after that.
On Aug 28, 2014 11:57 PM, "Sean Owen"  wrote:

> On Fri, Aug 29, 2014 at 7:42 AM, Patrick Wendell 
> wrote:
> > In terms of vendor support for this approach - In the early days
> > Cloudera asked us to add CDH4 repository and more recently Pivotal and
> > MapR also asked us to allow linking against their hadoop-client
> > libraries. So we've added these based on direct requests from vendors.
> > Given the ubiquity of the Hadoop FileSystem API, it's hard for me to
> > imagine ruffling feathers by supporting this. But if we get feedback
> > in that direction over time we can of course consider a different
> > approach.
>
> By this, you mean that it's easy to control the Hadoop version in the
> build and set it to some other vendor-specific release? Yes that seems
> ideal. Making the build flexible, and adding the repository references
> to pom.xml is part of enabling that -- to me, no question that's good.
>
> So you can always roll your own build for your cluster, if you need
> to. I understand the role of the cdh4 / mapr3 / mapr4 binaries as just
> a convenience.
>
> But it's a convenience for people who...
> - are installing Spark on a cluster (i.e. not an end user)
> - that doesn't have it in their distro already
> - whose distro isn't compatible with a plain vanilla Hadoop distro
>
> That can't be many. CDH4.6+ is most of the installed CDH base and it
> already has Spark. I thought MapR already had Spark built in. The
> audience seems small enough, and the convenience relatively small
> enough (is it hard to run the distribution script?) that it caused me
> to ask whether it was worth bothering providing these, especially given
> the possible ASF sensitivity.
>
> I say crack on; you get my point.
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
> For additional commands, e-mail: dev-h...@spark.apache.org
>
>