Re: [build system] jenkins downtime friday afternoon, july 29th 2016

2016-07-29 Thread shane knapp
we're done and building! a bunch of builds failed w/git auth issues, due to me cancelling the quiet period early (as i thought the firewall update was done). this is no longer the case as i was more patient this time. :) happy friday! shane On Fri, Jul 29, 2016 at 1:45 PM, shane knapp

Re: [build system] jenkins downtime friday afternoon, july 29th 2016

2016-07-29 Thread Reynold Xin
Nice! Thanks! On Fri, Jul 29, 2016 at 1:45 PM, shane knapp wrote: > the move is complete and the machines powered back up right away, with > no problems. we're doing a quick update on the firewall, and then > we'll be done! > > On Fri, Jul 29, 2016 at 1:03 PM, shane knapp

Re: [build system] jenkins downtime friday afternoon, july 29th 2016

2016-07-29 Thread shane knapp
the move is complete and the machines powered back up right away, with no problems. we're doing a quick update on the firewall, and then we'll be done! On Fri, Jul 29, 2016 at 1:03 PM, shane knapp wrote: > machines are going down NOW > > On Fri, Jul 29, 2016 at 10:53 AM,

Re: Clarifying that spark-x.x.x-bin-hadoopx.x.tgz doesn't include Hadoop itself

2016-07-29 Thread Marcelo Vanzin
On Fri, Jul 29, 2016 at 1:13 PM, Nicholas Chammas wrote: > The Hadoop jars packaged with Spark just allow Spark to interact with Hadoop, > or allow it to use the Hadoop API for interacting with systems like S3, > right? If you want HDFS, MapReduce, etc. you're

Re: Clarifying that spark-x.x.x-bin-hadoopx.x.tgz doesn't include Hadoop itself

2016-07-29 Thread Nicholas Chammas
Hmm, perhaps I'm the one who's confused. 🤔 I thought the person in the linked discussion expected Hadoop itself (i.e. the full application, not just the jars) to somehow be included, but rereading the discussion I may have just misinterpreted them. The Hadoop jars packaged with Spark just allow

Re: Clarifying that spark-x.x.x-bin-hadoopx.x.tgz doesn't include Hadoop itself

2016-07-29 Thread Cody Koeninger
Yeah, and the without hadoop was even more confusing... because if you weren't using hdfs at all, you still needed to download one of the hadoop-x packages in order to get hadoop io classes used by almost everything. :) On Fri, Jul 29, 2016 at 3:06 PM, Marcelo Vanzin wrote:

Re: Clarifying that spark-x.x.x-bin-hadoopx.x.tgz doesn't include Hadoop itself

2016-07-29 Thread Marcelo Vanzin
Why do you say Hadoop is not included? The Hadoop jars are there in the tarball, and match the advertised version. There is (or at least there was in 1.x) a version called "without-hadoop" which did not include any Hadoop jars. On Fri, Jul 29, 2016 at 12:56 PM, Nicholas Chammas

Re: [build system] jenkins downtime friday afternoon, july 29th 2016

2016-07-29 Thread shane knapp
machines are going down NOW On Fri, Jul 29, 2016 at 10:53 AM, shane knapp wrote: > reminder -- this is happening TODAY. jenkins is currently in quiet mode. > > i will post updates over the course of the afternoon, and we should be > back up and building before COB. > > On

Clarifying that spark-x.x.x-bin-hadoopx.x.tgz doesn't include Hadoop itself

2016-07-29 Thread Nicholas Chammas
I had an interaction on my project today that suggested some people may be confused about what the packages available on the downloads page are actually for. Specifically, the various -hadoopx.x.tgz packages suggest that

Re: [build system] jenkins downtime friday afternoon, july 29th 2016

2016-07-29 Thread shane knapp
reminder -- this is happening TODAY. jenkins is currently in quiet mode. i will post updates over the course of the afternoon, and we should be back up and building before COB. On Thu, Jul 28, 2016 at 4:06 PM, shane knapp wrote: > reminder -- this is happening TOMORROW. >

Re: sampling operation for DStream

2016-07-29 Thread Cody Koeninger
Most stream systems you're still going to incur the cost of reading each message... I suppose you could rotate among reading just the latest messages from a single partition of a Kafka topic if they were evenly balanced. But once you've read the messages, nothing's stopping you from filtering

sampling operation for DStream

2016-07-29 Thread Martin Le
Hi all, I have to handle a high-rate data stream. To reduce the heavy load, I want to use sampling techniques for each stream window, meaning that I want to process a subset of the data instead of the whole window. I saw that Spark supports sampling operations for RDD, but for DStream, Spark supports
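
As a rough illustration of the idea discussed in this thread (read each window, then keep only a subset of it), here is a minimal sketch in plain Python. It is not taken from either message: the names `sample_window` and `sample_stream` and the `fraction` and `seed` parameters are hypothetical, and a window is modelled as an ordinary list rather than an RDD.

```python
import random


def sample_window(window, fraction, seed=None):
    """Bernoulli-sample one window.

    Each record is kept independently with probability `fraction`,
    so the result size is only approximately fraction * len(window).
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    rng = random.Random(seed)
    return [record for record in window if rng.random() < fraction]


def sample_stream(windows, fraction, seed=None):
    """Lazily apply sample_window to every window of a stream.

    `windows` is any iterable of lists (a stand-in for a windowed stream).
    """
    for index, window in enumerate(windows):
        # Use a different seed per window so windows do not share the same draws.
        window_seed = None if seed is None else seed + index
        yield sample_window(window, fraction, window_seed)


if __name__ == "__main__":
    stream = [list(range(100)), list(range(100, 200))]
    for sampled in sample_stream(stream, fraction=0.1, seed=42):
        print(len(sampled), sampled[:5])
```

Note that every record is still read before it is discarded, which matches the cost caveat raised in the reply. In PySpark Streaming the same per-batch effect can probably be obtained by combining `DStream.transform` with `RDD.sample`, for example `dstream.transform(lambda rdd: rdd.sample(False, 0.1))`; treat that as an untested suggestion rather than a recommendation from the thread.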

Re: renaming "minor release" to "feature release"

2016-07-29 Thread Mark Hamstra
One issue worth at least considering is that our minor releases usually do not include only new features, but also many bug-fixes -- at least some of which often do not get backported into the next patch-level release. "Feature release" does not convey that information. On Thu, Jul 28, 2016 at

Re: tpcds for spark2.0

2016-07-29 Thread Olivier Girardot
I have the same kind of issue (not using spark-sql-perf), just trying to deploy 2.0.0 on mesos. I'll keep you posted as I investigate. On Wed, Jul 27, 2016 1:06 PM, kevin kiss.kevin...@gmail.com wrote: hi, all: I want to test running the tpcds99 sql on spark2.0. I use