Re: [VOTE] Release Apache Spark 1.6.3 (RC2)

2016-11-04 Thread Weiqing Yang
+1 (non-binding)

Built and tested on CentOS Linux release 7.0.1406 / openjdk version
"1.8.0_111".

On Fri, Nov 4, 2016 at 9:06 AM, Ricardo Almeida <
ricardo.alme...@actnowib.com> wrote:

> +1 (non-binding)
>
> tested over Ubuntu / OpenJDK 1.8.0_111
>
> On 4 November 2016 at 10:00, Sean Owen  wrote:
>
>> Likewise, ran my usual tests on Ubuntu with 
>> yarn/hive/hive-thriftserver/hadoop-2.6
>> on JDK 8 and all passed. Sigs and licenses are OK. +1
>>
>>
>> On Thu, Nov 3, 2016 at 7:57 PM Herman van Hövell tot Westerflier <
>> hvanhov...@databricks.com> wrote:
>>
>>> +1
>>>
>>> On Thu, Nov 3, 2016 at 6:58 PM, Michael Armbrust wrote:
>>>
>>> +1
>>>
>>> On Wed, Nov 2, 2016 at 5:40 PM, Reynold Xin  wrote:
>>>
>>> Please vote on releasing the following candidate as Apache Spark version
>>> 1.6.3. The vote is open until Sat, Nov 5, 2016 at 18:00 PDT and passes if a
>>> majority of at least 3 +1 PMC votes are cast.
>>>
>>> [ ] +1 Release this package as Apache Spark 1.6.3
>>> [ ] -1 Do not release this package because ...
>>>
>>>
>>> The tag to be voted on is v1.6.3-rc2 (1e860747458d74a4ccbd081103a0542a2367b14b)
>>>
>>> This release candidate addresses 52 JIRA tickets:
>>> https://s.apache.org/spark-1.6.3-jira
>>>
>>> The release files, including signatures, digests, etc. can be found at:
>>> http://people.apache.org/~pwendell/spark-releases/spark-1.6.3-rc2-bin/
>>>
>>> Release artifacts are signed with the following key:
>>> https://people.apache.org/keys/committer/pwendell.asc
>>>
>>> The staging repository for this release can be found at:
>>> https://repository.apache.org/content/repositories/orgapachespark-1212/
>>>
>>> The documentation corresponding to this release can be found at:
>>> http://people.apache.org/~pwendell/spark-releases/spark-1.6.3-rc2-docs/
>>>
>>>
>>> ===
>>> == How can I help test this release?
>>> ===
>>> If you are a Spark user, you can help us test this release by taking an
>>> existing Spark workload and running it on this release candidate, then
>>> reporting any regressions from 1.6.2.
>>>
>>> 
>>> == What justifies a -1 vote for this release?
>>> 
>>> This is a maintenance release in the 1.6.x series.  Bugs already present
>>> in 1.6.2, missing features, or bugs related to new features will not
>>> necessarily block this release.
>>>
>>>
>>>
>>>
>
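
For anyone who wants to run an existing workload against this candidate, a
minimal sketch of one way to do it (assuming an sbt build; swap in whatever
Spark modules your job already depends on) is to add the staging repository
listed above and build against the staged artifacts, which are published
under the final 1.6.3 version number:

    // build.sbt -- point the build at the RC2 staging repository quoted above
    resolvers += "Apache Spark 1.6.3 RC2 staging" at
      "https://repository.apache.org/content/repositories/orgapachespark-1212/"

    // Spark 1.6.x default artifacts are built for Scala 2.10
    scalaVersion := "2.10.5"

    libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-core" % "1.6.3",
      "org.apache.spark" %% "spark-sql"  % "1.6.3"
    )

Rebuilding the job, rerunning it, and comparing results and run times against
1.6.2 is exactly the kind of regression report this thread is asking for.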


Re: [VOTE] Release Apache Spark 2.0.2 (RC2)

2016-11-04 Thread Reynold Xin
I will cut a new one once https://github.com/apache/spark/pull/15774 gets
in.


On Fri, Nov 4, 2016 at 11:44 AM, Sean Owen  wrote:

> I guess it's worth explicitly stating that I think we need another RC one
> way or the other because this test seems to consistently fail. It was a
> (surprising) last-minute regression. I think I'd have to say -1 only for
> this.
>
> Reverting https://github.com/apache/spark/pull/15706 for branch-2.0 would
> unblock this. There's also some discussion about an alternative resolution
> for the test problem.
>
>
> On Wed, Nov 2, 2016 at 5:44 PM Sean Owen  wrote:
>
>> Sigs, license, etc are OK. There are no Blockers for 2.0.2, though here
>> are the 4 issues still open:
>>
>> SPARK-14387 Enable Hive-1.x ORC compatibility with spark.sql.hive.convertMetastoreOrc
>> SPARK-17957 Calling outer join and na.fill(0) and then inner join will
>> miss rows
>> SPARK-17981 Incorrectly Set Nullability to False in FilterExec
>> SPARK-18160 spark.files & spark.jars should not be passed to driver in
>> yarn mode
>>
>> Running with Java 8, -Pyarn -Phive -Phive-thriftserver -Phadoop-2.7 on
>> Ubuntu 16, I am seeing consistent failures in this test below. I think we
>> very recently changed this so it could be legitimate. But does anyone else
>> see something like this? I have seen other failures in this test due to OOM
>> but my MAVEN_OPTS allows 6g of heap, which ought to be plenty.
>>
>>
>> - SPARK-18189: Fix serialization issue in KeyValueGroupedDataset ***
>> FAILED ***
>>   isContain was true Interpreter output contained 'Exception':
>>   Welcome to
>>   __
>>/ __/__  ___ _/ /__
>>   _\ \/ _ \/ _ `/ __/  '_/
>>  /___/ .__/\_,_/_/ /_/\_\   version 2.0.2
>> /_/
>>
>>   Using Scala version 2.11.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_102)
>>   Type in expressions to have them evaluated.
>>   Type :help for more information.
>>
>>   scala>
>>   scala> keyValueGrouped: org.apache.spark.sql.
>> KeyValueGroupedDataset[Int,(Int, Int)] = org.apache.spark.sql.
>> KeyValueGroupedDataset@70c30f72
>>
>>   scala> mapGroups: org.apache.spark.sql.Dataset[(Int, Int)] = [_1: int,
>> _2: int]
>>
>>   scala> broadcasted: org.apache.spark.broadcast.Broadcast[Int] =
>> Broadcast(0)
>>
>>   scala>
>>   scala>
>>   scala> dataset: org.apache.spark.sql.Dataset[Int] = [value: int]
>>
>>   scala> org.apache.spark.SparkException: Job aborted due to stage
>> failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task
>> 0.0 in stage 0.0 (TID 0, localhost): 
>> com.google.common.util.concurrent.ExecutionError:
>> java.lang.ClassCircularityError: io/netty/util/internal/__
>> matchers__/org/apache/spark/network/protocol/MessageMatcher
>>   at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2261)
>>   at com.google.common.cache.LocalCache.get(LocalCache.java:4000)
>>   at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:4004)
>>   at com.google.common.cache.LocalCache$LocalLoadingCache.
>> get(LocalCache.java:4874)
>>   at org.apache.spark.sql.catalyst.expressions.codegen.
>> CodeGenerator$.compile(CodeGenerator.scala:841)
>>   at org.apache.spark.sql.catalyst.expressions.codegen.
>> GenerateSafeProjection$.create(GenerateSafeProjection.scala:188)
>>   at org.apache.spark.sql.catalyst.expressions.codegen.
>> GenerateSafeProjection$.create(GenerateSafeProjection.scala:36)
>>   at org.apache.spark.sql.catalyst.expressions.codegen.
>> CodeGenerator.generate(CodeGenerator.scala:825)
>>   at org.apache.spark.sql.catalyst.expressions.codegen.
>> CodeGenerator.generate(CodeGenerator.scala:822)
>>   at org.apache.spark.sql.execution.ObjectOperator$.
>> deserializeRowToObject(objects.scala:137)
>>   at org.apache.spark.sql.execution.AppendColumnsExec$$
>> anonfun$9.apply(objects.scala:251)
>>   at org.apache.spark.sql.execution.AppendColumnsExec$$
>> anonfun$9.apply(objects.scala:250)
>>   at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$
>> 1$$anonfun$apply$24.apply(RDD.scala:803)
>>   at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$
>> 1$$anonfun$apply$24.apply(RDD.scala:803)
>>   at org.apache.spark.rdd.MapPartitionsRDD.compute(
>> MapPartitionsRDD.scala:38)
>>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
>>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
>>   at org.apache.spark.rdd.MapPartitionsRDD.compute(
>> MapPartitionsRDD.scala:38)
>>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
>>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
>>   at org.apache.spark.scheduler.ShuffleMapTask.runTask(
>> ShuffleMapTask.scala:79)
>>   at org.apache.spark.scheduler.ShuffleMapTask.runTask(
>> ShuffleMapTask.scala:47)
>>   at org.apache.spark.scheduler.Task.run(Task.scala:86)
>>   at org.apache.spark.executor.Executor$TaskRunner.run(
>> Executor.scala:274)
>>   at java.util.concurrent.ThreadPoolExecutor.runWorker(
>> 
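
As an aside on SPARK-17957 from the open-issues list above: for anyone who
wants to check whether their own queries hit it, the query shape the title
describes is an outer join, then na.fill(0), then an inner join on the
result. A minimal, hypothetical sketch of that shape (made-up column names
and data, not the reproduction attached to the JIRA) looks like this:

    import org.apache.spark.sql.SparkSession

    object Spark17957Shape {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .master("local[*]").appName("spark-17957-shape").getOrCreate()
        import spark.implicits._

        val left  = Seq((1, 10), (2, 20)).toDF("k", "a")
        val right = Seq((2, 200), (3, 300)).toDF("k", "b")
        val other = Seq((1, "x"), (2, "y"), (3, "z")).toDF("k", "c")

        // outer join -> na.fill(0) -> inner join
        val filled = left.join(right, Seq("k"), "outer").na.fill(0)
        val joined = filled.join(other, Seq("k"))

        // If the reported bug applies, joined has fewer rows than expected.
        println(s"filled=${filled.count()} joined=${joined.count()}")
        spark.stop()
      }
    }
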

Re: [VOTE] Release Apache Spark 2.0.2 (RC2)

2016-11-04 Thread Herman van Hövell tot Westerflier
+1

On Fri, Nov 4, 2016 at 7:20 PM, Michael Armbrust 
wrote:

> +1
>
> On Tue, Nov 1, 2016 at 9:51 PM, Reynold Xin  wrote:
>
>> Please vote on releasing the following candidate as Apache Spark version
>> 2.0.2. The vote is open until Fri, Nov 4, 2016 at 22:00 PDT and passes if a
>> majority of at least 3 +1 PMC votes are cast.
>>
>> [ ] +1 Release this package as Apache Spark 2.0.2
>> [ ] -1 Do not release this package because ...
>>
>>
>> The tag to be voted on is v2.0.2-rc2 (a6abe1ee22141931614bf27a4f371c46d8379e33)
>>
>> This release candidate resolves 84 issues: https://s.apache.org/spark-2.0.2-jira
>>
>> The release files, including signatures, digests, etc. can be found at:
>> http://people.apache.org/~pwendell/spark-releases/spark-2.0.2-rc2-bin/
>>
>> Release artifacts are signed with the following key:
>> https://people.apache.org/keys/committer/pwendell.asc
>>
>> The staging repository for this release can be found at:
>> https://repository.apache.org/content/repositories/orgapachespark-1210/
>>
>> The documentation corresponding to this release can be found at:
>> http://people.apache.org/~pwendell/spark-releases/spark-2.0.2-rc2-docs/
>>
>>
>> Q: How can I help test this release?
>> A: If you are a Spark user, you can help us test this release by taking
>> an existing Spark workload and running it on this release candidate, then
>> reporting any regressions from 2.0.1.
>>
>> Q: What justifies a -1 vote for this release?
>> A: This is a maintenance release in the 2.0.x series. Bugs already
>> present in 2.0.1, missing features, or bugs related to new features will
>> not necessarily block this release.
>>
>> Q: What fix version should I use for patches merging into branch-2.0 from
>> now on?
>> A: Please mark the fix version as 2.0.3, rather than 2.0.2. If a new RC
>> (i.e. RC3) is cut, I will change the fix version of those patches to 2.0.2.
>>
>
>
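
For anyone without a private workload at hand, even a small self-contained
smoke test built against the staged 2.0.2 artifacts (via the staging
repository above) gives a useful data point for the vote. The sketch below
(paths and sizes are arbitrary) just round-trips an aggregation through
Parquet; if it behaves or performs differently than on 2.0.1, that is worth
reporting on this thread:

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions._

    object Rc2SmokeTest {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .master("local[*]").appName("rc2-smoke").getOrCreate()
        import spark.implicits._

        // Small synthetic dataset and a simple aggregation.
        val df  = spark.range(0L, 1000000L).select($"id", ($"id" % 10).as("bucket"))
        val agg = df.groupBy("bucket").agg(count($"id").as("n"), avg("id").as("mean"))

        // Round-trip through Parquet and read it back.
        val out = "/tmp/rc2-smoke-parquet"   // arbitrary local path
        agg.write.mode("overwrite").parquet(out)
        spark.read.parquet(out).orderBy("bucket").show()

        spark.stop()
      }
    }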


Re: [VOTE] Release Apache Spark 2.0.2 (RC2)

2016-11-04 Thread Sean Owen
I guess it's worth explicitly stating that I think we need another RC one
way or the other because this test seems to consistently fail. It was a
(surprising) last-minute regression. I think I'd have to say -1 only for
this.

Reverting https://github.com/apache/spark/pull/15706 for branch-2.0 would
unblock this. There's also some discussion about an alternative resolution
for the test problem.

On Wed, Nov 2, 2016 at 5:44 PM Sean Owen  wrote:

> Sigs, license, etc are OK. There are no Blockers for 2.0.2, though here
> are the 4 issues still open:
>
> SPARK-14387 Enable Hive-1.x ORC compatibility with
> spark.sql.hive.convertMetastoreOrc
> SPARK-17957 Calling outer join and na.fill(0) and then inner join will
> miss rows
> SPARK-17981 Incorrectly Set Nullability to False in FilterExec
> SPARK-18160 spark.files & spark.jars should not be passed to driver in
> yarn mode
>
> Running with Java 8, -Pyarn -Phive -Phive-thriftserver -Phadoop-2.7 on
> Ubuntu 16, I am seeing consistent failures in this test below. I think we
> very recently changed this so it could be legitimate. But does anyone else
> see something like this? I have seen other failures in this test due to OOM
> but my MAVEN_OPTS allows 6g of heap, which ought to be plenty.
>
>
> - SPARK-18189: Fix serialization issue in KeyValueGroupedDataset ***
> FAILED ***
>   isContain was true Interpreter output contained 'Exception':
>   Welcome to
>   __
>/ __/__  ___ _/ /__
>   _\ \/ _ \/ _ `/ __/  '_/
>  /___/ .__/\_,_/_/ /_/\_\   version 2.0.2
> /_/
>
>   Using Scala version 2.11.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_102)
>   Type in expressions to have them evaluated.
>   Type :help for more information.
>
>   scala>
>   scala> keyValueGrouped:
> org.apache.spark.sql.KeyValueGroupedDataset[Int,(Int, Int)] =
> org.apache.spark.sql.KeyValueGroupedDataset@70c30f72
>
>   scala> mapGroups: org.apache.spark.sql.Dataset[(Int, Int)] = [_1: int,
> _2: int]
>
>   scala> broadcasted: org.apache.spark.broadcast.Broadcast[Int] =
> Broadcast(0)
>
>   scala>
>   scala>
>   scala> dataset: org.apache.spark.sql.Dataset[Int] = [value: int]
>
>   scala> org.apache.spark.SparkException: Job aborted due to stage
> failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task
> 0.0 in stage 0.0 (TID 0, localhost):
> com.google.common.util.concurrent.ExecutionError:
> java.lang.ClassCircularityError:
> io/netty/util/internal/__matchers__/org/apache/spark/network/protocol/MessageMatcher
>   at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2261)
>   at com.google.common.cache.LocalCache.get(LocalCache.java:4000)
>   at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:4004)
>   at
> com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4874)
>   at
> org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.compile(CodeGenerator.scala:841)
>   at
> org.apache.spark.sql.catalyst.expressions.codegen.GenerateSafeProjection$.create(GenerateSafeProjection.scala:188)
>   at
> org.apache.spark.sql.catalyst.expressions.codegen.GenerateSafeProjection$.create(GenerateSafeProjection.scala:36)
>   at
> org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator.generate(CodeGenerator.scala:825)
>   at
> org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator.generate(CodeGenerator.scala:822)
>   at
> org.apache.spark.sql.execution.ObjectOperator$.deserializeRowToObject(objects.scala:137)
>   at
> org.apache.spark.sql.execution.AppendColumnsExec$$anonfun$9.apply(objects.scala:251)
>   at
> org.apache.spark.sql.execution.AppendColumnsExec$$anonfun$9.apply(objects.scala:250)
>   at
> org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:803)
>   at
> org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:803)
>   at
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
>   at
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
>   at
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:79)
>   at
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:47)
>   at org.apache.spark.scheduler.Task.run(Task.scala:86)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
>   at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
>   Caused by: java.lang.ClassCircularityError:
> io/netty/util/internal/__matchers__/org/apache/spark/network/protocol/MessageMatcher

Re: [VOTE] Release Apache Spark 2.0.2 (RC2)

2016-11-04 Thread Joseph Bradley
+1

On Fri, Nov 4, 2016 at 11:20 AM, Michael Armbrust 
wrote:

> +1
>
> On Tue, Nov 1, 2016 at 9:51 PM, Reynold Xin  wrote:
>
>> Please vote on releasing the following candidate as Apache Spark version
>> 2.0.2. The vote is open until Fri, Nov 4, 2016 at 22:00 PDT and passes if a
>> majority of at least 3 +1 PMC votes are cast.
>>
>> [ ] +1 Release this package as Apache Spark 2.0.2
>> [ ] -1 Do not release this package because ...
>>
>>
>> The tag to be voted on is v2.0.2-rc2 (a6abe1ee22141931614bf27a4f371c46d8379e33)
>>
>> This release candidate resolves 84 issues: https://s.apache.org/spark-2.0.2-jira
>>
>> The release files, including signatures, digests, etc. can be found at:
>> http://people.apache.org/~pwendell/spark-releases/spark-2.0.2-rc2-bin/
>>
>> Release artifacts are signed with the following key:
>> https://people.apache.org/keys/committer/pwendell.asc
>>
>> The staging repository for this release can be found at:
>> https://repository.apache.org/content/repositories/orgapachespark-1210/
>>
>> The documentation corresponding to this release can be found at:
>> http://people.apache.org/~pwendell/spark-releases/spark-2.0.2-rc2-docs/
>>
>>
>> Q: How can I help test this release?
>> A: If you are a Spark user, you can help us test this release by taking
>> an existing Spark workload and running it on this release candidate, then
>> reporting any regressions from 2.0.1.
>>
>> Q: What justifies a -1 vote for this release?
>> A: This is a maintenance release in the 2.0.x series. Bugs already
>> present in 2.0.1, missing features, or bugs related to new features will
>> not necessarily block this release.
>>
>> Q: What fix version should I use for patches merging into branch-2.0 from
>> now on?
>> A: Please mark the fix version as 2.0.3, rather than 2.0.2. If a new RC
>> (i.e. RC3) is cut, I will change the fix version of those patches to 2.0.2.
>>
>
>


Re: [VOTE] Release Apache Spark 2.0.2 (RC2)

2016-11-04 Thread Michael Armbrust
+1

On Tue, Nov 1, 2016 at 9:51 PM, Reynold Xin  wrote:

> Please vote on releasing the following candidate as Apache Spark version
> 2.0.2. The vote is open until Fri, Nov 4, 2016 at 22:00 PDT and passes if a
> majority of at least 3 +1 PMC votes are cast.
>
> [ ] +1 Release this package as Apache Spark 2.0.2
> [ ] -1 Do not release this package because ...
>
>
> The tag to be voted on is v2.0.2-rc2 (a6abe1ee22141931614bf27a4f371c46d8379e33)
>
> This release candidate resolves 84 issues: https://s.apache.org/spark-2.0.2-jira
>
> The release files, including signatures, digests, etc. can be found at:
> http://people.apache.org/~pwendell/spark-releases/spark-2.0.2-rc2-bin/
>
> Release artifacts are signed with the following key:
> https://people.apache.org/keys/committer/pwendell.asc
>
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapachespark-1210/
>
> The documentation corresponding to this release can be found at:
> http://people.apache.org/~pwendell/spark-releases/spark-2.0.2-rc2-docs/
>
>
> Q: How can I help test this release?
> A: If you are a Spark user, you can help us test this release by taking an
> existing Spark workload and running it on this release candidate, then
> reporting any regressions from 2.0.1.
>
> Q: What justifies a -1 vote for this release?
> A: This is a maintenance release in the 2.0.x series. Bugs already present
> in 2.0.1, missing features, or bugs related to new features will not
> necessarily block this release.
>
> Q: What fix version should I use for patches merging into branch-2.0 from
> now on?
> A: Please mark the fix version as 2.0.3, rather than 2.0.2. If a new RC
> (i.e. RC3) is cut, I will change the fix version of those patches to 2.0.2.
>


Anyone want to weigh in on a Kafka DStreams api change?

2016-11-04 Thread Cody Koeninger
SPARK-17510

https://github.com/apache/spark/pull/15132

It's for allowing tweaking of rate limiting on a per-partition basis
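
For context, the idea is that the maximum ingest rate can differ per
topic-partition instead of being one global number. Very roughly, the
concept looks like the sketch below; the trait and method names here are
placeholders for illustration only, not the API proposed in the PR, so see
the PR itself for the real shape:

    // Hypothetical illustration of per-partition rate limiting; not the
    // actual interface added by SPARK-17510 / PR 15132.
    trait PerPartitionRateLimit {
      def maxMessagesPerSecond(topic: String, partition: Int): Long
    }

    // Example policy: throttle one known-hot partition harder than the rest.
    class HotPartitionLimit(defaultRate: Long, hotRate: Long)
        extends PerPartitionRateLimit {
      override def maxMessagesPerSecond(topic: String, partition: Int): Long =
        if (topic == "events" && partition == 0) hotRate else defaultRate
    }

    object RateLimitDemo {
      def main(args: Array[String]): Unit = {
        val limit = new HotPartitionLimit(defaultRate = 10000L, hotRate = 1000L)
        println(limit.maxMessagesPerSecond("events", 0))   // 1000
        println(limit.maxMessagesPerSecond("events", 3))   // 10000
      }
    }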




Re: Continuous warning while consuming using new kafka-spark010 API

2016-11-04 Thread Cody Koeninger
I answered the duplicate post on the user mailing list, I'd say keep
the discussion there.

On Fri, Nov 4, 2016 at 12:14 PM, vonnagy  wrote:
> Nitin,
>
> I am getting similar issues using Spark 2.0.1 and Kafka 0.10. I have two
> jobs, one that uses a Kafka stream and one that uses just the KafkaRDD.
>
> With the KafkaRDD, I continually get the "Failed to get records" warning. I have
> adjusted the polling with `spark.streaming.kafka.consumer.poll.ms` and the
> size of records with Kafka's `max.poll.records`. Even when it gets records
> it is extremely slow.
>
> When working with multiple KafkaRDDs in parallel I get the dreaded
> `ConcurrentModificationException`. The Spark logic is supposed to use a
> CachedKafkaConsumer based on the topic and partition. This is supposed to
> guarantee thread safety, but I continually get this error along with the
> polling timeout.
>
> Has anyone else tried to use Spark 2 with Kafka 0.10 and had any success? At
> this point it is completely useless in my experience. With Spark 1.6 and
> Kafka 0.8.x, I never had these problems.
>
>
>
> --
> View this message in context: 
> http://apache-spark-developers-list.1001551.n3.nabble.com/Continuous-warning-while-consuming-using-new-kafka-spark010-API-tp18987p19736.html
> Sent from the Apache Spark Developers List mailing list archive at Nabble.com.
>
>




Hadoop Summit EU 2017

2016-11-04 Thread Owen O'Malley
The DataWorks Summit EU 2017 (including Hadoop Summit) is going to be in
Munich on April 5-6, 2017. I’ve pasted the text from the CFP below.

Would you like to share your knowledge with the best and brightest in the
data community? If so, we encourage you to submit an abstract for DataWorks
Summit with Hadoop Summit being held on April 5-6, 2017 at The ICM –
International Congress Center Munich.

DataWorks Summit with Hadoop Summit is the premier event for business and
technical audiences who want to learn how data is transforming business and
the underlying technologies that are driving that change.

Our 2017 tracks include:
· Applications
· Enterprise Adoption
· Data Processing & Warehousing
· Apache Hadoop Core Internals
· Governance & Security
· IoT & Streaming
· Cloud & Operations
· Apache Spark & Data Science
For questions or additional information, please contact Joshua Woodward.

Deadline: Friday, November 11, 2016.
Submission Link: http://dataworkssummit.com/munich-2017/abstracts/submit-abstract/

.. Owen


Re: Continuous warning while consuming using new kafka-spark010 API

2016-11-04 Thread vonnagy
Nitin,

I am getting similar issues using Spark 2.0.1 and Kafka 0.10. I have two
jobs, one that uses a Kafka stream and one that uses just the KafkaRDD.

With the KafkaRDD, I continually get the "Failed to get records" warning. I have
adjusted the polling with `spark.streaming.kafka.consumer.poll.ms` and the
size of records with Kafka's `max.poll.records`. Even when it gets records
it is extremely slow.

When working with multiple KafkaRDDs in parallel I get the dreaded
`ConcurrentModificationException`. The Spark logic is supposed to use a
CachedKafkaConsumer based on the topic and partition. This is supposed to
guarantee thread safety, but I continually get this error along with the
polling timeout.

Has anyone else tried to use Spark 2 with Kafka 0.10 and had any success? At
this point it is completely useless in my experience. With Spark 1.6 and
Kafka 0.8.x, I never had these problems.



--
View this message in context: 
http://apache-spark-developers-list.1001551.n3.nabble.com/Continuous-warning-while-consuming-using-new-kafka-spark010-API-tp18987p19736.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.
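
For reference, the two knobs mentioned above live in different places: the
poll timeout is a Spark configuration, while max.poll.records is an ordinary
Kafka consumer parameter. A minimal sketch of a kafka-0-10 direct stream
with both set (broker address, topic, group id and the values themselves are
illustrative) would be:

    import org.apache.kafka.common.serialization.StringDeserializer
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka010._

    object Kafka010Stream {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf()
          .setAppName("kafka-0-10-example")
          .setMaster("local[2]")   // local run only; streaming needs >= 2 cores
          // Poll timeout used by the cached consumer; too small a value leads
          // to the "Failed to get records" message discussed above.
          .set("spark.streaming.kafka.consumer.poll.ms", "10000")
        val ssc = new StreamingContext(conf, Seconds(5))

        val kafkaParams = Map[String, Object](
          "bootstrap.servers" -> "localhost:9092",
          "key.deserializer" -> classOf[StringDeserializer],
          "value.deserializer" -> classOf[StringDeserializer],
          "group.id" -> "example-group",
          "auto.offset.reset" -> "latest",
          "enable.auto.commit" -> (false: java.lang.Boolean),
          // Cap how many records a single poll() may return.
          "max.poll.records" -> "500"
        )

        val stream = KafkaUtils.createDirectStream[String, String](
          ssc,
          LocationStrategies.PreferConsistent,
          ConsumerStrategies.Subscribe[String, String](Seq("events"), kafkaParams)
        )
        stream.foreachRDD(rdd => println(s"batch size: ${rdd.count()}"))

        ssc.start()
        ssc.awaitTermination()
      }
    }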




Re: [VOTE] Release Apache Spark 2.0.2 (RC2)

2016-11-04 Thread Yin Huai
+1

On Tue, Nov 1, 2016 at 9:51 PM, Reynold Xin  wrote:

> Please vote on releasing the following candidate as Apache Spark version
> 2.0.2. The vote is open until Fri, Nov 4, 2016 at 22:00 PDT and passes if a
> majority of at least 3 +1 PMC votes are cast.
>
> [ ] +1 Release this package as Apache Spark 2.0.2
> [ ] -1 Do not release this package because ...
>
>
> The tag to be voted on is v2.0.2-rc2 (a6abe1ee22141931614bf27a4f371c46d8379e33)
>
> This release candidate resolves 84 issues: https://s.apache.org/spark-2.0.2-jira
>
> The release files, including signatures, digests, etc. can be found at:
> http://people.apache.org/~pwendell/spark-releases/spark-2.0.2-rc2-bin/
>
> Release artifacts are signed with the following key:
> https://people.apache.org/keys/committer/pwendell.asc
>
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapachespark-1210/
>
> The documentation corresponding to this release can be found at:
> http://people.apache.org/~pwendell/spark-releases/spark-2.0.2-rc2-docs/
>
>
> Q: How can I help test this release?
> A: If you are a Spark user, you can help us test this release by taking an
> existing Spark workload and running it on this release candidate, then
> reporting any regressions from 2.0.1.
>
> Q: What justifies a -1 vote for this release?
> A: This is a maintenance release in the 2.0.x series. Bugs already present
> in 2.0.1, missing features, or bugs related to new features will not
> necessarily block this release.
>
> Q: What fix version should I use for patches merging into branch-2.0 from
> now on?
> A: Please mark the fix version as 2.0.3, rather than 2.0.2. If a new RC
> (i.e. RC3) is cut, I will change the fix version of those patches to 2.0.2.
>


Re: [VOTE] Release Apache Spark 1.6.3 (RC2)

2016-11-04 Thread Ricardo Almeida
+1 (non-binding)

tested over Ubuntu / OpenJDK 1.8.0_111

On 4 November 2016 at 10:00, Sean Owen  wrote:

> Likewise, ran my usual tests on Ubuntu with 
> yarn/hive/hive-thriftserver/hadoop-2.6
> on JDK 8 and all passed. Sigs and licenses are OK. +1
>
>
> On Thu, Nov 3, 2016 at 7:57 PM Herman van Hövell tot Westerflier <
> hvanhov...@databricks.com> wrote:
>
>> +1
>>
>> On Thu, Nov 3, 2016 at 6:58 PM, Michael Armbrust 
>> wrote:
>>
>> +1
>>
>> On Wed, Nov 2, 2016 at 5:40 PM, Reynold Xin  wrote:
>>
>> Please vote on releasing the following candidate as Apache Spark version
>> 1.6.3. The vote is open until Sat, Nov 5, 2016 at 18:00 PDT and passes if a
>> majority of at least 3 +1 PMC votes are cast.
>>
>> [ ] +1 Release this package as Apache Spark 1.6.3
>> [ ] -1 Do not release this package because ...
>>
>>
>> The tag to be voted on is v1.6.3-rc2 (1e860747458d74a4ccbd081103a0542a2367b14b)
>>
>> This release candidate addresses 52 JIRA tickets:
>> https://s.apache.org/spark-1.6.3-jira
>>
>> The release files, including signatures, digests, etc. can be found at:
>> http://people.apache.org/~pwendell/spark-releases/spark-1.6.3-rc2-bin/
>>
>> Release artifacts are signed with the following key:
>> https://people.apache.org/keys/committer/pwendell.asc
>>
>> The staging repository for this release can be found at:
>> https://repository.apache.org/content/repositories/orgapachespark-1212/
>>
>> The documentation corresponding to this release can be found at:
>> http://people.apache.org/~pwendell/spark-releases/spark-1.6.3-rc2-docs/
>>
>>
>> ===
>> == How can I help test this release?
>> ===
>> If you are a Spark user, you can help us test this release by taking an
>> existing Spark workload and running it on this release candidate, then
>> reporting any regressions from 1.6.2.
>>
>> 
>> == What justifies a -1 vote for this release?
>> 
>> This is a maintenance release in the 1.6.x series.  Bugs already present
>> in 1.6.2, missing features, or bugs related to new features will not
>> necessarily block this release.
>>
>>
>>
>>


Re: [VOTE] Release Apache Spark 1.6.3 (RC2)

2016-11-04 Thread Sean Owen
Likewise, ran my usual tests on Ubuntu with
yarn/hive/hive-thriftserver/hadoop-2.6 on JDK 8 and all passed. Sigs and
licenses are OK. +1

On Thu, Nov 3, 2016 at 7:57 PM Herman van Hövell tot Westerflier <
hvanhov...@databricks.com> wrote:

> +1
>
> On Thu, Nov 3, 2016 at 6:58 PM, Michael Armbrust 
> wrote:
>
> +1
>
> On Wed, Nov 2, 2016 at 5:40 PM, Reynold Xin  wrote:
>
> Please vote on releasing the following candidate as Apache Spark version
> 1.6.3. The vote is open until Sat, Nov 5, 2016 at 18:00 PDT and passes if a
> majority of at least 3 +1 PMC votes are cast.
>
> [ ] +1 Release this package as Apache Spark 1.6.3
> [ ] -1 Do not release this package because ...
>
>
> The tag to be voted on is v1.6.3-rc2
> (1e860747458d74a4ccbd081103a0542a2367b14b)
>
> This release candidate addresses 52 JIRA tickets:
> https://s.apache.org/spark-1.6.3-jira
>
> The release files, including signatures, digests, etc. can be found at:
> http://people.apache.org/~pwendell/spark-releases/spark-1.6.3-rc2-bin/
>
> Release artifacts are signed with the following key:
> https://people.apache.org/keys/committer/pwendell.asc
>
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapachespark-1212/
>
> The documentation corresponding to this release can be found at:
> http://people.apache.org/~pwendell/spark-releases/spark-1.6.3-rc2-docs/
>
>
> ===
> == How can I help test this release?
> ===
> If you are a Spark user, you can help us test this release by taking an
> existing Spark workload and running it on this release candidate, then
> reporting any regressions from 1.6.2.
>
> 
> == What justifies a -1 vote for this release?
> 
> This is a maintenance release in the 1.6.x series.  Bugs already present
> in 1.6.2, missing features, or bugs related to new features will not
> necessarily block this release.
>
>
>
>