Though I'd officially -1 this while there are still many blockers, it
should certainly be tested as usual, because they're mostly doc and
"audit"-type issues.
On Wed, Jun 22, 2016 at 2:26 AM, Reynold Xin wrote:
> Please vote on releasing the following candidate as Apache Spark version
> 2.0.0. The vote is open until Friday, June 24, 2016 at 19:00 PDT and passes
> if a majority of at least 3 +1 PMC votes are cast.
What will the scenario be in the case of S3 and the local file system?
On Tue, Jun 21, 2016 at 4:36 PM, Jörn Franke wrote:
> Based on the underlying Hadoop FileFormat. This one does it mostly based
> on block size. You can change this, though.
>
> On 21 Jun 2016, at 12:19, Sachin Aggarwal
> wrote:
>
>
> When we use readStream to read data as a stream, how does Spark decide the
> number of RDDs, and the partitions within each RDD, with respect to storage
> and file format?
SPARK-12818 is about building a bloom filter on existing data. It has
nothing to do with the ORC bloom filter, which can be used to do predicate
pushdown.
On Tue, Jun 21, 2016 at 7:45 PM, BaiRan wrote:
> Hi all,
>
> I have a question about the bloom filter implementation in the SPARK-12818
> issue. If I have an ORC file with bloom filter metadata, how can I utilise
> it from Spark SQL?
Hi all,
I have a question about the bloom filter implementation in the SPARK-12818
issue. If I have an ORC file with bloom filter metadata, how can I utilise it
from Spark SQL?
Thanks.
Best,
Ran
Please vote on releasing the following candidate as Apache Spark version
2.0.0. The vote is open until Friday, June 24, 2016 at 19:00 PDT and passes
if a majority of at least 3 +1 PMC votes are cast.
[ ] +1 Release this package as Apache Spark 2.0.0
[ ] -1 Do not release this package because ...
Hey Pete,
I just pushed your PR to branch 1.6. As it's not a blocker, it may or may
not be in 1.6.2, depending on whether there will be another RC.
On Tue, Jun 21, 2016 at 1:36 PM, Pete Robbins wrote:
> It breaks Spark running on machines with fewer than 3 cores/threads, which
> may be rare, and may be an edge case.
It breaks Spark running on machines with fewer than 3 cores/threads, which
may be rare, and may be an edge case.
Personally, I like to fix known bugs, and the fact that there are other
blocking methods in event loops actually makes it worse not to fix the ones
you know about.
Probably not a blocker.
Nice one, yeah indeed I was doing an incremental build. Not a blocker.
I'll have a look into the others, though I suspect they're problems
with tests rather than production code.
On Tue, Jun 21, 2016 at 6:53 PM, Marcelo Vanzin wrote:
> On Tue, Jun 21, 2016 at 10:49 AM, Sean Owen wrote:
>> I'm getting some errors building on Ubuntu 16 + Java 7.
On Tue, Jun 21, 2016 at 10:49 AM, Sean Owen wrote:
> I'm getting some errors building on Ubuntu 16 + Java 7. First is one
> that may just be down to a Scala bug:
>
> [ERROR] bad symbolic reference. A signature in WebUI.class refers to
> term eclipse
> in package org which is not available.
This i
I'm getting some errors building on Ubuntu 16 + Java 7. First is one
that may just be down to a Scala bug:
[ERROR] bad symbolic reference. A signature in WebUI.class refers to
term eclipse
in package org which is not available.
It may be completely missing from the current classpath, or the version on
the classpath might be incompatible with the version used when compiling
WebUI.class.
Hey Pete,
I didn't backport it to 1.6 because it just affects tests in most cases.
I'm sure we also have other places calling blocking methods in the event
loops, so similar issues are still there even after applying this patch.
Hence, I don't think it's a blocker for 1.6.2.
On Tue, Jun 21, 2016
Hi,
Beginner in Spark development. Took time to configure Eclipse + Scala. Is
there any tutorial that can help beginners?
Still struggling to find Spark JAR files for development. There is no lib
folder in my Spark distribution (neither in the pre-built nor in the
custom-built one).
Regards,
I think it is valuable to make the distance function pluggable and also to
provide some built-in distance functions. This might also be useful for other
algorithms besides KMeans.
On Tue, Jun 21, 2016 at 7:48 PM, Simon NANTY
wrote:
> Hi all,
>
>
>
> In my team, we are currently developing a fork of Spark MLlib extending the
> K-means method so that it is possible to set one's own distance function.
Based on the underlying Hadoop FileFormat. This one does it mostly based on
block size. You can change this, though.
> On 21 Jun 2016, at 12:19, Sachin Aggarwal wrote:
>
>
> When we use readStream to read data as a stream, how does Spark decide the
> number of RDDs, and the partitions within each RDD, with respect to storage
> and file format?
Hi all,
In my team, we are currently developing a fork of Spark MLlib extending the
K-means method so that it is possible to set one's own distance function. In
this implementation, it would be possible to directly pass, as an argument to
the K-means train function, a distance function whose signature
You can read this documentation to get started with the setup
https://cwiki.apache.org/confluence/display/SPARK/Useful+Developer+Tools#UsefulDeveloperTools-IntelliJ
There was a pyspark setup discussion on SO over here
http://stackoverflow.com/questions/33478218/write-and-run-pyspark-in-intellij-i
When we use readStream to read data as a stream, how does Spark decide the
number of RDDs, and the partitions within each RDD, with respect to storage
and file format?
val dsJson = sqlContext.readStream.json("/Users/sachin/testSpark/inputJson")
val dsCsv = sqlContext.readStream.option("header", "true").csv("/Users
Hi Guys
I have got the Aggregator in Spark 2.0 working for case classes, primitive
types, and some complex types like Seq[], but when I use Maps or
multi-dimensional arrays I get an exception at runtime. Is this supported,
or am I doing something wrong? Here is a code snippet and a stack trace.
The PR (https://github.com/apache/spark/pull/13055) to fix
https://issues.apache.org/jira/browse/SPARK-15262 was applied to 1.6.2;
however, this fix caused another issue,
https://issues.apache.org/jira/browse/SPARK-15606, the fix for which
(https://github.com/apache/spark/pull/13355) has not been backported.