I am finding the Dataset API very cumbersome to use, which is
unfortunate, as I was looking forward to the type safety after coming from
a DataFrame codebase.
This link summarizes my troubles: http://loicdescotte.github.io/posts/spark2-datasets-type-safety/
The problem is having to c
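The trade-off being pointed at can be sketched without Spark at all. The following plain-Scala sketch (illustrative only; `Person`, its fields, and the values are made up, and `Map` lookup stands in for DataFrame column access) contrasts compile-time-checked field access, as with a typed Dataset, against stringly-typed lookup, as with a DataFrame's column names:

```scala
// Plain-Scala sketch (no Spark dependency) of the type-safety trade-off.
// `Person` and its fields are made-up names for illustration.
case class Person(name: String, age: Int)

object TypeSafetySketch {
  def main(args: Array[String]): Unit = {
    val people = Seq(Person("alice", 30), Person("bob", 25))

    // Dataset-style: typed access; a typo such as `_.nmae` fails to compile.
    val names = people.map(_.name)

    // DataFrame-style analogue: stringly-typed lookup; a typo such as
    // rows.map(_("agee")) only blows up at runtime.
    val rows = people.map(p => Map("name" -> p.name, "age" -> p.age))
    val ages = rows.map(_("age"))

    println(names.mkString(","))
    println(ages.mkString(","))
  }
}
```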
Spark-ts has been under development for a while, so I doubt there is any
integration with Structured Streaming. That said, Structured Streaming uses
DataFrames and Datasets, and a lot of existing libraries built on
Datasets/DataFrames should work directly, especially if they are map-like
functions.
*Environment*:
AWS EMR, yarn cluster.
*Description*:
I am trying to use a Java servlet filter to protect access to the Spark UI by
using the property spark.ui.filters; the problem is that when Spark is
running in YARN mode, that property is always overridden by Hadoop
with the filter org.apach
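For reference, spark.ui.filters takes a comma-separated list of servlet filter class names, with per-filter parameters supplied as spark.&lt;filter class&gt;.param.&lt;name&gt;=&lt;value&gt;. A minimal sketch, assuming a hypothetical filter class com.example.MyAuthFilter (this only shows the intended configuration; it does not address the Hadoop-injected filter described above):

```shell
spark-submit \
  --conf spark.ui.filters=com.example.MyAuthFilter \
  --conf spark.com.example.MyAuthFilter.param.secret=changeme \
  ...
```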
*Environment:*
AWS EMR, yarn cluster.
*Description:*
In the Spark UI, in the Environment and Executors tabs, the links to stdout and
stderr point to the internal addresses of the executors. This would imply
exposing the executors so that the links can be accessed. Shouldn't those links
be pointed to master
That source repo is at https://github.com/palantir/spark/ with artifacts
published to Palantir's Bintray at
https://palantir.bintray.com/releases/org/apache/spark/. If you're seeing
any of them in Maven Central, please flag it, as that's a mistake!
Andrew
On Tue, Jan 9, 2018 at 10:10 AM, Sean Owen w
Just to follow up -- those are actually in a Palantir repo, not Central.
Deploying to Central would be uncourteous, but this approach is legitimate
and how it has to work for vendors to release distros of Spark etc.
On Tue, Jan 9, 2018 at 11:43 AM Nan Zhu wrote:
> Hi, all
>
> Out of curious, I j
nvm
On Tue, Jan 9, 2018 at 9:42 AM, Nan Zhu wrote:
> Hi, all
>
> Out of curiosity, I just found a bunch of Palantir releases under
> org.apache.spark in Maven Central
> (https://mvnrepository.com/artifact/org.apache.spark/spark-core_2.11)?
>
> Is it on purpose?
>
> Best,
>
> Nan
Hi, all
Out of curiosity, I just found a bunch of Palantir releases under
org.apache.spark in Maven Central
(https://mvnrepository.com/artifact/org.apache.spark/spark-core_2.11)?
Is it on purpose?
Best,
Nan
Hi everyone!
I'm trying to pass -D properties (JVM properties) to a Spark application in which
we have some UDAFs (User Defined Aggregate Functions) that will read those
properties (using System.getProperty()). The problem is, the properties are
never there when we try to read them.
According to th
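One common workaround (a sketch, not necessarily this poster's setup; my.prop is a hypothetical property name): -D flags passed directly on the spark-submit command line reach only the launcher JVM, so to make them visible to System.getProperty() inside the driver and executor JVMs they are usually forwarded via the extraJavaOptions settings:

```shell
spark-submit \
  --conf "spark.driver.extraJavaOptions=-Dmy.prop=value" \
  --conf "spark.executor.extraJavaOptions=-Dmy.prop=value" \
  ...
```

Note that a UDAF runs on the executors, so the executor-side setting is the one that matters for code invoked inside it.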
I may be wrong here, but when I look at Apache Pig on GitHub, it says there
are 8 contributors, and when I look at Apache Spark on GitHub, it says there
are more than 1,000 contributors. And if the above is true, I ask
myself: why not shift to Spark by learning it?
I also started with map redu
Hi,
I am totally confused here, maybe because I do not exactly understand
this, but why is this required? I have always used the Spark UI and found it
more than sufficient. And if you know a bit about how a Spark session works,
then your performance does have a certain degree of predictability as well