Hi,
Use subscribePattern.
You haven't googled well enough -->
https://jaceklaskowski.gitbooks.io/spark-structured-streaming/spark-sql-streaming-KafkaSource.html
:)
Regards,
Jacek Laskowski
https://about.me/JacekLaskowski
Spark Structured Streaming (Apache Spark 2.2+)
https://bit.ly/spark-structured-streaming
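P.S. A minimal sketch of what that looks like (untested; the broker address
and the topic regex are placeholders):

  import org.apache.spark.sql.SparkSession

  val spark = SparkSession.builder.getOrCreate()
  // subscribePattern subscribes to every topic matching the regex,
  // including topics created after the query starts
  val records = spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092") // placeholder broker
    .option("subscribePattern", "topic-.*")              // placeholder regex
    .load()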
gitbook or any other place you point to :) Thanks!
[1]
https://github.com/apache/spark/commit/18066f2e61f430b691ed8a777c9b4e5786bf9dbc
Regards,
Jacek Laskowski
https://about.me/JacekLaskowski
Spark Structured Streaming (Apache Spark 2.2+)
https://bit.ly/spark-structured-streaming
Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark
Hi,
https://stackoverflow.com/q/46032001/1305344 :)
Regards,
Jacek Laskowski
https://about.me/JacekLaskowski
Spark Structured Streaming (Apache Spark 2.2+)
https://bit.ly/spark-structured-streaming
Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
https://youtu.be/JAb4FIheP28
Regards,
Jacek Laskowski
https://about.me/JacekLaskowski
Spark Structured Streaming (Apache Spark 2.2+)
https://bit.ly/spark-structured-streaming
Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
On Fri, Sep
offer!
[1] https://stackoverflow.com/q/46022876/1305344
Regards,
Jacek Laskowski
https://about.me/JacekLaskowski
Spark Structured Streaming (Apache Spark 2.2+)
https://bit.ly/spark-structured-streaming
Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
d appreciate any help. Thanks!
[1]
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/statefulOperators.scala#L249
[2]
https://jaceklaskowski.gitbooks.io/spark-structured-streaming/spark-sql-streaming-StateStoreSaveExec.html
[3]
https
and would appreciate some
more help.
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
On Sat, Aug 19, 2017 at 12:10 AM, Burak Yavuz wrote:
> Hi Jacek,
>
>
sink accepting the flag as enabled, which would make the memory sink the
only one left with the flag enabled for Complete output.
And I thought I'd been close to understanding Structured Streaming :)
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark
-memory-hungry memory sink require yet another thing to get the
query working.
On to exploring the bits...
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
On Fri, Aug 18
at org.apache.spark.sql.streaming.StreamingQueryManager.startQuery(StreamingQueryManager.scala:278)
at
org.apache.spark.sql.streaming.DataStreamWriter.start(DataStreamWriter.scala:249)
... 57 elided
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark
Hi,
Any logs you could share? Anything about the query itself? Watermarked?
Aggregation? How long does it work fine? Is this somehow stable in its
instability? What version of Spark and Kafka?
Regards,
Jacek Laskowski
http://blog.japila.pl
On 11 Aug 2017 11:29, "NikhilP" wrote:
>
Hi Michael,
That reflects my sentiments so well. Thanks for having confirmed my thoughts!
https://issues.apache.org/jira/browse/SPARK-21667
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
81de598ed657de7R277.
Why is this needed? I can't think of a use case where the console sink
could not recover from the checkpoint location (since all the information
is available). I'm lost on it and would appreciate some help (to
recover :))
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Hi Myrle,
You're welcome. Pleasure's all mine.
Could you please replace Spark Streaming (technically a dead end) with
the modern Structured Streaming? That's what I'd be shooting for.
Thanks.
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark
g publicly to invite others to have their chance. I could
co-present if that's your first talk.
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
On Tue, Jul 18, 2
Hi,
I'd like to hear the official statement too.
My take on GraphX and Spark Streaming is that they are long dead projects
with GraphFrames and Structured Streaming taking their place, respectively.
Jacek
On 13 May 2017 3:00 p.m., "Sergey Zhemzhitsky" wrote:
> Hello Spark users,
>
> I just wo
Thanks Stephen! I appreciate it very much.
And yeah...Stephen is right on this. Go and read the notes and let me know
where you're missing things :-)
p.s. Holden has just announced that her book is complete and I think Matei is
also quite far along with his writing.
Jacek
On 4 May 2017 2:52 a.m., "Step
sion point).
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
On Thu, Apr 27, 2017 at 3:22 PM, Lavelle, Shawn
wrote:
> Hi Jacek,
>
>
>
> I know that
Hi,
Good progress!
Can you remove the metastore_db directory and start ./bin/pyspark over? I
don't think starting it from ~ is necessary.
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
explain it and you'll know what happens under the covers, i.e. use
explain on the Dataset.
Jacek
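P.S. A tiny example of what I mean (spark is the usual SparkSession in
spark-shell):

  import spark.implicits._

  val q = spark.range(10).filter('id % 2 === 0)
  // extended = true prints the parsed, analyzed, optimized and physical plans
  q.explain(extended = true)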
On 25 Apr 2017 12:46 a.m., "Lavelle, Shawn" wrote:
> Hello Spark Users!
>
>Does the Spark Optimization engine reduce overlapping column ranges?
> If so, should it push this down to a Data Sourc
Hi,
You've got two Spark sessions up and running (and given Spark SQL uses a
Derby-managed Hive metastore, hence the issue).
Please don't start spark-submit from inside bin; rather, use bin/spark-submit...
Jacek
On 26 Apr 2017 1:57 a.m., "Afshin, Bardia"
wrote:
I’m having issues when I fire up pyspar
Hi,
What's the alternative? Dataset? You've got textFile then.
It's an older API from the ages when Dataset was merely experimental.
Jacek
On 29 Mar 2017 8:58 p.m., "George Obama" wrote:
> Hi,
>
> I saw that the API, either R or Scala, we are returning DataFrame for
> sparkSession.read.text()
Hi,
If your Spark app uses snappy in the code, define an appropriate library
dependency to have it on classpath. Don't rely on transitive dependencies.
Jacek
On 7 Apr 2017 8:34 a.m., "satishl" wrote:
Hi, I am planning to process spark app eventlogs with another spark app.
These event logs are
Thanks Koert for the kind words. That part however is easy to fix and
I was surprised to have seen the old style referenced (!)
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
Hi,
I'm very sorry for not being up to date with the current style (and
"promoting" the old style) and am going to review that part soon. I'm very
close to touching it again since I'm working with the Optimizer these days.
Jacek
On 5 Apr 2017 6:08 a.m., "Kazuaki Ishizaki" wrote:
> Hi,
> The page in the URL e
Hi,
Answering your question from the title (that seems different from what's in
the email) and leaving the other part of how to do it using a DI framework
to others.
Spark does not use any DI framework internally and wires components itself.
Jacek
On 2 Apr 2017 3:29 p.m., "kant kodali" wrote:
Hi Hyukjin,
It was a false alarm as I had a local change to `def schema` in
`Dataset` that caused the issue.
I apologize for the noise, and thanks a lot for the prompt
response. I appreciate it.
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark
ng (nullable = false)
p.s. http://stackoverflow.com/q/43041975/1305344
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
--
Hi,
I think it's the size of the type used to count the partitions, which is
Int. I don't think there's another reason.
Jacek
On 23 Feb 2017 5:01 a.m., "Parag Chaudhari" wrote:
> Hi,
>
> Is there any limit on number of tasks per stage attempt?
>
>
> *Thanks,*
>
> *Parag*
>
Hi Justin,
I have never seen such a list. I think the area is in heavy development,
especially optimizations for typed operations.
There's a JIRA to somehow find out more about the behavior of Scala code
(the non-Column-based one from your list) but I've seen no activity in this
area. That's why for now Column
Hi,
I guess you're using local mode, which has only one executor, called the
driver. Is my guess correct?
Jacek
On 23 Feb 2017 2:03 a.m., wrote:
> Hello,
>
> Had a question. When I look at the executors tab in Spark UI, I notice
> that some RDD blocks are assigned to the driver as well. Can someone p
Hi,
I've never heard of such a tool before. You could use ANTLR to parse the SQL
(just as Spark SQL does while parsing queries). I think it's a one-hour
project.
Jacek
On 21 Feb 2017 4:44 a.m., "Linyuxin" wrote:
Hi All,
Is there any tool/api to check the sql syntax without running spark job
actuall
Hi,
Yes, it's the "sum of values for all tasks" (it's based on TaskMetrics,
which are accumulators behind the scenes).
Why do you say "it appears that value isn't much of help while debugging"?
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 https://bit.ly/mastering-apache-spark
"Something like that" I've never tried it out myself so I'm only
guessing having a brief look at the API.
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
Hi,
Yes, that's ForeachWriter.
Yes, it works element by element. You're looking for mapPartitions;
ForeachWriter has a partitionId that you could use to implement a
similar thing.
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 https://bit.ly/mastering-apache-spark
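P.S. A rough sketch of the idea (untested; the buffering strategy and names
are mine):

  import org.apache.spark.sql.ForeachWriter

  val writer = new ForeachWriter[String] {
    private var buffer: Vector[String] = _
    // open is called once per partition; partitionId is yours to use
    override def open(partitionId: Long, version: Long): Boolean = {
      buffer = Vector.empty
      true // true = do process this partition
    }
    override def process(value: String): Unit = buffer = buffer :+ value
    // flush the partition's worth of records in one go, mapPartitions-style
    override def close(errorOrNull: Throwable): Unit =
      println(s"flushing ${buffer.size} records")
  }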
better than
window (there were more exchanges in play for windows I reckon).
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
On Tue, Feb 7, 2017 at 10:54 PM, Everett A
Hi,
Could groupBy and withColumn or UDAF work perhaps? I think window could
help here too.
Jacek
On 7 Feb 2017 8:02 p.m., "Everett Anderson"
wrote:
> Hi,
>
> I'm trying to un-explode or denormalize a table like
>
> +---+----+-----+-----+--------+
> |id |name|extra|data |priority|
> +---+----+-----+-----+--------+
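P.S. To make the groupBy idea concrete, a sketch assuming the exploded table
above is df (untested):

  import org.apache.spark.sql.functions.{collect_list, struct}
  import spark.implicits._

  // group back by the identifying columns and re-nest the exploded ones
  val denormalized = df
    .groupBy('id, 'name)
    .agg(collect_list(struct('extra, 'data, 'priority)) as "records")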
Hi,
I know nothing about Spark in GCP, so I'm answering this for pure Spark.
Can you use web UI and Executors tab or a SparkListener?
Jacek
On 7 Feb 2017 5:33 p.m., "Anahita Talebi" wrote:
Hello Friends,
I am trying to run a spark code on multiple machines. To this aim, I submit
a spark code on
On 7 Feb 2017 4:17 a.m., "Mars Xu" wrote:
Hello All,
Some spark sqls will produce one or more jobs, I have 2 questions,
1, How the cc.sql(“sql statement”) divided into one or more jobs ?
It's an implementation detail. You can have zero or more jobs for a single
structured quer
Hi,
Have you considered foreach sink?
Jacek
On 6 Feb 2017 8:39 p.m., "Egor Pahomov" wrote:
> Hi, I'm thinking of using Structured Streaming instead of old streaming,
> but I need to be able to save results to Hive table. Documentation for file
> sink says(http://spark.apache.org/docs/latest/st
Hi,
I may have seen this issue already...
What's the cluster manager? How do you spark-submit?
Jacek
On 7 Feb 2017 7:44 p.m., "dgoldenberg" wrote:
Hi,
Any reason why we might be getting this error? The code seems to work fine
in the non-distributed mode but the same code when run from a Spar
to logback eventually).
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
On Mon, Feb 6, 2017 at 9:06 AM, Mendelson, Assaf
wrote:
> Shading doesn’t help (we a
Hi,
Shading conflicting dependencies?
Jacek
On 5 Feb 2017 3:56 p.m., "Mendelson, Assaf" wrote:
> Hi,
>
> Spark seems to explicitly use log4j.
>
> This means that if I use an alternative backend for my application (e.g.
> ch.qos.logback) I have a conflict.
>
> Sure I can exclude logback but tha
to resurrect it a few times. The other "components", i.e. map/shuffle
stages and partitions/tasks, are handled by Spark itself.
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
Hi,
I'd say the error says it all:
Caused by: NoNodeAvailableException[None of the configured nodes are
available: [{#transport#-1}{XX.XXX.XXX.XX}{XX.XXX.XXX.XX:9300}]]
Jacek
On 3 Feb 2017 7:58 p.m., "Anastasios Zouzias" wrote:
Hi there,
Are you sure that the cluster nodes where the executo
Hi,
➜ spark git:(master) ✗ ./bin/spark-submit whatever || echo $?
Error: Cannot load main class from JAR file:/Users/jacek/dev/oss/spark/whatever
Run with --help for usage help or --verbose for debug output
1
I see exit code 1, and there are other cases that return 1 too.
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
u see "There is an exception in the script
exiting with status 1" printed out to stdout?
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
On Fri, Feb 3,
Hi,
Yes. Forget about SQLContext. It's been merged into SparkSession as of
Spark 2.0 (same for HiveContext).
Long live SparkSession! :-)
Jacek
On 3 Feb 2017 7:48 p.m., "☼ R Nair (रविशंकर नायर)" <
ravishankar.n...@gmail.com> wrote:
All,
In Spark 1.6.0, we used
val jdbcDF = sqlContext.read.
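P.S. For reference, a sketch of the SparkSession-based equivalent (the JDBC
URL and table name are placeholders):

  import org.apache.spark.sql.SparkSession

  val spark = SparkSession.builder.getOrCreate()
  // spark.read replaces sqlContext.read as of Spark 2.0
  val jdbcDF = spark.read
    .format("jdbc")
    .option("url", "jdbc:postgresql://localhost/mydb") // placeholder
    .option("dbtable", "mytable")                      // placeholder
    .load()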
Hi,
I think you have to upgrade to 2.1.0. There have been a few changes related
to that error since.
Jacek
On 29 Jan 2017 9:24 a.m., "Chetan Khatri"
wrote:
Hello Spark Users,
I am getting error while saving Spark Dataframe to Hive Table:
Hive 1.2.1
Spark 2.0.0
Local environment.
Note: Job is getting execut
Hi,
I wonder if you have an adblocker enabled in your browser. Is this the only
version giving you this behavior? Do all Spark jobs have no visualization?
Jacek
On 28 Jan 2017 7:03 p.m., "Md. Rezaul Karim" <
rezaul.ka...@insight-centre.org> wrote:
Hi All,
I am running a Spark job on my local machi
Hi,
How did you start spark-shell?
Jacek
On 28 Jan 2017 11:20 a.m., "Mich Talebzadeh"
wrote:
>
> Hi,
>
> My spark-streaming application works fine when compiled with Maven with
> uber jar file.
>
> With spark-shell this program throws an error as follows:
>
> scala> val dstream = KafkaUtils.cr
Repartition
Jacek
On 26 Jan 2017 6:13 p.m., "Md. Rezaul Karim" <
rezaul.ka...@insight-centre.org> wrote:
> Hi All,
>
> When I run a Spark job on my local machine (having 8 cores and 16GB of
> RAM) on an input data of 6.5GB, it creates 193 parallel tasks and put
> the output into 193 partitions.
Hi,
I think that the only way to get the information about a cached RDD is to
use SparkListener and intercept respective events about cached blocks on
BlockManagers.
Jacek
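P.S. Something along these lines perhaps (a sketch off the top of my head,
untested; sc is your SparkContext):

  import org.apache.spark.scheduler.{SparkListener, SparkListenerBlockUpdated}

  // print every RDD block that lands in (or leaves) a BlockManager
  sc.addSparkListener(new SparkListener {
    override def onBlockUpdated(event: SparkListenerBlockUpdated): Unit = {
      val info = event.blockUpdatedInfo
      if (info.blockId.isRDD)
        println(s"${info.blockId} @ ${info.blockManagerId} -> ${info.storageLevel}")
    }
  })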
On 25 Jan 2017 5:54 a.m., "kumar r" wrote:
Hi,
I have cached some table in Spark Thrift Server. I want to get all cached
Hi,
The files are for shuffle blocks. Where did you find the docs about them?
Jacek
On 25 Jan 2017 8:41 p.m., "kant kodali" wrote:
oh sorry its actually in the documentation. I should just
set spark.worker.cleanup.enabled = true
On Wed, Jan 25, 2017 at 11:30 AM, kant kodali wrote:
> I have
Hi Koert,
map takes a value that has an implicit Encoder to any value that
may or may not have an Encoder in scope. That's why I'm asking about
the map function, to see what it does.
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 https://bit.ly/mastering-apache-spark
Hi,
Can you show the code from map to reproduce the issue? You can create
encoders using Encoders object (I'm using it all over the place for schema
generation).
Jacek
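P.S. For example (the case class is made up; this is the pattern I use for
schema generation):

  import org.apache.spark.sql.{Encoder, Encoders}

  case class Person(id: Long, name: String)
  // Encoders.product derives an encoder (and thus a schema) for a case class
  val enc: Encoder[Person] = Encoders.product[Person]
  println(enc.schema.treeString)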
On 25 Jan 2017 10:19 p.m., "Koert Kuipers" wrote:
> i often run into problems like this:
>
> i need to write a Dataset[T] => D
Hi,
Shooting in the dark... it's executed on executors (it's old RDD-based
tech, so not many of the extra optimizations Spark SQL has now).
Can you show the code? I'm scared to hear that you're trying to broadcast
inside a transformation, which I'd believe is impossible.
Jacek
On 26 Jan 2017 12:18
Hi,
My suggestion is to use StreamingListener to track metrics and react
appropriately.
http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.streaming.scheduler.StreamingListener
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 https://bit.ly/mastering-apache-spark
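P.S. A bare-bones sketch (untested; ssc stands for your StreamingContext):

  import org.apache.spark.streaming.scheduler.{StreamingListener,
    StreamingListenerBatchCompleted}

  ssc.addStreamingListener(new StreamingListener {
    // react once a batch finishes, e.g. alert on a growing processing delay
    override def onBatchCompleted(batch: StreamingListenerBatchCompleted): Unit =
      batch.batchInfo.processingDelay.foreach(d => println(s"batch took $d ms"))
  })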
s/api/java/util/concurrent/ExecutorService.html
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
On Tue, Jan 24, 2017 at 10:48 PM, Shiyuan wrote:
> Hi spark us
Executors are "dumb", i.e. they execute TaskRunners for tasks and...that's it.
Your logic should be on the driver that can intercept events
and...trigger cleanup.
I don't think there's another way to do it.
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Thanks for sharing! A very interesting reading indeed.
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
On Fri, Jan 20, 2017 at 10:17 PM, Morten Hornbech wrote
Hi,
I'd be very interested in how you figured it out. Mind sharing?
Jacek
On 18 Jan 2017 9:51 p.m., "mhornbech" wrote:
> For anyone revisiting this at a later point, the issue was that Spark 2.1.0
> upgrades netty to version 4.0.42 which is not binary compatible with
> version
> 4.0.37 used by
Hi,
(redirecting to users as it has nothing to do with Spark project
development)
Monitor jobs and stages using SparkListener and submit cleanup jobs where a
condition holds.
Jacek
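P.S. Roughly like this (a sketch; the condition and the cleanup action are up
to you, and sc is your SparkContext):

  import org.apache.spark.scheduler.{SparkListener, SparkListenerJobEnd}

  sc.addSparkListener(new SparkListener {
    override def onJobEnd(jobEnd: SparkListenerJobEnd): Unit = {
      // when your condition holds, submit the cleanup job from the driver here
      println(s"job ${jobEnd.jobId} finished at ${jobEnd.time}")
    }
  })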
On 20 Jan 2017 3:57 a.m., "Keith Chapman" wrote:
> Hi ,
>
> Is it possible for an executor (or slave) to know wh
?
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
On Sun, Jan 15, 2017 at 11:48 PM, ayan guha wrote:
> archive.apache.org will always have all the
Hi Phadnis,
I found this in
http://spark.apache.org/docs/latest/streaming-kafka-0-10-integration.html:
> This version of the integration is marked as experimental, so the API is
> potentially subject to change.
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 https://bit.ly/mastering-apache-spark
Hi,
A possible workaround: use a SparkListener and save the results to a custom
sink.
After all, the web UI is a mere bag of SparkListeners + excellent
visualizations.
Jacek
On 3 Jan 2017 4:14 p.m., "Joseph Naegele"
wrote:
Hi all,
Is there any way to observe Storage history in Spark, i.e. which RD
FYI, option works with boolean literals directly.
Jacek
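P.S. Something like this (the path is a placeholder):

  // no need to quote the booleans as strings
  val df = spark.read
    .option("header", true)
    .option("inferSchema", true)
    .csv("/path/to/csvs") // placeholder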
On 30 Dec 2016 9:32 p.m., "Palash Gupta"
wrote:
> Hi,
>
> If you want to load from csv, you can use below procedure. Of course you
> need to define spark context first. (Given example to load all csv under a
> folder, you can use specific n
Hi Yan,
I was surprised the first time I noticed rxin stepped back and a
new release manager stepped in. Congrats on your first ANNOUNCE!
I can only expect even more great stuff coming into Spark from the dev
team now that Reynold has spared some time 😉
Can't wait to read the changes...
Jacek
Thanks a LOT, Michael!
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
On Mon, Dec 26, 2016 at 10:04 PM, Michael Gummelt
wrote:
> In fine-grained mode (which
Hi Michael,
That caught my attention...
Could you please elaborate on "elastically grow and shrink CPU usage"
and how it really works under the covers? It seems that CPU usage is
just a "label" for an executor on Mesos. Where's this in the code?
Regards,
Jacek Laskowski
Hi David,
Can you use persist instead, perhaps with some other StorageLevel? It
works with the Spark 2.2.0-SNAPSHOT I use; I don't remember how it
worked back then in 1.6.2.
You could also check the Executors tab and see how many blocks you
have in their BlockManagers.
Regards,
Jacek Laskowski
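P.S. A sketch of what I mean (rdd stands for whatever you were caching; for
RDDs, cache is just persist(StorageLevel.MEMORY_ONLY)):

  import org.apache.spark.storage.StorageLevel

  // a level that can spill to disk instead of dropping blocks
  val cached = rdd.persist(StorageLevel.MEMORY_AND_DISK_SER)
  cached.count() // materializes the blocks; then check the Executors tab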
Hi,
What's the entire spark-submit + Spark properties you're using?
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
On Fri, Dec 2, 2016 at 6:28 P
ad pulling and in the master
> spark UI I see the executor thread id is showing as 0 and that’s it.
>
>
>
> Thanks,
>
> Gabe
>
>
>
>
>
> *From: *Jacek Laskowski
> *Date: *Friday, December 2, 2016 at 11:47 AM
> *To: *Gabriel Perez
> *Cc: *user
>
Hi,
How many partitions does the topic have? How do you check how many
executors read from the topic?
Jacek
On 2 Dec 2016 2:44 p.m., "gabrielperez2484" wrote:
Hello,
I am trying to perform a POC between Kafka 0.10 and Spark 2.0.2. Currently I
am running into an issue, where only one executor
Hi,
Interesting, but I personally would opt for withColumn since it'd be
less to type (and also consistent with ticks (')), as follows:
df.withColumn("arrayItem", explode('myArray))
(Spark SQL made my SQL developer's life so easy these days :))
Regards,
Jacek Laskowski
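P.S. A complete example for the record (spark-shell, Spark 2.0+):

  import org.apache.spark.sql.functions.explode
  import spark.implicits._

  val df = Seq((1, Seq("a", "b"))).toDF("id", "myArray")
  // one output row per array element
  df.withColumn("arrayItem", explode('myArray)).show()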
Hi Luciano,
Mind sharing why to have a structured streaming source/sink for Akka
if Kafka's available and Akka Streams has a Kafka module? #curious
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
https://github.com/apache/spark/blob/master/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala#L163-L164
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
On Wed, Nov 9, 2016
tial_sum(cast(id#15 as bigint))])
+- *Project [_1#12 AS id#15, (_1#12 % 3) AS ID % 3#681]
+- *Filter isnotnull((_1#12 % 3))
+- LocalTableScan [_1#12, _2#13]
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 https://bit.ly/mastering-apache-spark
Hi,
Remove
.setMaster("spark://spark-437-1-5963003:7077")
.set("spark.driver.host", "11.104.29.106")
and start over.
Can you also run the following command to check out Spark Standalone:
run-example --master spark://spark-437-1-5963003:7077 SparkPi
Regards,
Hi,
How did you install Spark 1.6? It's usually as simple as rm -rf
$SPARK_1.6_HOME, but it really depends on how you installed it in the
first place.
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
You've got two Spark runtimes up that may or may not contribute to the issue.
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
On Sun, Sep 25, 2016 at 8:36 AM, v
Hi Everett,
I'd bet on --driver-class-path (but didn't check that out myself).
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
On Wed, Sep 21, 2016 a
Hi Janardhan,
What's the command to build the project (sbt package or sbt assembly)?
What's the command you execute to run the application?
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
On Mon, Sep 19, 2016 at 11:36 AM, Mich Talebzadeh
wrote:
> Spark UI on port 4040 by default
That's exactly *a* SparkListener + web UI :)
Jacek
Hi Cristina,
http://blog.jaceklaskowski.pl/spark-workshop/slides/08_Monitoring_using_SparkListeners.html
http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.scheduler.SparkListener
Let me know if you've got more questions.
Regards,
Jacek Laskowski
SparkListener perhaps?
Jacek
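P.S. A quick sketch of the idea (untested; naive counters for illustration
only, and sc is your SparkContext):

  import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskEnd}

  var shuffleRead = 0L
  var shuffleWrite = 0L
  sc.addSparkListener(new SparkListener {
    override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = {
      val m = taskEnd.taskMetrics
      if (m != null) { // metrics can be missing for failed tasks
        shuffleRead += m.shuffleReadMetrics.totalBytesRead
        shuffleWrite += m.shuffleWriteMetrics.bytesWritten
      }
    }
  })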
On 15 Sep 2016 1:41 p.m., "Cristina Rozee" wrote:
> Hello,
>
> I am running a spark application and I would like to know the total amount
> of shuffle data (read + write ) so could anyone let me know how to get this
> information?
>
> Thank you
> Cristina.
>
Hi Janardhan,
Can you share the code that you execute? What's the command? Mind
sharing the complete project on github?
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
Hi Advait,
It's due to https://issues.apache.org/jira/browse/SPARK-15565.
See http://stackoverflow.com/a/38945867/1305344 for a solution (that's
spark.sql.warehouse.dir away). Upvote if it works for you. Thanks!
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
s/TODOs in my Spark notes...
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
On Wed, Sep 14, 2016 at 4:09 PM, Mich Talebzadeh
wrote:
> Hi Ashok,
>
> I am
d to
handle the single Spark application.
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
On Sun, Sep 11, 2016 at 11:18 AM, Vladimir Tretyakov
wrote:
> Hello Ja
Hi Muhammad,
Either sep or delimiter should work fine.
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
On Sat, Sep 10, 2016 at 10:42 AM, Muhammad Asif Abbasi
eve got its own file format and support @ spark-packages.
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
On Sat, Sep 10, 2016 at 8:00 AM, Mich Talebzadeh
wrote:
https://issues.apache.org/jira/browse/SPARK.
Have you run into any issues with CSV and Java? Share the code.
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
On Sat
Hi,
That's correct: one app, one web UI. Open 4041 and you'll see the other app.
Jacek
On 9 Sep 2016 11:53 a.m., "Vladimir Tretyakov" <
vladimir.tretya...@sematext.com> wrote:
> Hello again.
>
> I am trying to play with Spark version "2.11-2.0.0".
>
> Problem that REST API and UI shows me differ
Hi,
https://issues.apache.org/jira/browse/SPARK-17363?jql=project%20%3D%20SPARK%20AND%20fixVersion%20%3D%202.1.0%20AND%20component%20%3D%20MLlib
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
es in onExecutorMetricsUpdate.
[1]
http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.scheduler.SparkListener
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
scala> s"I'm using $spark in ${spark.version}"
res0: String = I'm using org.apache.spark.sql.SparkSession@1fc1c7e in
2.1.0-SNAPSHOT
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
Hi Mich,
This is Scala's string interpolation, which allows replacing $-prefixed
expressions with their values.
It's what the cool kids use in Scala to do templating and concatenation 😁
Jacek
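P.S. For example:

  val table = "users"
  val minAge = 21
  // $-expressions are evaluated and spliced into the result at runtime
  val sql = s"SELECT * FROM $table WHERE age >= ${minAge + 1}"
  // sql: String = SELECT * FROM users WHERE age >= 22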
On 23 Aug 2016 9:21 a.m., "Mich Talebzadeh"
wrote:
> What is --> s below before the text of sql?
>
>
may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
> On 26 August 2016 at 23:21, Jacek Laskowski wrote:
>
> Hi Mich,
>