ble (as it's
on a stable HDFS file system, not on an ephemeral executor). In either case,
the lineage should be the same = cut.
Pozdrawiam,
Jacek Laskowski
https://about.me/JacekLaskowski
"The Internals Of" Online Books <https://books.japila.pl/>
Follow me on https://twitter.com/jaceklaskowski
ay back.
I wish someone with more skills in this area would chime in...
Pozdrawiam,
Jacek Laskowski
https://about.me/JacekLaskowski
"The Internals Of" Online Books <https://books.japila.pl/>
Follow me on https://twitter.com/jaceklaskowski
<https://twitter.com/jaceklaskowski>
Hi,
Start with DataStreamWriter.foreachBatch.
Pozdrawiam,
Jacek Laskowski
https://about.me/JacekLaskowski
"The Internals Of" Online Books <https://books.japila.pl/>
Follow me on https://twitter.com/jaceklaskowski
<https://twitter.com/jaceklaskowski>
On Thu, Jan 7, 2
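A minimal sketch of the foreachBatch approach (the rate source and the output path below are stand-ins for whatever the original pipeline used):

import org.apache.spark.sql.{DataFrame, SparkSession}

val spark = SparkSession.builder().master("local[*]").getOrCreate()

// rate is a built-in source that just generates rows; it stands in for the real stream
val stream = spark.readStream.format("rate").load()

// foreachBatch hands over each micro-batch as a regular DataFrame,
// so batch-only sinks and logic become available
val query = stream.writeStream
  .foreachBatch { (batch: DataFrame, batchId: Long) =>
    batch.write.mode("append").parquet(s"/tmp/out/batch-$batchId")
  }
  .start()

query.awaitTermination()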
Hi,
Can you post the whole message? I'm trying to find what might be causing
it. A small reproducible example would be of help too. Thank you.
Pozdrawiam,
Jacek Laskowski
https://about.me/JacekLaskowski
"The Internals Of" Online Books <https://books.japila.pl/&g
a.schema.names: _*)
.write
.insertInto(sqlView)
In summary, you should report this to JIRA, but don't expect it to get fixed
other than by catching this case just to throw the exception
from ResolveRelations: "Inserting into a view is not allowed".
Unless I'm mistaken...
Pozdrawiam,
Hey Yurii,
> which is unavailable from executors.
Register it on the driver and use accumulators on executors to update the
values (on the driver)?
Pozdrawiam,
Jacek Laskowski
https://about.me/JacekLaskowski
"The Internals Of" Online Books <https://books.japila.pl/>
n of a query is used to look up
any cached queries.
Again, I'm not really sure and if I'd have to answer it (e.g. as part of an
interview) I'd say nothing would be shared / re-used.
Pozdrawiam,
Jacek Laskowski
https://about.me/JacekLaskowski
"The Internals Of" Online Bo
ed and also forward that to ElasticSearch via log4j
for monitoring
Think SparkListener API would help here too.
Pozdrawiam,
Jacek Laskowski
https://about.me/JacekLaskowski
"The Internals Of" Online Books <https://books.japila.pl/>
Follow me on https://twitter.com/jaceklaskowski
pment IMHO).
Pozdrawiam,
Jacek Laskowski
https://about.me/JacekLaskowski
"The Internals Of" Online Books <https://books.japila.pl/>
Follow me on https://twitter.com/jaceklaskowski
<https://twitter.com/jaceklaskowski>
On Wed, Jan 20, 2021 at 2:44 PM Marco Firrincieli
wrote:
Hi Marco,
A Scala dev here.
In short: yet another reason against Python :)
Honestly, I've got no idea why the code gives this output. I ran it with
3.1.1-rc1 and got the very same results. Hoping pyspark/python devs will
chime in and shed more light on this.
Pozdrawiam,
Jacek Laskowski
life as a container of a driver pod.
There's no point using cluster deploy mode...ever. Makes sense?
Pozdrawiam,
Jacek Laskowski
https://about.me/JacekLaskowski
"The Internals Of" Online Books <https://books.japila.pl/>
Follow me on https://twitter.com/jaceklaskowski
<https://twitter.com/jaceklaskowski>
Hi,
I'd look at stages and jobs as it's possible that the only task running is
the missing one in a stage of a job. Just guessing...
Pozdrawiam,
Jacek Laskowski
https://about.me/JacekLaskowski
"The Internals Of" Online Books <https://books.japila.pl/>
Follow me on https://twitter.com/jaceklaskowski
Hi Brett,
No idea why it happens, but got curious about this "Cores" column being 0.
Is this always the case?
Pozdrawiam,
Jacek Laskowski
https://about.me/JacekLaskowski
"The Internals Of" Online Books <https://books.japila.pl/>
Follow me on https://twitter.com/jaceklaskowski
Hi,
Can you use the console sink and make sure that the pipeline shows some
progress?
Pozdrawiam,
Jacek Laskowski
https://about.me/JacekLaskowski
"The Internals Of" Online Books <https://books.japila.pl/>
Follow me on https://twitter.com/jaceklaskowski
<https://twitter.com/jaceklaskowski>
Hi,
Never heard of it (and have once been tasked to explore a similar use
case). I'm curious how you'd like it to work? (no idea how Hive does this
either)
Pozdrawiam,
Jacek Laskowski
https://about.me/JacekLaskowski
"The Internals Of" Online Books <https://books.j
Hi Filip,
Care to share the code behind "The only thing I found so far involves using
forEachBatch and manually updating my aggregates."?
I'm not completely sure I understand your use case and hope the code could
shed more light on it. Thank you.
Pozdrawiam,
Jacek Laskowski
case they're
not deleted as they simply wait forever. I might be mistaken here though.
Which property does "this timeout of 60 sec." refer to?
Pozdrawiam,
Jacek Laskowski
https://about.me/JacekLaskowski
"The Internals Of" Online Books <https://books.japila.pl/&
Hi,
> as Executors terminates after their work completes.
--conf spark.kubernetes.executor.deleteOnTermination=false ?
Pozdrawiam,
Jacek Laskowski
https://about.me/JacekLaskowski
"The Internals Of" Online Books <https://books.japila.pl/>
Follow me on https://twitter.com/jaceklaskowski
Hi,
On GCP I'd go for buckets in Google Storage. Not sure how reliable it is in
production deployments though. Only demo experience here.
Pozdrawiam,
Jacek Laskowski
https://about.me/JacekLaskowski
"The Internals Of" Online Books <https://books.japila.pl/>
Follow me on https://twitter.com/jaceklaskowski
wski.github.io/spark-kubernetes-book/demo/persistentvolumeclaims/
Please help. Thank you!
Pozdrawiam,
Jacek Laskowski
https://about.me/JacekLaskowski
"The Internals Of" Online Books <https://books.japila.pl/>
Follow me on https://twitter.com/jaceklaskowski
<https://twitter.com/jaceklaskowski>
Hi,
I think I found it. I should be using OnDemand claim name so it gets
replaced to be unique per executor (?)
Pozdrawiam,
Jacek Laskowski
https://about.me/JacekLaskowski
"The Internals Of" Online Books <https://books.japila.pl/>
Follow me on https://twitter.com/jaceklaskowski
a3a4954291b74f8c8/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/Source.scala#L61
[2]
https://github.com/apache/spark/blob/053dd858d38e6107bc71e0aa3a4954291b74f8c8/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/Source.scala#L35
Pozdrawiam,
Jacek Laskowski
https://about.me/JacekLaskowski
many HTTP calls are there under the
covers? How can I find that out for GCS?
Thank you for any help you can provide. Many thanks, my friends :)
[1] https://stackoverflow.com/q/66933229/1305344
Pozdrawiam,
Jacek Laskowski
https://about.me/JacekLaskowski
"The Internals Of" Online Books <
"safe" and "safety" meanings.
Pozdrawiam,
Jacek Laskowski
https://about.me/JacekLaskowski
"The Internals Of" Online Books <https://books.japila.pl/>
Follow me on https://twitter.com/jaceklaskowski
<https://twitter.com/jaceklaskowski>
On Sat, Apr 3,
Hi Bartosz,
This is not a question about whether the data source supports fixed or
user-defined schema but what schema to use when requested for a streaming
batch in Source.getBatch.
Pozdrawiam,
Jacek Laskowski
https://about.me/JacekLaskowski
"The Internals Of" Online Bo
Hi Vaquar,
Thanks a lot! Accepted as the answer (yet there was the other answer that
was very helpful too). Tons of reading ahead to understand it more.
That once again makes me feel that Hadoop MapReduce experience would help a
great deal (and I've got none).
Pozdrawiam,
Jacek Laskowski
hat's what happens in Kafka Streams too
Pozdrawiam,
Jacek Laskowski
https://about.me/JacekLaskowski
"The Internals Of" Online Books <https://books.japila.pl/>
Follow me on https://twitter.com/jaceklaskowski
<https://twitter.com/jaceklaskowski>
On Sun, Apr 4, 2021
Hi,
The easiest (but perhaps not necessarily the most flexible) is simply to
use two different versions of spark-submit script with the env var set to
two different values. Have you tried it yet?
Pozdrawiam,
Jacek Laskowski
https://about.me/JacekLaskowski
"The Internals Of" On
Big shout-out to you, Dongjoon! Thank you.
Pozdrawiam,
Jacek Laskowski
https://about.me/JacekLaskowski
"The Internals Of" Online Books <https://books.japila.pl/>
Follow me on https://twitter.com/jaceklaskowski
<https://twitter.com/jaceklaskowski>
On Wed, Jun 2, 20
Hi Pedro,
No idea what might be causing it. Do you perhaps have some code to
reproduce it locally?
Pozdrawiam,
Jacek Laskowski
https://about.me/JacekLaskowski
"The Internals Of" Online Books <https://books.japila.pl/>
Follow me on https://twitter.com/jaceklaskowski
<https://twitter.com/jaceklaskowski>
k to avoid OOMEs).
Pozdrawiam,
Jacek Laskowski
https://about.me/JacekLaskowski
"The Internals Of" Online Books <https://books.japila.pl/>
Follow me on https://twitter.com/jaceklaskowski
<https://twitter.com/jaceklaskowski>
Hi Bobby,
What a great summary of what happens behind the scenes! Enjoyed every
sentence!
"The default shuffle implementation will always write out to disk." <--
that's what I wasn't sure about the most. Thanks again!
/me On digging deeper...
Pozdrawiam,
Jacek Laskowski
w are the above different from yours?
Pozdrawiam,
Jacek Laskowski
https://about.me/JacekLaskowski
"The Internals Of" Online Books <https://books.japila.pl/>
Follow me on https://twitter.com/jaceklaskowski
<https://twitter.com/jaceklaskowski>
On Thu, Aug 19, 2021 at 5:
rrors
coming from broadcast joins perhaps?
Pozdrawiam,
Jacek Laskowski
https://about.me/JacekLaskowski
"The Internals Of" Online Books <https://books.japila.pl/>
Follow me on https://twitter.com/jaceklaskowski
<https://twitter.com/jaceklaskowski>
On Mon, Aug 30, 2021 at
a thought but wanted to share as I think it's worth investigating.
Pozdrawiam,
Jacek Laskowski
https://about.me/JacekLaskowski
"The Internals Of" Online Books <https://books.japila.pl/>
Follow me on https://twitter.com/jaceklaskowski
<https://twitter.com/jaceklaskowski>
part of Spark?
You should not really be doing such risky config changes (unless you've got
no other choice and you know what you're doing).
Pozdrawiam,
Jacek Laskowski
https://about.me/JacekLaskowski
"The Internals Of" Online Books <https://books.japila.pl/>
g and am really curious (not implying that one is better or
worse than the other(s)).
Pozdrawiam,
Jacek Laskowski
https://about.me/JacekLaskowski
"The Internals Of" Online Books <https://books.japila.pl/>
Follow me on https://twitter.com/jaceklaskowski
<https://twitter.com/jaceklaskowski>
Hi Raj,
Do you want to do the following?
spark.read.format("prometheus").load...
I haven't heard of such a data source / format before.
What would you like it for?
Pozdrawiam,
Jacek Laskowski
https://about.me/JacekLaskowski
"The Internals Of" Online Books
Yoohoo! Thanks Yuming for driving this release. A tiny step for Spark, a
huge one for my clients (who are still on 3.2.1 or even older :))
Pozdrawiam,
Jacek Laskowski
https://about.me/JacekLaskowski
"The Internals Of" Online Books <https://books.japila.pl/>
Follow me on https://twitter.com/jaceklaskowski
/github.com/apache/spark/blob/e60ce3e85081ca8bb247aeceb2681faf6a59a056/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala#L91
Pozdrawiam,
Jacek Laskowski
"The Internals Of" Online Books <https://books.japila.pl/>
Follow me on https://twitter.com/jaceklaskowski
Hi,
You could use QueryExecutionListener or Spark listeners to intercept query
execution events and extract whatever is required. That's what the web UI does
(as it's simply a bunch of SparkListeners --> https://youtu.be/mVP9sZ6K__Y
;-)).
Pozdrawiam,
Jacek Laskowski
"The In
reenshots won't give you that level
of detail. You'd have to intercept execution events and correlate them. Not
an easy task, yet doable. HTH.
Pozdrawiam,
Jacek Laskowski
"The Internals Of" Online Books <https://books.japila.pl/>
Follow me on https://twitter.com/jaceklaskowski
d and used properly using the custom catalog impl.
HTH
Pozdrawiam,
Jacek Laskowski
"The Internals Of" Online Books <https://books.japila.pl/>
Follow me on https://twitter.com/jaceklaskowski
<https://twitter.com/jaceklaskowski>
On Fri, Apr 14, 2023 at 2:10 PM 许新浩 <948
l/core/src/main/scala/org/apache/spark/sql/execution/ui/SQLListener.scala#L60
[4]
https://github.com/apache/spark/blob/c124037b97538b2656d29ce547b2a42209a41703/sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SQLTab.scala#L24
Pozdrawiam,
Jacek Laskowski
"The Internals Of&
Hi Karthick,
Sorry to say it but there's not enough "data" to help you. There should be
something more above or below this exception snippet you posted that could
pinpoint the root cause.
Pozdrawiam,
Jacek Laskowski
"The Internals Of" Online Books <https://bo
Hi,
Thanks a lot, Guru Medasani, for such an excellent theory-rich intro to
MLlib! I wish I found only such emails in my mailbox.
Sorry. Couldn't resist since I've just started with MLlib and your response
has resonated so well with my initial experience.
Thanks!
Jacek
06.03.2016 6:55 AM "Guru
What about sum?
Jacek
06.03.2016 7:28 AM "Angel Angel" wrote:
> Hello,
> I have one table and 2 fields in it
> 1) item_id and
> 2) count
>
>
>
> i want to add the count field as per item (means group the item_ids)
>
> example
> Input
> item_ID Count
> 500 2
> 200 6
> 500 4
> 100 3
> 200 6
>
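A minimal sketch of the groupBy-and-sum being hinted at (using the modern SparkSession API rather than the 1.x API of the thread):

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.sum

val spark = SparkSession.builder().master("local[*]").getOrCreate()
import spark.implicits._

// The sample data from the question
val input = Seq((500, 2), (200, 6), (500, 4), (100, 3), (200, 6))
  .toDF("item_id", "count")

// Group the item_ids and add up their counts
input.groupBy("item_id").agg(sum("count").as("total_count")).show()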
Hi Praveen,
I've spent a few hours on the changes related to streaming dataframes
(included in the SPARK-8360) and concluded that it's currently only
possible to read.stream(), but not write.stream() since there are no
streaming Sinks yet.
Pozdrawiam,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Hi Praveen,
I don't really know. I think TD or Michael should know as they were
personally involved in the task (as far as I could figure out from
the JIRA and the changes). Ping people on the JIRA so they notice your
question(s).
Pozdrawiam,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
What about write.save(file)?
P.s. I'm new to Spark MLlib.
11.03.2016 4:57 AM "Shishir Anshuman"
wrote:
> hello,
>
> I am new to Apache Spark and would like to get the Recommendation output
> of the ALS algorithm in a file.
> Please suggest me the solution.
>
> Thank you
>
>
>
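A minimal sketch of that suggestion with the spark.ml ALS API (the ratings data and output path are made up; recommendForAllUsers needs Spark 2.2+):

import org.apache.spark.ml.recommendation.ALS
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").getOrCreate()
import spark.implicits._

// Hypothetical (user, item, rating) triples
val ratings = Seq((0, 10, 4.0f), (0, 20, 1.0f), (1, 10, 5.0f))
  .toDF("user", "item", "rating")

val model = new ALS()
  .setUserCol("user")
  .setItemCol("item")
  .setRatingCol("rating")
  .fit(ratings)

// Persist the top-10 recommendations per user to a file (Parquet by default)
model.recommendForAllUsers(10).write.save("recommendations.parquet")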
Hi,
Use the names of the datasets, not $, i.e. a("edid").
Jacek
11.03.2016 6:09 AM "박주형" wrote:
> Hi. I want to join two DataSet. but below stderr is shown
>
> 16/03/11 13:55:51 WARN ColumnName: Constructing trivially true equals
> predicate, ''edid = 'edid. Perhaps you need to use aliases.
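A minimal sketch of the suggestion (the datasets and column names are made up): referencing each column through its parent dataset avoids the trivially true 'edid = 'edid predicate:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").getOrCreate()
import spark.implicits._

// Two hypothetical datasets sharing an "edid" column
val a = Seq((1, "x"), (2, "y")).toDF("edid", "left_val")
val b = Seq((1, "X"), (2, "Y")).toDF("edid", "right_val")

// a("edid") and b("edid") are unambiguous, unlike $"edid" === $"edid"
a.join(b, a("edid") === b("edid")).show()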
Hi,
How do you check which executor is used? Can you include a screenshot of
the master's webUI with workers?
Jacek
11.03.2016 6:57 PM "Darin McBeath" wrote:
> I've run into a situation where it would appear that foreachPartition is
> only running on one of my executors.
>
> I have a small
e other
> executor). The executor that was used in the foreachPartition call works
> fine and doesn't experience issue. But, because the other executor is
> failing on every request the job dies.
>
> Darin.
>
>
>
> From: Jacek Laskowski
> T
Hi,
Why do you use Maven and not sbt for Scala?
Can you show the entire pom.xml and the command to execute the app?
Jacek
11.03.2016 7:33 PM "vasu20" wrote:
> Hi
>
> Any help appreciated on this. I am trying to write a Spark program using
> IntelliJ. I get a run time error as soon as new Sp
Hi Tristan,
Mind sharing the relevant code? I'd like to learn the way you use
Transformer to do so. Thanks!
Jacek
11.03.2016 7:07 PM "Tristan Nixon" wrote:
> I have a similar situation in an app of mine. I implemented a custom ML
> Transformer that wraps the Jackson ObjectMapper - this giv
Just a guess...flatMap?
Jacek
11.03.2016 7:46 PM "Stefan Panayotov" wrote:
> Hi,
>
> I have a problem that requires me to go through the rows in a DataFrame
> (or possibly through rows in a JSON file) and conditionally add rows
> depending on a value in one of the columns in each existing r
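A minimal sketch of the flatMap idea (schema and condition are made up): each input row can yield one or more output rows:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").getOrCreate()
import spark.implicits._

// Hypothetical rows: duplicate a row whenever its flag is set
val df = Seq(("a", true), ("b", false)).toDF("id", "flag")

val expanded = df.as[(String, Boolean)].flatMap {
  case (id, true)  => Seq((id, true), (id + "-copy", true))
  case (id, false) => Seq((id, false))
}
expanded.show()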
Hi
It could also be conf/spark-defaults.conf.
Jacek
11.03.2016 8:07 PM "Cesar Flores" napisał(a):
>
> Right now I know of three different things to pass property parameters to
> the Spark Context. They are:
>
>- A) Inside a SparkConf object just before creating the Spark Context
>- B) D
> shade
>
> ${project.artifactId}-${project.version}-with-dependencies
>
Hi,
For jars, use spark-submit --jars. Dunno about .so files. Could those work
through --jars?
Jacek
11.03.2016 8:07 PM "prateek arora" wrote:
> Hi
>
> I have multiple node cluster and my spark jobs depend on a native
> library (.so files) and some jar files.
>
> Can some one please explain what ar
Hi,
Just a side question: why do you convert DataFrame to RDD? It's like
driving backwards (possible, but ineffective and dangerous at times).
P.S. I'd even go for Dataset.
Jacek
18.03.2016 5:20 PM "Bauer, Robert" wrote:
> I have data that I pull in using a sql context and then I convert t
Hi,
Why don't you use Datasets? You'd cut the number of getStrings and
it'd read nicer to your eyes. Also, doing such transformations would
*likely* be easier.
P.S. Please gist your example so it can be fixed.
Pozdrawiam,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark http://bit.ly/mastering-apache-spark
Hi,
You may want to use SparkListener [1] (as the web UI does) and listen to
SparkListenerExecutorAdded and SparkListenerExecutorRemoved.
[1]
http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.scheduler.SparkListener
Pozdrawiam,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
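A minimal sketch of such a listener (the println bodies are placeholders):

import org.apache.spark.scheduler.{SparkListener, SparkListenerExecutorAdded, SparkListenerExecutorRemoved}
import org.apache.spark.{SparkConf, SparkContext}

// Tracks executors as they come and go, just like the web UI does
class ExecutorTracker extends SparkListener {
  override def onExecutorAdded(e: SparkListenerExecutorAdded): Unit =
    println(s"Executor added: ${e.executorId} on ${e.executorInfo.executorHost}")
  override def onExecutorRemoved(e: SparkListenerExecutorRemoved): Unit =
    println(s"Executor removed: ${e.executorId} (${e.reason})")
}

val sc = new SparkContext(new SparkConf().setMaster("local[*]").setAppName("tracker"))
sc.addSparkListener(new ExecutorTracker)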
scala> left.join(right, Seq("_1")).show
+---+---+---+
| _1| _2| _2|
+---+---+---+
|  1|  a|  a|
|  2|  b|  b|
+---+---+---+

scala> left.join(right, left("_1") === right("_1")).show
+---+---+---+---+
| _1| _2| _1| _2|
+---+---+---+---+
|  1|  a|  1|  a|
|  2|  b|  2|  b|
+---+---+---+---+
Hi,
How do you run the pipeline? Do you assembly or package? Is this on
local, Spark Standalone, or another cluster manager? What's the build
configuration?
Pozdrawiam,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
error: missing parameter type for expanded function
((x$1) => x$1.id)
ds.select(_.id).show
^
Is this supposed to work in Spark 2.0 (today's build)?
BTW, why is Seq(Text(0, "hello"), Text(1, "world")).as[Text] not possible?
Pozdrawiam,
Jacek Laskowski
|  1|swiecie|  1|three|  5|
+---+-------+---+-----+---+

scala> df.join(nums, df("id") === nums("id")).withColumn("TEXT", lit(5)).show
+---+----+---+----+
| id|TEXT| id|TEXT|
+---+----+---+----+
|  0|   5|  0|   5|
|  1|   5|  1|   5|
+---+----+---+----+
Pozdrawiam,
Jacek Laskowski
Hi,
Thanks Ted.
It means that it's not only possible to rename a column using
withColumnRenamed, but also replace the content of a column (in one
shot) using withColumn with an existing column name. I can live with
that :)
Pozdrawiam,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
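A minimal sketch of that behaviour (data made up): withColumn with an existing column name replaces the column's content in one shot:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.lit

val spark = SparkSession.builder().master("local[*]").getOrCreate()
import spark.implicits._

val df = Seq((0, "hello"), (1, "world")).toDF("id", "text")

// Same name as an existing column, so the column is replaced, not added
df.withColumn("text", lit(5)).show()
// +---+----+
// | id|text|
// +---+----+
// |  0|   5|
// |  1|   5|
// +---+----+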
Hi Ted,
Sure! It works with map, but not with select. Wonder if it's by design
or...will soon be fixed? Thanks again for your help.
Pozdrawiam,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
On Thu, Mar 31, 2016 at 5:47 PM, Jacek Laskowski wrote:
> It means that it's not only possible to rename a column using
> withColumnRenamed, but also replace the content of a column (in one
> shot) using withColumn with an existing column name. I can live with
> that :)
Hi,
No
de.
Pozdrawiam,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
On Sat, Apr 9, 2016 at 7:51 PM, Buntu Dev wrote:
> I'm running this motif pattern against 1.5M vertice
Hi,
Not that I can help much with deployment to Mesos, but can you describe
your Mesos/Marathon setup? What's the Mesos cluster dispatcher?
Jacek
18.04.2016 12:54 PM "Joao Azevedo" wrote:
> Hi!
>
> I'm trying to submit Spark applications to Mesos using the 'cluster'
> deploy mode. I'm using
Hi Arun,
My bet is...https://spark-summit.org/2016 :)
Pozdrawiam,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
On Thu, Apr 28, 2016 at 1:43 PM, Arun Patel wrote:
> A sm
the logs in YARN. Go to
localhost:8088/cluster/apps and see the app's logs.
Pozdrawiam,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
On Mon, May 9, 2016 at 9:45 AM, A
Hi,
I'd say "one per classloader".
Pozdrawiam,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
On Mon, May 9, 2016 at 10:16 AM, praveen S wrote:
> Hi,
>
>
ble deploy-mode).
Also, deploy-mode client is the default deploy mode so you may safely remove it.
Pozdrawiam,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
On Sun, May 15,
On Sun, May 15, 2016 at 5:50 PM, Richard Siebeling wrote:
> I'm getting the following errors running SparkPi on a clean just compiled
> and checked Mesos 0.29.0 installation with Spark 1.6.1
>
> 16/05/15 23:05:52 ERROR TaskSchedulerImpl: Lost executor
> e23f2d53-22c5-40f0-918d-0d73805fdfec-S0/0 o
On Sun, May 15, 2016 at 8:19 AM, Mail.com wrote:
> In all that I have seen, it seems each job has to be given the max resources
> allowed in the cluster.
Hi,
I'm fairly sure it was because FIFO scheduling mode was used. You
could change it to FAIR and make some adjustments.
https://spark.apac
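A minimal sketch of switching to FAIR scheduling (the property names are from the Spark docs):

import org.apache.spark.{SparkConf, SparkContext}

// FIFO is the default; FAIR lets concurrent jobs share executors
val conf = new SparkConf()
  .setAppName("fair-demo")
  .setMaster("local[*]")
  .set("spark.scheduler.mode", "FAIR")
  // optionally: .set("spark.scheduler.allocation.file", "/path/to/fairscheduler.xml")

val sc = new SparkContext(conf)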
(while the Data Distribution table's columns are not)?
A bug? It at least looks a bit odd.
Pozdrawiam,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
Hi Todd,
It's used heavily for thread pool executors for one. Don't know about other
uses.
Jacek
On 23 May 2016 5:49 a.m., "Todd" wrote:
> Hi,
> In the spark code, guava maven dependency scope is provided, my question
> is, how spark depends on guava during runtime? I looked into the
> spark-as
Hi,
What happens when you create the parent directory /home/stuti? I think the
failure is due to missing parent directories. What's the OS?
Jacek
On 24 May 2016 11:27 a.m., "Stuti Awasthi" wrote:
Hi All,
I have 3 nodes Spark 1.6 Standalone mode cluster with 1 Master and 2
Slaves. Also Im not h
Hi Mathieu,
Thanks a lot for the answer! I did *not* know it's the driver that
creates the directory.
You said "standalone mode"; is this the case for the other modes,
YARN and Mesos, too?
p.s. Did you find it in the code or just experienced it before? #curious
Pozdrawiam,
Jacek Laskowski
On 25 May 2016 6:00 p.m., "Daniel Barclay"
wrote:
>
> Was the feature of displaying accumulators in the Spark UI implemented in
Spark 1.4.1, or was that added later?
Dunno, but only *named* *accumulators* are displayed in Spark’s webUI
(under the Stages tab for a given stage).
Jacek
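A minimal sketch of the distinction (using the modern API; around 1.4 it was sc.accumulator(0, "name")): only an accumulator created with a name shows up in the UI:

import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setMaster("local[*]").setAppName("acc-ui"))

// Named, so it appears on the stage page in the web UI
val evens = sc.longAccumulator("even numbers")

sc.parallelize(1 to 100).foreach { n =>
  if (n % 2 == 0) evens.add(1)
}
println(evens.value)  // 50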
sense to me".
Pozdrawiam,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
On Fri, May 27, 2016 at 3:42 AM, Yong Zhang wrote:
> That just makes sense, doesn't it?
>
Hi,
How do you start the Thrift server? What's your user name? I think it
takes the user and always runs as that user. I saw proxyUser in
spark-submit today; it may or may not be useful here.
Jacek
On 31 May 2016 10:01 a.m., "Radhika Kothari"
wrote:
Hi
Anyone knows about spark thrift server always take h
What's "With the help of UI"?
Pozdrawiam,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
On Tue, May 31, 2016 at 1:02 PM, Radhika Kothari
wrote:
> Hi,
>
lease confirm (or fix) my understanding before I file a JIRA issue. Thanks!
[1]
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala#L475-L476
Pozdrawiam,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark http://bit.ly/mastering-apache-spark
Hi,
Few things for closer examination:
* is yarn master URL accepted in 1.3? I thought it was only in later
releases. Since you're seeing the issue it seems it does work.
* I've never seen confs specified as a single string. Can you check in the
web UI that they're applied?
* what about this in
--executor-cores 1 to be exact.
Pozdrawiam,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
On Fri, Jun 3, 2016 at 12:28 AM, Mich Talebzadeh
wrote:
> interesting. a vm with
Hi,
"I am supposed to work with akka and Hadoop in building apps on top of
the data available in hadoop" <-- that's outside the topics covered in
this mailing list (unless you're going to use Spark, too).
Pozdrawiam,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
On Sun, Jun 5, 2016 at 9:01 PM, Ashok Kumar
wrote:
> Now I have added this
>
> libraryDependencies += "com.databricks" % "apps.twitter_classifier"
>
> However, I am getting an error
>
>
> error: No implicit for Append.Value[Seq[sbt.ModuleID],
> sbt.impl.GroupArtifactID] found,
> so sbt.impl.Gr
Hi,
What's the version of Spark? You're using Kafka 0.9.0.1, aren't you? What's
the topic name?
Jacek
On 7 Jun 2016 11:06 a.m., "Dominik Safaric"
wrote:
> As I am trying to integrate Kafka into Spark, the following exception
> occurs:
>
> org.apache.spark.SparkException: java.nio.channels.Closed
Hi,
It's not possible. YARN uses CPU and memory for resource constraints and
places the AM on any available node. Same for executors (unless data
locality constrains the placement).
Jacek
On 6 Jun 2016 1:54 a.m., "Saiph Kappa" wrote:
> Hi,
>
> In yarn-cluster mode, is there any way to specify on
On Tue, Jun 7, 2016 at 1:25 PM, Arun Patel wrote:
> Do we have any further updates on release date?
Nope :( And it's even quieter than I could have thought. I was so
certain that today's the date. Looks like Spark Summit has "consumed"
all the people behind 2.0...Can't believe no one (from the
Hi,
--master yarn-client is deprecated and you should use --master yarn
--deploy-mode client instead. There are two deploy-modes: client
(default) and cluster. See
http://spark.apache.org/docs/latest/cluster-overview.html.
Pozdrawiam,
Jacek Laskowski
https://medium.com/@jaceklaskowski
Finally, the PMC voice on the subject. Thanks a lot, Sean!
p.s. Given how much time it takes to ship 2.0 (with so many cool
features already baked in!) I'd vote for releasing a few more RCs
before 2.0 hits the shelves. I hope 2.0 is not Java 9 or Jigsaw ;-)
Pozdrawiam,
Jacek Lask
On Tue, Jun 7, 2016 at 3:25 PM, Sean Owen wrote:
> That's not any kind of authoritative statement, just my opinion and guess.
Oh, come on. You're not **a** Sean but **the** Sean (= a PMC member
and the JIRA/PRs keeper) so what you say **is** kinda official. Sorry.
But don't worry, the PMC (the gro
for web UI
that the console knows what happens under the covers (and can
calculate the stats).
BTW, spark.ui.port (default: 4040) controls the port the web UI binds to.
Pozdrawiam,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
onProtocol
object.
Pozdrawiam,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
On Tue, Jun 7, 2016 at 8:18 PM, Jacek Laskowski wrote:
> Hi,
>
> It is the driver - see the
Hi,
I'm not surprised to see Hadoop jars on the driver (yet I couldn't
explain exactly why they need to be there). I can't find a way now to
display the classpath for executors.
Pozdrawiam,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark http://bit.ly/mastering-apache-spark
et[Person] = [name: string, age: int]
scala> ds.as("a").joinWith(ds.as("b"), $"a.name" === $"b.name").show(false)
+--------+--------+
|_1      |_2      |
+--------+--------+
|[foo,42]|[foo,42]|
|[bar,24]|[bar,24]|
+--------+--------+
Pozdrawiam,
Jacek Laskowski