Hi Sanket,
Driver and executor logs are written to stdout by default; this can be
configured via the SPARK_HOME/conf/log4j.properties file. That file, along
with the entire SPARK_HOME/conf directory, is automatically propagated to
all driver and executor containers and mounted as a volume.
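As an illustration only (the property names follow standard log4j 1.x usage, which Spark's bundled conf/log4j.properties.template is based on), a minimal log4j.properties that keeps everything on stdout might look like:

```properties
# Send all logging to the console (i.e. stdout of the container)
log4j.rootCategory=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.out
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
```

Raising or lowering the rootCategory level here changes what every driver and executor container emits, since the same file is mounted everywhere.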
Thanks
On Mon, 9 Oct, 2023, 5:37
Hi Sanket, more details might help here.
What does your Spark configuration look like?
What exactly was done when this happened?
On Thu, 5 Oct, 2023, 2:29 pm Agrawal, Sanket,
wrote:
> Hello Everyone,
>
>
>
> We are trying to stream the changes in our Iceberg tables stored in AWS
> S3. We are ac
Hi Sachit,
The fix version on that JIRA says 3.0.2, so this fix is not yet released.
A 3.1.1 release is coming soon; in the meantime you can try out the
3.1.1-rc, which also has the fix, and let us know your findings.
Thanks,
On Mon, Feb 1, 2021 at 10:24 AM Sachit Murarka
wrote:
> Followin
A lot of developers may have already moved to 3.0.x. FYI, 3.1.0 is just
around the corner (hopefully in a few days) and has a lot of improvements to
Spark on K8s, including the transition from experimental to GA in this
release.
See: https://issues.apache.org/jira/browse/SPARK-33005
Than
ate.driver.serviceAccountName=spark-sa --conf
> spark.kubernetes.container.image=sparkpy local:///opt/spark/da/main.py
>
> Kind Regards,
> Sachit Murarka
>
>
> On Mon, Jan 4, 2021 at 5:46 PM Prashant Sharma
> wrote:
>
>> Hi Sachit,
>>
>> Can you give more details on how did you
Hi Sachit,
Can you give more details on how you ran it, i.e. the spark-submit command? My
guess is that a service account with sufficient privileges was not provided.
Please see:
http://spark.apache.org/docs/latest/running-on-kubernetes.html#rbac
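For reference, the RBAC section of that page sets up a service account along these lines (the names spark, spark-role, and the edit cluster role are the examples used in the docs; adjust the namespace to your own):

```
# Create a service account for the driver pod
kubectl create serviceaccount spark

# Allow it to create and manage executor pods in the default namespace
kubectl create clusterrolebinding spark-role --clusterrole=edit \
  --serviceaccount=default:spark --namespace=default

# Then point spark-submit at it:
#   --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark
```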
Thanks,
On Mon, Jan 4, 2021 at 5:27 PM Sachit Murarka
wro
-dev
Hi,
I have used Spark with HDFS encrypted with Hadoop KMS, and it worked well.
Somehow, I cannot recall whether Kubernetes was in the mix. From the error
alone, it is not clear what caused the failure. Can I reproduce
this somehow?
Thanks,
On Sat, Aug 15, 2020 at 7:18 PM Michel Su
Hi Ashika,
Hadoop 2.6 is no longer supported, and since it has not been maintained
in the last two years, it may have some unpatched security issues.
From Spark 3.0 onwards, we no longer support it; in other words, we have
modified our codebase in a way that Hadoop 2.6 won't work. However, i
Hi Ankur,
Java 11 support was added in Spark 3.0.
https://issues.apache.org/jira/browse/SPARK-24417
Thanks,
On Tue, Jul 14, 2020 at 6:12 PM Ankur Mittal
wrote:
> Hi,
>
> I am using Spark 2.X and need to execute Java 11. It's not able to execute
> Java 11 using Spark 2.X.
>
> Is there any way w
> scalable and dynamic-allocation-enabled for deploying Spark on K8s? Any
> suggested github repo or link?
>
>
>
> Thanks,
>
> Vaibhav V
>
>
>
>
>
> *From:* Prashant Sharma
> *Sent:* Friday, July 10, 2020 12:57 AM
> *To:* user@spark.apache.org
Hi,
Whether it is a blocker or not is up to you to decide. But Spark on K8s
supports dynamic allocation through a different mechanism, that
is, without using an external shuffle service.
https://issues.apache.org/jira/browse/SPARK-27963. There are pros and cons
of both approaches. The only
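If it helps, the shuffle-tracking flavour of dynamic allocation (the SPARK-27963 mechanism) is turned on with settings along these lines; property names are taken from the Spark 3.x configuration docs, and the executor bounds are example values:

```properties
spark.dynamicAllocation.enabled=true
spark.dynamicAllocation.shuffleTracking.enabled=true
spark.dynamicAllocation.minExecutors=1
spark.dynamicAllocation.maxExecutors=10
```

With shuffle tracking enabled, executors holding shuffle data are kept alive instead of relying on an external shuffle service.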
Hi,
My employer (IBM) is interested in hiring people in Hyderabad if they are
committers on any Apache project and are interested in Spark and its
ecosystem.
Thanks,
Prashant.
I have a Spark Streaming job which takes too long to delete temp RDDs. I
collect about 4MM telemetry metrics per minute and do minor aggregations in
the Streaming job.
I am using Amazon R4 instances. The driver RPC call, although async, I
believe, is slow getting the handle for the future object at "
Hi Darshan,
Did you try passing the config directly as an option, like this:
.option("kafka.sasl.jaas.config", saslConfig)
Where saslConfig can look like:
com.sun.security.auth.module.Krb5LoginModule required \
useKeyTab=true \
storeKey=true \
keyTab="/etc/security/key
Hi,
The goal of my benchmark is to arrive at an end-to-end latency lower than 100ms
and sustain it over time, by consuming from a Kafka topic and writing
back to another Kafka topic using Spark. Since the job does not do
aggregation and does constant-time processing on each message, it
appeared to me
+user -dev
Since the same hash-based partitioner is in action by default, in my
understanding the same partitioning will happen every time.
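As a rough illustration (plain Python mimicking Java's String.hashCode and a non-negative modulo, not Spark code itself), a hash partitioner's assignment is a pure function of the key, so repeated runs place the same key in the same partition:

```python
def java_string_hashcode(s):
    """Mimic java.lang.String.hashCode using 32-bit signed arithmetic."""
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h - 0x100000000 if h >= 0x80000000 else h

def partition(key, num_partitions):
    # HashPartitioner-style assignment: hashCode modulo numPartitions,
    # adjusted to be non-negative (Python's % already is, for a positive modulus).
    return java_string_hashcode(key) % num_partitions

# Deterministic: the same key always lands in the same partition.
assert partition("foo", 8) == partition("foo", 8)
```

Non-determinism only enters if the keys themselves change between runs, not from the partitioner.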
Thanks,
On Nov 10, 2016 7:13 PM, "WangJianfei"
wrote:
> Hi Devs:
> If i run sc.textFile(path,xxx) many times, will the elements be the
> same(same elemen
Hi Baahu,
That should not be a problem, given you allocate a sufficient buffer for
reading.
I was just working on a patch[1] to support reading
whole text files in SQL. This can actually be a slightly better
approach, because here we read into off-heap memory for holding data(
Since you are reading from a file stream, I would suggest that instead of
printing you try to save it to a file. There may be output the first time
and then no data in subsequent iterations.
Prashant Sharma
On Tue, Apr 26, 2016 at 7:40 PM, Ashutosh Kumar
wrote:
> I created a Streaming k means based
ormat.html
is one such formatter class.
thanks,
Prashant Sharma
On Wed, Apr 27, 2016 at 5:22 AM, Davies Liu wrote:
> hdfs://192.168.10.130:9000/dev/output/test already exists, so you need
> to remove it first.
>
> On Tue, Apr 26, 2016 at 5:28 AM, Luke Adolph wrote:
> > H
As far as I can understand, your requirements are pretty straightforward
and doable with just simple SQL queries. Take a look at Spark SQL in the
Spark documentation.
Prashant Sharma
On Tue, Apr 12, 2016 at 8:13 PM, Joe San wrote:
This can happen if the system time is not in sync. By default, streaming uses
SystemClock (it also supports ManualClock), which relies
on System.currentTimeMillis() for determining the start time.
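To make the clock dependence concrete, here is a small Python sketch (a paraphrase for illustration, not Spark's actual timer code) of aligning a batch start time up to the next period boundary; machines whose clocks disagree by more than one interval will pick different start times:

```python
def aligned_start(now_ms, period_ms):
    # Round the current clock reading up to the next period boundary,
    # the way a recurring batch timer picks its first trigger time.
    return (now_ms // period_ms + 1) * period_ms

# Two clock readings within the same interval agree on the start time...
assert aligned_start(10_200, 1_000) == aligned_start(10_900, 1_000) == 11_000
# ...but clocks out of sync by more than one interval do not.
assert aligned_start(10_200, 1_000) != aligned_start(12_300, 1_000)
```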
Prashant Sharma
On Sat, Apr 16, 2016 at 10:09 PM, Hemalatha A <
hemalatha.amru...@googlemail.com>
Maybe you can try creating it before running the app.
xml[1] messages.
Thanks,
Prashant Sharma
1. https://github.com/databricks/spark-xml
On Tue, Apr 19, 2016 at 10:31 AM, Deepak Sharma
wrote:
> Hi all,
> I am looking for an architecture to ingest 10 mils of messages in the
> micro batches of seconds.
> If anyone has worked on sim
*This is a known issue. *
https://issues.apache.org/jira/browse/SPARK-3200
Prashant Sharma
On Thu, Mar 3, 2016 at 9:01 AM, Rahul Palamuttam
wrote:
> Thank you Jeff.
>
> I have filed a JIRA under the following link :
>
> https://issues.apache.org/jira/browse/SPARK-13634
>
This is the JIRA I referred to:
https://issues.apache.org/jira/browse/SPARK-3256. Another reason for not
working on it is evaluating the priority between upgrading to Scala 2.11.5
(it is non-trivial, I suppose, because the repl has changed a bit) and
migrating that patch, which is much simpler.
Prashant Sharma
On
planning to work, I can help you?
Prashant Sharma
On Thu, Apr 9, 2015 at 3:08 PM, anakos wrote:
> Hi-
>
> I am having difficulty getting the 1.3.0 Spark shell to find an external
> jar. I have build Spark locally for Scala 2.11 and I am starting the REPL
> as follows:
>
Hi Folks,
We are trying to run the following code from the spark shell in a CDH 5.3
cluster running on RHEL 5.8.
*spark-shell --master yarn --deploy-mode client --num-executors 15
--executor-cores 6 --executor-memory 12G *
*import org.apache.spark.mllib.recommendation.ALS *
*import org.apache.spa
is just a warning. FYI, Spark ignores the BindException, probes
for the next available port, and continues. So your application is fine if that
particular error comes up.
Prashant Sharma
On Tue, Jan 20, 2015 at 10:30 AM, Deep Pradhan
wrote:
> Yes, I have increased the driver memory in sp
Looks like sbt/sbt -Pscala-2.11 is broken by a recent patch for improving
the Maven build.
Prashant Sharma
On Tue, Nov 18, 2014 at 12:57 PM, Prashant Sharma
wrote:
> It is safe in the sense that we would help you with the fix if you run into
> issues. I have used it, but since I worked on the
/patch-3/docs/building-spark.md
Prashant Sharma
On Tue, Nov 18, 2014 at 12:19 PM, Jianshi Huang
wrote:
> Any notable issues for using Scala 2.11? Is it stable now?
>
> Or can I use Scala 2.11 in my spark application and use Spark dist build
> with 2.10 ?
>
> I'm lookin
spray depends on, and use the Akka version that Spark depends on.
Prashant Sharma
On Wed, Oct 29, 2014 at 9:27 AM, Jianshi Huang
wrote:
> I'm using Spark built from HEAD, I think it uses modified Akka 2.3.4,
> right?
>
> Jianshi
>
> On Wed, Oct 29, 2014 at 5:53 AM, Mohammed Gulle
What is the motivation behind this ?
You can start with master as local[NO_OF_THREADS]. Reducing the threads in
all other places can have unexpected results. Take a look at this:
http://spark.apache.org/docs/latest/configuration.html.
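For example, with the standard local-mode master URL form (from the linked docs), capping Spark at four threads looks like:

```
spark-shell --master local[4]    # run locally with four worker threads
spark-shell --master local[*]    # or: one thread per available core
```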
Prashant Sharma
On Tue, Oct 28, 2014 at 2:08 PM, Wanda
Are you doing this in the REPL? There is a bug filed for this; I just
can't recall the bug ID at the moment.
Prashant Sharma
On Fri, Oct 24, 2014 at 4:07 AM, Niklas Wilcke <
1wil...@informatik.uni-hamburg.de> wrote:
> Hi Jao,
>
> I don't really know why this do
So if you need those features, you can go ahead and set up one of the
filesystem or ZooKeeper options. Please take a look at:
http://spark.apache.org/docs/latest/spark-standalone.html.
Prashant Sharma
On Wed, Oct 15, 2014 at 3:25 PM, Chitturi Padma <
learnings.chitt...@gmail.com> wrote:
&
[Removing dev lists]
You are absolutely correct about that.
Prashant Sharma
On Tue, Oct 14, 2014 at 5:03 PM, Priya Ch
wrote:
> Hi Spark users/experts,
>
> In Spark source code (Master.scala & Worker.scala), when registering the
> worker with master, I see the usage of *p
What is your Spark version? This was fixed, I suppose. Can you try it with
the latest release?
Prashant Sharma
On Fri, Sep 12, 2014 at 9:47 PM, Ramaraju Indukuri
wrote:
> This is only a problem in the shell, but it works fine in batch mode. I am
> also interested in how others are solvi
Hey,
You can use spark-shell -i sparkrc to do this.
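For illustration, a hypothetical sparkrc file for a spark-shell of that era could hold the common imports and setup (all names below are examples, not from the original mail):

```scala
// sparkrc: loaded on startup with `spark-shell -i sparkrc`
import org.apache.spark.SparkContext._       // pair-RDD operations
import sqlContext.implicits._                // sqlContext is provided by the shell

// hypothetical warm-up: cache a commonly used dataset
val events = sc.textFile("events.txt").cache()
```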
Prashant Sharma
On Wed, Sep 3, 2014 at 2:17 PM, Jianshi Huang
wrote:
> To make my shell experience merrier, I need to import several packages,
> and define implicit sparkContext and sqlContext.
>
> Is there a start
-framework/chill-akka) might help. I am not well
aware of how Kryo works internally; maybe someone else can throw some
light on this.
Prashant Sharma
On Sat, Jul 26, 2014 at 6:26 AM, Alan Ngai wrote:
> The stack trace was from running the Actor count sample directly, without
> a
s setup it is kinda fast to do either tag prediction
at a point, which is not accurate etc., but it's useful.
In case you are working on building this (inferior mode for the spark repl) for
us, I can come up with a wishlist.
Prashant Sharma
On Sat, Jul 26, 2014 at 3:07 AM, Andrei wrote:
> I have neve
Hi,
What is your ZeroMQ version? It is known to work well with 2.2.
An output of `sudo ldconfig -v | grep zmq` would be helpful in this regard.
Thanks
Prashant Sharma
On Wed, Jun 4, 2014 at 11:40 AM, Tobias Pfeiffer wrote:
> Hi,
>
> I am trying to use Spark Streaming (1.0.0) with Ze
I'd like to be corrected on this, but I am just trying to say small enough,
of the order of a few hundred MBs. Imagine the size gets shipped to all nodes;
it can be a GB but not GBs, and then it depends on the network too.
Prashant Sharma
On Fri, May 2, 2014 at 6:42 PM, Diana Carroll wrote:
> Any
I have pasted the link in my previous post.
Prashant Sharma
On Fri, May 2, 2014 at 4:15 PM, N.Venkata Naga Ravi wrote:
> Thanks for your quick reply.
>
> I tried with a fresh installation; it downloads sbt 0.12.4 only (please
> check below logs). So it is not working. Can you tel
%3DGJh1g2zxOJd02Wt7L06mCLjo-vwwG9Q%40mail.gmail.com%3E
Prashant Sharma
On Fri, May 2, 2014 at 3:56 PM, N.Venkata Naga Ravi wrote:
>
> Hi,
>
>
> I am trying to build Apache Spark with Java 8 on my Mac system (OS X
> 10.8.5), but am getting the following exception.
> Please help on resolving
Well, that is not going to be easy, simply because we depend on akka-zeromq
for ZeroMQ support. And since Akka does not support the latest ZeroMQ
library yet, I doubt there is anything simple that can be done to
support it.
Prashant Sharma
On Tue, Apr 29, 2014 at 2:44 PM, Francis.Hu wrote
zeromq 2.2.0, and if you
have the jzmq libraries installed, performance is much better.
Prashant Sharma
On Tue, Apr 29, 2014 at 12:29 PM, Francis.Hu
wrote:
> Hi, all
>
>
>
> I installed spark-0.9.1 and zeromq 4.0.1 , and then run below example:
>
>
It is the same file, and the Hadoop library that we use for splitting takes
care of assigning the right split to each node.
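As a sketch of why every node computes the same boundaries, here is the classic split-size rule from Hadoop's FileInputFormat, paraphrased in plain Python (an illustration under that assumption, not the actual Hadoop code):

```python
def split_size(total_size, requested_splits, block_size, min_size=1):
    # FileInputFormat: splitSize = max(minSize, min(goalSize, blockSize)),
    # where goalSize is the file size divided by the requested split count.
    goal = total_size // max(1, requested_splits)
    return max(min_size, min(goal, block_size))

# A 1000-byte file asked for 5 partitions yields 200-byte splits,
# deterministically, so each node is handed a consistent slice.
assert split_size(1000, 5, block_size=512) == 200
```

Because the computation depends only on the file size, requested split count, and block size, the assignment of splits to nodes is stable across runs.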
Prashant Sharma
On Thu, Apr 24, 2014 at 1:36 PM, Carter wrote:
> Thank you very much for your help Prashant.
>
> Sorry I still have another question about yo
Prashant Sharma
On Thu, Apr 24, 2014 at 12:15 PM, Carter wrote:
> Thanks Mayur.
>
> So without Hadoop and any other distributed file systems, by running:
> val doc = sc.textFile("/home/scalatest.txt",5)
> doc.count
> we can only get parallelization within
Hi Ishaaq,
answers inline from what I know; I'd like to be corrected though.
On Tue, Apr 15, 2014 at 5:58 PM, ishaaq wrote:
> Hi all,
> I am evaluating Spark to use here at my work.
>
> We have an existing Hadoop 1.x install which I planning to upgrade to
> Hadoop
> 2.3.
>
> This is not reall
for a
memory-friendly workload.
I think it would be good to post experiences, and then that can eventually
become some sort of guideline.
Prashant Sharma
On Thu, Apr 3, 2014 at 1:36 PM, Sonal Goyal wrote:
> Hi,
>
> My earlier email did not get any response, I am looking for some
&g
I think Mahout uses FuzzyKMeans, which is a different algorithm and is not
iterative.
Prashant Sharma
On Tue, Mar 25, 2014 at 6:50 PM, Egor Pahomov wrote:
> Hi, I'm running benchmark, which compares Mahout and SparkML. For now I
> have next results for k-means:
> Number of
Hi David,
There are many implementations of RDD available in org.apache.spark. All
you have to do is extend the RDD class. Of course, this is not possible from
Java, AFAIK.
Prashant Sharma
On Tue, Mar 11, 2014 at 1:00 AM, David Thomas wrote:
> Is there any guide available on creating a cus
You can enable debug logging for the repl; thankfully, it uses Spark's logging
framework. The trouble must be with the wrappers.
Prashant Sharma
On Fri, Feb 28, 2014 at 12:29 PM, Sampo Niskanen
wrote:
> Hi,
>
> Thanks for the pointers. I did get my code working within the normal
> spark-she