Hi Users,
I am setting the Spark configuration in the following way:
val spark = SparkSession.builder().appName(APP_NAME).getOrCreate()
spark.conf.set("spark.speculation", "false")
spark.conf.set("spark.broadcast.compress", "true")
spark.conf.set("spark.s
Ryan,
I agree that Hive 1.2.1 works reliably with Spark 2.x, but I went with the
current stable version of Hive, which is 2.0.1, and I am working with that. It
seems good, but I want to make sure which version of Hive is more
reliable with Spark 2.x, and I think @Ryan you replied the same, which
Chetan,
Spark is currently using Hive 1.2.1 to interact with the Metastore. Using
that version for Hive is going to be the most reliable, but the metastore
API doesn't change very often and we've found (from having different
versions as well) that older versions are mostly compatible. Some things
Hi,
I think that you can configure the Hive metastore version in Spark.
Regards,
Gourav
On Wed, Dec 28, 2016 at 12:22 PM, Chetan Khatri wrote:
> Hello Users / Developers,
>
> I am using Hive 2.0.1 with MySql as a Metastore, can you tell me which
> version is
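For reference, the knob Gourav mentions is exposed through the Spark SQL
options spark.sql.hive.metastore.version and spark.sql.hive.metastore.jars;
which Hive versions a given Spark release accepts is listed in its docs, so
the values below are only a sketch:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("metastore-example")                          // hypothetical app name
  .config("spark.sql.hive.metastore.version", "1.2.1")   // version of the metastore client
  .config("spark.sql.hive.metastore.jars", "builtin")    // or "maven" / an explicit classpath
  .enableHiveSupport()
  .getOrCreate()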
Hello Users / Developers,
I am using Hive 2.0.1 with MySQL as a Metastore; can you tell me which
version is more compatible with Spark 2.0.2?
Thanks
If you're changing properties for the SparkContext, then I believe you will
have to start a new SparkContext with the new properties.
On Wed, Aug 10, 2016 at 8:47 AM, Jestin Ma
wrote:
> If I run an application, for example with 3 joins:
>
> [join 1]
> [join 2]
> [join
If I run an application, for example with 3 joins:
[join 1]
[join 2]
[join 3]
[final join and save to disk]
Could I change Spark properties in between each join?
[join 1]
[change properties]
[join 2]
[change properties]
...
Or would I have to create a separate application with different
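As the reply above says, anything the SparkContext reads at start-up is
fixed once the context exists, but runtime SQL options can be changed on
the same session between jobs. A sketch (df1..df4 and the output path are
placeholders); note that the value which counts is the one in effect when
an action actually runs, so materialize a step if it should use its own
setting:

spark.conf.set("spark.sql.shuffle.partitions", "800")
val j1 = df1.join(df2, "id").cache()
j1.count()                                               // [join 1] runs with 800 shuffle partitions
spark.conf.set("spark.sql.shuffle.partitions", "200")    // [change properties]
val j2 = j1.join(df3, "id")                              // [join 2]
j2.join(df4, "id")                                       // [final join and save to disk]
  .write.parquet("/tmp/result")                          // runs with 200 shuffle partitions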
Thanks Steve,
For NN it all depends on how fast you want a start-up. 1GB of NameNode
memory accommodates around 42TB, so if you are talking about 100GB of NN
memory then SSD may make sense to speed up the start-up. RAID 10 is the
best one can get, assuming all internal disks.
In general it
Thank you for the info, Steve.
I have always believed (IMO) that there is an optimal position where one can
plot the projected NN memory (assuming 1GB --> 40TB of data) against the number
of nodes. For example, heuristically, how many nodes would be sufficient for
1PB of storage with nodes each having 512GB of
Hi Steve,
My argument has always been that if one is going to use Solid State Disks
(SSD), it makes sense to use them for the NN disks, for start-up from fsimage etc.
Obviously NN lives in memory. Would you also recommend RAID 10 (mirroring &
striping) for the NN disks?
Thanks
Dr Mich Talebzadeh
On 10 Mar 2016, at 22:15, Ashok Kumar
> wrote:
Hi,
We intend to use 5 servers which will be utilized for building Bigdata Hadoop
data warehouse system (not using any proprietary distribution like Hortonworks or
Cloudera or
Hi,
Bear in mind that you typically need 1GB of NameNode memory for 1 million
blocks. So if you have a 128MB block size, you can store 128 * 1E6 / (3
* 1024) = 41,666GB of data for every 1GB of NameNode memory. The number 3 comes
from the fact that each block is replicated three times. In other words, just under 42TB of
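A quick check of that arithmetic, following the same rule of thumb (1GB of
heap per ~1 million block objects, 128MB blocks, 3x replication):

// Sketch of the capacity estimate per 1GB of NameNode heap
val blocksPerGbOfHeap = 1000000L
val blockSizeMb       = 128L
val replication       = 3L
val gbPerGbOfHeap     = blocksPerGbOfHeap * blockSizeMb / (replication * 1024)
// gbPerGbOfHeap = 41666, i.e. just under 42TB of data per 1GB of NameNode memory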
Ashok,
The cluster nodes have enough memory but the CPU cores are few. 512GB / 16 = 32
GB, so for 1 core the cluster has 32GB of memory. Either there should be more
cores available to use the available memory efficiently, or don't
configure a higher executor memory, which will
cause a lot of GC.
Thanks,
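Purely as an illustration of that point (these numbers are not from the
thread): on a 16-core / 512GB node one might run, say, three 5-core
executors with modest heaps rather than a few huge ones, accepting that
most of the 512GB goes unused because cores, not memory, are the limit.
For example, in spark-defaults.conf:

# three executors per node -> 15 of 16 cores, ~84GB of 512GB used; smaller heaps mean less GC
spark.executor.cores    5
spark.executor.memory   28g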
Hi,
We intend to use 5 servers which will be utilized for building a Bigdata Hadoop
data warehouse system (not using any proprietary distribution like Hortonworks or
Cloudera or others). All servers' configurations are 512GB RAM, 30TB storage and 16
cores, Ubuntu Linux servers. Hadoop will be
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-configuration-file-metrics-properties-tp23985.html
(SQLConf.scala:283)
at
org.apache.spark.sql.SQLConf$$anonfun$getConf$1.apply(SQLConf.scala:283)
Am I retrieving the properties in the right way?
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Retrieving-Spark-Configuration-properties-tp23881.html
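The trace points at SQLConf.getConf, which throws when the key was never
set; supplying a default avoids that. Shown with the Spark 2.x session API
as a sketch (the older sqlContext.getConf(key, default) overload behaves
the same way):

val parts = spark.conf.get("spark.sql.shuffle.partitions", "200")  // falls back to "200" if unset
val all   = spark.conf.getAll                                       // everything currently set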
Hi guys: I added a parameter spark.worker.cleanup.appDataTtl 3 * 24 *
3600 in my conf/spark-default.conf, then I started my Spark cluster. However I
got an exception:
15/06/16 14:25:14 INFO util.Utils: Successfully started service 'sparkWorker'
on port 43344.
15/06/16 14:25:14 ERROR
I think you have to use 604800 instead of 7 * 24 * 3600; obviously
SparkConf will not do the multiplication for you.
The exception is quite obvious: Caused by: java.lang.NumberFormatException:
For input string: 3 * 24 * 3600
2015-06-16 14:52 GMT+08:00 luohui20...@sina.com:
Hi guys:
I
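As noted in the reply, the value has to be a plain number of seconds; for
the 3 days in the original question that is 259200 (7 days would be
604800). A sketch of the corrected entry:

# conf/spark-defaults.conf on the workers - no arithmetic, just seconds
spark.worker.cleanup.enabled       true
spark.worker.cleanup.appDataTtl    259200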
related to Cassandra, so I declare the SparkContext earlier
and then want to set this property at some later stage.
Any suggestion?
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/setting-spark-configuration-properties-problem-tp22764.html
Hi Experts,
I am new to Spark, so I want to manipulate it locally on my machine with
Ubuntu as the OS.
I downloaded the latest version of Spark.
I ran this command to start it: ./sbin/start-master.sh
but an error occurred:
starting org.apache.spark.deploy.master.Master, logging to
It sounds like you downloaded the source distribution perhaps, but
have not built it. That's what the message is telling you. See
http://spark.apache.org/docs/latest/building-spark.html
Or maybe you intended to get a binary distribution.
On Mon, Feb 23, 2015 at 10:40 PM, King sami
I guess you downloaded the source code.
You can build it with the following command:
mvn -DskipTests clean package
Or just download a compiled version.
Shlomi
On 24 Feb 2015, at 00:40, King sami kgsam...@gmail.com wrote:
Hi Experts,
I am new in Spark, so I want manipulate it locally
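Either reply comes down to the same two options; a sketch (the build is run
from the top of the Spark source tree, the start script from whichever
distribution you end up with):

# Option 1: build the source distribution you already downloaded
mvn -DskipTests clean package
# Option 2: download a pre-built ("bin") package instead, then simply
./sbin/start-master.sh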
.
You can read about it in my Blog Post
http://progexc.blogspot.co.il/2014/12/spark-configuration-mess-solved.html
--
Enjoy,
Demi Ben-Ari
Senior Software Engineer
Windward LTD.
From: vdiwakar.malladi <vdiwakar.mall...@gmail.com>
Sent: 05/12/2014 18:52
To: u...@spark.incubator.apache.org
Subject: Optimized spark configuration
Hi
Could anyone help with what would be a better / optimized configuration for driver
memory, worker memory
Hi there,
My production environment is AWS EMR with Hadoop 2.4.0 and Spark 1.0.2. I moved the
Spark configuration from SPARK_CLASSPATH to spark-defaults.conf, and then the
hiveContext went wrong.
I also found the WARN info "WARN DataNucleus.General: Plugin (Bundle)
org.datanucleus.store.rdbms is already
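SPARK_CLASSPATH was deprecated around that release in favour of explicit
classpath settings; a sketch of the usual replacement, where the jar path
is only a placeholder for whatever used to be on SPARK_CLASSPATH (e.g. the
metastore's JDBC driver):

# conf/spark-defaults.conf
spark.driver.extraClassPath     /path/to/mysql-connector-java.jar
spark.executor.extraClassPath   /path/to/mysql-connector-java.jar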
Hi All,
I am working with Spark to add new slaves automatically when there is more data
to be processed by the cluster. During this process a question has arisen:
after adding/removing a new slave node to/from the Spark cluster, do we need to
restart the master and the other existing slaves in the
Not really. You are better off using a cluster manager like Mesos or Yarn
for this.
Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi https://twitter.com/mayur_rustagi
On Tue, Jun 24, 2014 at 11:35 AM, Sirisha Devineni
sirisha_devin...@persistent.co.in wrote:
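If the cluster manager is YARN, dynamic allocation is the usual way to let
the executor count grow and shrink without touching cluster membership at
all; a sketch (values illustrative, and it needs the external shuffle
service on the nodes):

# conf/spark-defaults.conf
spark.dynamicAllocation.enabled        true
spark.shuffle.service.enabled          true
spark.dynamicAllocation.minExecutors   2
spark.dynamicAllocation.maxExecutors   50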