customized
hadoop jar and its related pom.xml to a nexus repository. See this link for
reference:
https://books.sonatype.com/nexus-book/reference/staging-deployment.html
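The deployment step above can be sketched with the maven deploy plugin; every coordinate, the repository id, and the URL below are hypothetical placeholders for your own setup:

```shell
# Deploy a locally built, customized hadoop jar together with its pom.xml
# to a hosted nexus repository (all values below are placeholders).
mvn deploy:deploy-file \
  -Dfile=hadoop-common-2.5.0-custom.jar \
  -DpomFile=pom.xml \
  -DgroupId=org.apache.hadoop \
  -DartifactId=hadoop-common \
  -Dversion=2.5.0-custom \
  -Dpackaging=jar \
  -DrepositoryId=my-nexus \
  -Durl=http://nexus.example.com/content/repositories/releases
```

The custom version string keeps the modified jar from shadowing the stock artifact in downstream builds.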
fightf...@163.com
From: Lu, Yingqi
Date: 2016-03-08 15:23
To: fightf...@163.com; user
Subject: RE: How to compile Spark
I think you can set up your own maven repository and deploy your modified
hadoop binary jar
under your modified version number. Then you can add your repository to the
spark pom.xml and use
mvn -Dhadoop.version=
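As a hedged sketch of the suggestion above, the repository entry one might add to spark's pom.xml could look like this; the id and url are hypothetical:

```xml
<!-- Hypothetical repository entry so the custom hadoop artifacts resolve. -->
<repository>
  <id>my-nexus</id>
  <url>http://nexus.example.com/content/repositories/releases</url>
</repository>
```

After that, a build against the custom artifact would pass the modified version number, e.g. `mvn -Dhadoop.version=2.5.0-custom -DskipTests clean package` (the version string here is an assumed example).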
fightf...@163.com
From: Lu, Yingqi
Date: 2016-03-08 15:09
To: user
I think this may be a permissions issue. Check your spark conf for the
hadoop-related settings.
fightf...@163.com
From: Arunkumar Pillai
Date: 2016-02-23 14:08
To: user
Subject: spark 1.6 Not able to start spark
Hi, when I try to start spark-shell
I'm getting the following error
Exception in thread "
Oh, thanks. Makes sense to me.
Best,
Sun.
fightf...@163.com
From: Takeshi Yamamuro
Date: 2016-02-04 16:01
To: fightf...@163.com
CC: user
Subject: Re: Re: About cache table performance in spark sql
Hi,
Parquet data is columnar and highly compressed, so the size of the deserialized
rows
Hi,
How can I clear the cache (i.e. execute a sql query without any caching) using
the spark sql cli?
Is there any command available?
Best,
Sun.
fightf...@163.com
Hi, Ted
Yes, I had seen that issue. But it seems that the spark-sql cli cannot run a
command like:
sqlContext.clearCache()
Is that right? In the spark-sql cli I can only run sql queries. So I want to
see if there
are any available options to achieve this.
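For reference, Spark SQL does expose cache control as sql statements, which the spark-sql cli can run; a hedged sketch (statement availability varies by Spark version, and the table name is a placeholder):

```sql
-- Cache a table, query it, then drop it from the cache again.
CACHE TABLE video_test;
SELECT COUNT(1) FROM video_test;
UNCACHE TABLE video_test;

-- Later Spark versions also accept a statement that drops every cached entry:
CLEAR CACHE;
```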
Best,
Sun.
fightf...@163.com
? From impala I get that the overall
parquet file size is about 24.59GB. Would be good to have some correction on
this.
Best,
Sun.
fightf...@163.com
From: Prabhu Joseph
Date: 2016-02-04 14:35
To: fightf...@163.com
CC: user
Subject: Re: About cache table performance in spark sql
Sun,
When
...@163.com
From: Ted Yu
Date: 2016-02-04 11:49
To: fightf...@163.com
CC: user
Subject: Re: Re: clear cache using spark sql cli
In spark-shell, I can do:
scala> sqlContext.clearCache()
Is that not the case for you ?
On Wed, Feb 3, 2016 at 7:35 PM, fightf...@163.com <fightf...@163.com>
age
cannot hold the 24.59GB+ table in memory. But why is the performance so
different and even so bad?
Best,
Sun.
fightf...@163.com
cessfully. Do I need to increase the partitions? Or are
there any other
alternatives I can choose to tune this?
Best,
Sun.
fightf...@163.com
From: fightf...@163.com
Date: 2016-01-20 15:06
To: 刘虓
CC: user
Subject: Re: Re: spark dataframe jdbc read/write using dbcp connection pool
Hi,
Thanks a lot
was 377,769 milliseconds
ago. The last packet sent successfully to the server was 377,790 milliseconds
ago.
Do I need to increase the partitions? Or shall I write a parquet file for each
partition in an iterative way?
Thanks a lot for your advice.
Best,
Sun.
fightf...@163.com
From: 刘虓
Date
1 in stage 0.0
(TID 2)
com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link
failure
fightf...@163.com
4")
The added_year column in the mysql table covers the range 1985-2015, and I pass
the numPartitions property
for partitioning purposes. Is this what you recommend? Can you advise a
little more on the implementation?
Best,
Sun.
fightf...@163.com
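A hedged sketch of the partitioned jdbc read being discussed, using the Spark 1.x DataFrameReader options; the jdbc url, credentials, and table name are placeholders:

```scala
// Partition the JDBC scan on the numeric added_year column so that each
// task reads one slice of the 1985-2015 range (placeholders throughout).
val df = sqlContext.read.format("jdbc").options(Map(
  "url"             -> "jdbc:mysql://db-host:3306/videodb?user=u&password=p",
  "dbtable"         -> "video",
  "partitionColumn" -> "added_year",
  "lowerBound"      -> "1985",
  "upperBound"      -> "2015",
  "numPartitions"   -> "30"
)).load()
df.registerTempTable("video_test")
```

With 30 partitions over a 30-year range, each task scans roughly one year of rows, which also bounds how long any single jdbc connection stays busy.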
From: 刘虓
Date: 2016-01-20 11:26
rTempTable("video_test")
sqlContext.sql("select count(1) from video_test").show()
Overall, the load process gets stuck with a connection timeout. The mysql
table holds about 100 million records.
Would be happy to provide more useful info.
Best,
Sun.
fightf...@163.com
Hi, Vivek M
I once tried the 1.5.x spark-cassandra connector and indeed encountered some
classpath issues, mainly with the guava dependency.
I believe that can be solved by some maven config, but I have not tried that yet.
Best,
Sun.
fightf...@163.com
From: vivek.meghanat...@wipro.com
Date
Hmm... I think you can do a df.map and store each column value into your list.
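A minimal sketch of that suggestion, assuming a string column whose name here ("colName") is a placeholder:

```scala
// Map each Row to the value of one column, then collect to a local array.
// In Spark 1.x, DataFrame.map yields an RDD, so collect() returns an Array.
val values: Array[String] =
  df.map(row => row.getAs[String]("colName")).collect()
```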
fightf...@163.com
From: zml张明磊
Date: 2015-12-25 15:33
To: user@spark.apache.org
CC: dev-subscr...@spark.apache.org
Subject: How can I get the column data based on a specific column name and then
store the data in an array
Agreed that the assembly jar is not good to publish. However, what he
really needs is to fetch
an updatable maven jar file.
fightf...@163.com
From: Mark Hamstra
Date: 2015-12-11 15:34
To: fightf...@163.com
CC: Xiaoyong Zhu; Jeff Zhang; user; Zhaomin Xu; Joe Zhang (SDE)
Subject: Re: RE
Using maven to download the assembly jar is fine. I would recommend deploying
this
assembly jar to your local maven repo, i.e. a nexus repo, or more likely a
snapshot repository
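A hedged sketch of registering the assembly jar in a repository; the coordinates, file name, and version are assumed examples:

```shell
# Install the downloaded assembly jar into the local maven repository so that
# other builds can depend on it (all coordinates are placeholders).
mvn install:install-file \
  -Dfile=spark-assembly-1.5.2-hadoop2.6.0.jar \
  -DgroupId=org.apache.spark \
  -DartifactId=spark-assembly \
  -Dversion=1.5.2 \
  -Dpackaging=jar
```

For a shared nexus/snapshot repository, `mvn deploy:deploy-file` with a `-Durl`/`-DrepositoryId` pair plays the same role.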
fightf...@163.com
From: Xiaoyong Zhu
Date: 2015-12-11 15:10
To: Jeff Zhang
CC: user@spark.apache.org; Zhaomin Xu
of
using this.
fightf...@163.com
From: censj
Date: 2015-12-09 15:44
To: fightf...@163.com
CC: user@spark.apache.org
Subject: Re: About Spark On Hbase
So, how do I get this jar? I use sbt to package the project; I did not find the lib.
On 2015-12-09 at 15:42, fightf...@163.com wrote:
I don't think it really needs CDH
and got the
daily distinct count. However, I am not sure whether this implementation is an
efficient workaround.
Hope someone can shed a little light on this.
Best,
Sun.
fightf...@163.com
I don't think it really needs the CDH component. Just use the API
fightf...@163.com
From: censj
Date: 2015-12-09 15:31
To: fightf...@163.com
CC: user@spark.apache.org
Subject: Re: About Spark On Hbase
But this depends on CDH. I did not install CDH.
On 2015-12-09 at 15:18, fightf...@163.com wrote:
Actually
Actually you can refer to https://github.com/cloudera-labs/SparkOnHBase
Also, HBASE-13992 already integrates that feature into the hbase side, but
that feature has not been released.
Best,
Sun.
fightf...@163.com
From: censj
Date: 2015-12-09 15:04
To: user@spark.apache.org
Subject: About
Well, sorry for the late response and thanks a lot for pointing out the clue.
fightf...@163.com
From: Akhil Das
Date: 2015-12-03 14:50
To: Sahil Sareen
CC: fightf...@163.com; user
Subject: Re: spark sql cli query results written to file ?
Oops 3 mins late. :)
Thanks
Best Regards
On Thu, Dec 3
Hi,
How can I save the results of queries run in the spark sql cli and write
them to some local file?
Is there any available command?
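One workable approach, sketched here with a placeholder query and paths: the spark-sql cli accepts -e/-f for non-interactive queries, whose stdout can be redirected to a file:

```shell
# Run a single query non-interactively and capture the output locally.
spark-sql -e "SELECT count(1) FROM video_test" > /tmp/result.txt

# Or keep the queries in a script file and run that instead.
spark-sql -f queries.sql > /tmp/result.txt
```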
Thanks,
Sun.
fightf...@163.com
and hive config; that would help to locate the root cause of
the problem.
Best,
Sun.
fightf...@163.com
From: Ashok Kumar
Date: 2015-12-01 18:54
To: user@spark.apache.org
Subject: New to Spark
Hi,
I am new to Spark.
I am trying to use spark-sql with SPARK CREATED and HIVE CREATED tables.
I have
Could you provide your hive-site.xml file info ?
Best,
Sun.
fightf...@163.com
From: Chandra Mohan, Ananda Vel Murugan
Date: 2015-11-27 17:04
To: fightf...@163.com; user
Subject: RE: error while creating HiveContext
Hi,
I verified and I could see hive-site.xml in spark conf directory
Hi,
I think you just need to put hive-site.xml in the spark/conf directory and
it will be loaded
onto the spark classpath.
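A sketch of that step, with placeholder paths:

```shell
# Copy the hive config into spark's conf directory so HiveContext picks it up.
cp /etc/hive/conf/hive-site.xml "$SPARK_HOME/conf/"
```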
Best,
Sun.
fightf...@163.com
From: Chandra Mohan, Ananda Vel Murugan
Date: 2015-11-27 15:04
To: user
Subject: error while creating HiveContext
Hi,
I am building
I think the exception info says clearly that you may be missing some tez-related
jar on the
spark thrift server classpath.
fightf...@163.com
From: DaeHyun Ryu
Date: 2015-11-11 14:47
To: user
Subject: Spark Thrift doesn't start
Hi folks,
I configured tez as the execution engine of Hive. After done
Hi,
Have you ever considered cassandra as a replacement? We now have almost the
same usage as your engine, e.g. using mysql to store
initial aggregated data. Can you share more about your kind of Cube queries?
We are very interested in that architecture too :)
Best,
Sun.
fightf...@163.com
for prompt response.
fightf...@163.com
From: tsh
Date: 2015-11-10 02:56
To: fightf...@163.com; user; dev
Subject: Re: OLAP query using spark dataframe with cassandra
Hi,
I'm in the same position right now: we are going to implement something like
OLAP BI + Machine Learning explorations on the same
-apache-cassandra-and-spark
fightf...@163.com
of olap architecture.
And we are happy to hear more use cases from this community.
Best,
Sun.
fightf...@163.com
From: Jörn Franke
Date: 2015-11-09 14:40
To: fightf...@163.com
CC: user; dev
Subject: Re: OLAP query using spark dataframe with cassandra
Is there any distributor supporting
Hi
I notice that you configured the following :
configuration.set("hbase.master", "192.168.1:6");
Did you mistype the host IP?
Best,
Sun.
fightf...@163.com
From: jinhong lu
Date: 2015-10-27 17:22
To: spark users
Subject: spark to hbase
Hi,
I write my result to hd
such progress ?
Best,
Sun.
fightf...@163.com
Gateway s3 rest api, agreed on such inconvenience and
some incompatibilities. However, we have not
yet researched or tested radosgw much. But we have some small
requirements for using the gateway in some use cases.
Hoping for more consideration and discussion.
Best,
Sun.
fightf...@163.com
From: Jerry
Hi, Sarath
Did you try to set and increase spark.executor.extraJavaOptions -XX:PermSize=
-XX:MaxPermSize=
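A hedged sketch of passing those options (the sizes are arbitrary examples, and the class/jar names are placeholders); note this only applies to pre-Java-8 JVMs, which still had a PermGen space:

```shell
# Raise the PermGen limits on executors and the driver.
spark-submit \
  --conf "spark.executor.extraJavaOptions=-XX:PermSize=256m -XX:MaxPermSize=512m" \
  --conf "spark.driver.extraJavaOptions=-XX:MaxPermSize=512m" \
  --class your.main.Class your-app.jar
```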
fightf...@163.com
From: Sarath Chandra
Date: 2015-07-29 17:39
To: user@spark.apache.org
Subject: PermGen Space Error
Dear All,
I'm using -
= Spark 1.2.0
= Hive 0.13.1
= Mesos
Hi, there
I tested with sqlContext.sql("select funcName(param1,param2,...) from tableName")
and it just worked fine.
Would you like to paste your test code here? And which version of Spark are you
using?
Best,
Sun.
fightf...@163.com
From: vinod kumar
Date: 2015-07-27 15:04
To: User
Subject
suggest you
first deploy a spark standalone cluster to run some integration tests; you
can also consider running spark on yarn for
later development use cases.
Best,
Sun.
fightf...@163.com
From: Jeetendra Gangele
Date: 2015-07-23 13:39
To: user
Subject: Re: Need help in setting
Hi, there
Which version are you using? Actually the problem seems to be gone after we
changed our spark version from 1.2.0 to 1.3.0.
Not sure which internal changes did it.
Best,
Sun.
fightf...@163.com
From: Night Wolf
Date: 2015-05-12 22:05
To: fightf...@163.com
CC: Patrick Wendell; user; dev
Hi, there
you may need to add :
import sqlContext.implicits._
Best,
Sun
fightf...@163.com
From: java8964
Date: 2015-04-03 10:15
To: user@spark.apache.org
Subject: Cannot run the example in the Spark 1.3.0 following the document
I tried to check out Spark SQL 1.3.0. I installed
Hi
Still no luck with your guide.
Best.
Sun.
fightf...@163.com
From: Yuri Makhno
Date: 2015-04-01 15:26
To: fightf...@163.com
CC: Taotao.Li; user
Subject: Re: Re: rdd.cache() not working ?
The cache() method returns a new RDD, so you have to use something like this:
val person
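The pattern being suggested can be sketched as follows (the hdfs path is a placeholder); whichever reference cache() hands back, the key point is that caching is lazy and only an action materializes it:

```scala
case class Person(id: Int, col1: String)

// Keep the reference returned by cache(), then run an action to populate it.
val person = sc.textFile("hdfs://namenode_host:8020/user/person.txt")
  .map(_.split(","))
  .map(p => Person(p(0).trim.toInt, p(1)))
  .cache()

person.count()  // first action materializes the cache
person.count()  // later actions read from the cached partitions
```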
Hi
That is exactly the issue. After running person.cache we then run person.count;
however, still no cached data shows up in the web ui storage tab.
Thanks,
Sun.
fightf...@163.com
From: Taotao.Li
Date: 2015-04-01 14:02
To: fightfate
CC: user
Subject: Re: rdd.cache() not working
sqlContext.cacheTable operation,
we can see the cache results. Not sure what's happening here. If anyone can
reproduce this issue, please let me know.
Thanks,
Sun
fightf...@163.com
From: Sean Owen
Date: 2015-04-01 15:54
To: Yuri Makhno
CC: fightf...@163.com; Taotao.Li; user
Subject: Re: Re
this for a little.
Best,
Sun.
case class Person(id: Int, col1: String)
val person =
sc.textFile("hdfs://namenode_host:8020/user/person.txt").map(_.split(",")).map(p
=> Person(p(0).trim.toInt, p(1)))
person.cache
person.count
fightf...@163.com
Thanks, Jerry.
I got it that way. Just want to make sure whether there is some option to directly
specify the tachyon version.
fightf...@163.com
From: Shao, Saisai
Date: 2015-03-16 11:10
To: fightf...@163.com
CC: user
Subject: RE: Building spark over specified tachyon
I think you could change
.
fightf...@163.com
Thanks haoyuan.
fightf...@163.com
From: Haoyuan Li
Date: 2015-03-16 12:59
To: fightf...@163.com
CC: Shao, Saisai; user
Subject: Re: RE: Building spark over specified tachyon
Here is a patch: https://github.com/apache/spark/pull/4867
On Sun, Mar 15, 2015 at 8:46 PM, fightf...@163.com fightf
Hi,
You may want to check your spark environment config in spark-env.sh,
specifically the SPARK_LOCAL_IP setting, and check whether you modified
that value, which may default to localhost.
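A minimal sketch of the setting in question, with a placeholder address:

```shell
# In conf/spark-env.sh: bind to the node's routable address instead of the
# default hostname resolution (the IP below is a placeholder).
export SPARK_LOCAL_IP=192.168.1.10
```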
Thanks,
Sun.
fightf...@163.com
From: sara mustafa
Date: 2015-03-14 15:13
To: user
Subject: deploying
Hi, there
You may want to check your hbase config.
e.g. the following property can be changed to /hbase
<property>
  <name>zookeeper.znode.parent</name>
  <value>/hbase-unsecure</value>
</property>
fightf...@163.com
From: HARIPRIYA AYYALASOMAYAJULA
Date: 2015-03-14 10:47
To: user
Hi,
You can first set up a scala ide to develop and debug your spark program,
say, intellij idea or eclipse.
Thanks,
Sun.
fightf...@163.com
From: Xi Shen
Date: 2015-03-06 09:19
To: user@spark.apache.org
Subject: Spark code development practice
Hi,
I am new to Spark. I see every
application? Does spark provide such configs for achieving that goal?
We know that this is tricky to get working. Just want to know how
this could be resolved, or through other possible channels
we did not cover.
Looking forward to your kind advice.
Thanks,
Sun.
fightf...@163.com
Hi,
We really have no adequate solution for this issue yet. Hoping for any available
analytical rules or hints.
Thanks,
Sun.
fightf...@163.com
From: fightf...@163.com
Date: 2015-02-09 11:56
To: user; dev
Subject: Re: Sort Shuffle performance issues about using AppendOnlyMap for
large data
for supporting modifying this ?
Very thanks,
fightf...@163.com
Hi,
Problem still exists. Would any experts take a look at this?
Thanks,
Sun.
fightf...@163.com
From: fightf...@163.com
Date: 2015-02-06 17:54
To: user; dev
Subject: Sort Shuffle performance issues about using AppendOnlyMap for large
data sets
Hi, all
Recently we had caught performance
Hi, Siddharth
You can rebuild spark with maven by specifying -Dhadoop.version=2.5.0
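A hedged sketch of that build command; in the Spark 1.x build, the hadoop-2.4 profile covered the 2.4+ hadoop line, so something like the following (adjust the profiles to your environment):

```shell
# Rebuild spark from source against hadoop 2.5.0.
mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.5.0 -DskipTests clean package
```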
Thanks,
Sun.
fightf...@163.com
From: Siddharth Ubale
Date: 2015-01-30 15:50
To: user@spark.apache.org
Subject: Hi: hadoop 2.5 for spark
Hi,
I am a beginner with Apache spark.
Can anyone let me know
)
List(kv)
}
Thanks,
Sun
fightf...@163.com
From: Jim Green
Date: 2015-01-28 04:44
To: Ted Yu
CC: user
Subject: Re: Bulk loading into hbase using saveAsNewAPIHadoopFile
I used the code below, and it still failed with the same error.
Does anyone have experience with bulk loading