Hi,
Is there a way to instantiate multiple Thrift servers on one Spark Cluster?
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Multiple-Thrift-servers-on-one-Spark-cluster-tp24148.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
Hi,
I just checked, and I can see that there is a method called withColumn:

def withColumn(colName: String, col: Column): DataFrame

http://spark.apache.org/docs/latest/api/scala/org/apache/spark/sql/Column.html
http://spark.apache.org/docs/latest/api/scala/org/apache/spark/sql/DataFrame.html
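For what it's worth, a quick usage sketch (the DataFrame and column names below are made up, assuming a Spark 1.3-style SQLContext; needs a running Spark setup):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object WithColumnExample {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("with-column-example"))
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    // Hypothetical two-column DataFrame
    val df = sc.parallelize(Seq(("alice", 30), ("bob", 25))).toDF("name", "age")
    // withColumn returns a new DataFrame with the extra column appended
    val withExtra = df.withColumn("ageDoubled", df("age") * 2)
    withExtra.show()
    sc.stop()
  }
}
```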
I think it uses a local dir; an HDFS dir path starts with hdfs://.
Check permissions on the folders, and also check the logs. There should be more info
about the exception.
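For illustration, a small sketch of the distinction (the paths and NameNode address here are hypothetical):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object PathSchemeExample {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("path-scheme-example"))
    // An explicit scheme removes any ambiguity about which filesystem is used:
    val localRdd = sc.textFile("file:///tmp/input.txt")           // local filesystem
    val hdfsRdd  = sc.textFile("hdfs://namenode:8020/user/data")  // HDFS
    // A bare path like "/user/data" is resolved against the default
    // filesystem configured for Hadoop, not necessarily HDFS.
    hdfsRdd.saveAsTextFile("hdfs://namenode:8020/user/output")
    sc.stop()
  }
}
```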
Best
Bojan
You can use toDebugString to see all the steps in the job.
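For example, a minimal sketch (the RDD here is made up):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object DebugStringExample {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("debug-string-example"))
    val rdd = sc.parallelize(1 to 100)
      .map(_ * 2)
      .filter(_ % 3 == 0)
    // toDebugString prints the RDD's lineage, one transformation per line,
    // so you can see every step that will run when an action is triggered.
    println(rdd.toDebugString)
    sc.stop()
  }
}
```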
Best
Bojan
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Caching-and-Actions-tp22418p22433.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
You could convert the DF to an RDD, then add the new column in the map phase or
in a join, and then convert back to a DF. I know this is not an elegant
solution, and maybe it is not a solution at all. :) But this is the first thing
that popped into my mind.
I am also new to the DF API.
Best
Bojan
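A rough sketch of that roundtrip, assuming a Spark 1.3-style SQLContext (the schema, column names, and helper name are made up):

```scala
import org.apache.spark.sql.{DataFrame, Row, SQLContext}
import org.apache.spark.sql.types.{IntegerType, StructField, StructType}

// Appends a computed column by dropping to the RDD level and rebuilding the DF.
// Assumes `df` has columns (name: String, age: Int) — hypothetical.
def addDoubledAge(sqlContext: SQLContext, df: DataFrame): DataFrame = {
  // Map phase: append the new value to each row
  val rows = df.rdd.map { row =>
    Row(row.getString(0), row.getInt(1), row.getInt(1) * 2)
  }
  // Extend the old schema with the new column, then convert back to a DataFrame
  val schema = StructType(
    df.schema.fields :+ StructField("ageDoubled", IntegerType, nullable = false))
  sqlContext.createDataFrame(rows, schema)
}
```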
On Apr 9, 2015 00:37,
Try building with scala 2.10.
Best
Bojan
On Mar 31, 2015 01:51, nightwolf [via Apache Spark User List]
ml-node+s1001560n22309...@n3.nabble.com wrote:
I am having the same problems. Did you find a fix?
Did you find any other way to handle this issue?
I just found out that I have a 22-column data set... And now I am searching
for the best solution.
Has anyone else experienced this problem?
Best
Bojan
I finally solved the issue with the Spark-Tableau connection.
Thanks Denny Lee for the blog post:
https://www.concur.com/blog/en-us/connect-tableau-to-sparksql
The solution was to use Authentication type Username, and then use the username
for the metastore.
Best regards
Bojan
Here is the link to the JIRA ticket: https://issues.apache.org/jira/browse/SPARK-4243
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/SQL-COUNT-DISTINCT-tp17818p18166.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
Hi Michael,
Thanks for the response. I did a test with the query that you sent me, and it
really is faster.
Old query, stats by phase: 3.2 min, 17 s.
Your query, stats by phase: 0.3 s, 16 s, 20 s.
But will this improvement also apply when you want to count distinct on 2
or more fields:
SELECT COUNT(f1),
While testing Spark SQL, I noticed that COUNT DISTINCT works really slowly.
The map-partitions phase finishes fast, but the collect phase is slow:
it runs on only a single executor.
Should it run this way?
And here is the simple code which I use for testing:
val sqlContext = new
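The snippet is cut off in the archive; a minimal sketch of what such a test might look like (the table and column names are made up, assuming a Spark 1.1-era SQLContext):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Hypothetical record type; case classes must be top-level for schema inference
case class Record(f1: String, f2: String)

object CountDistinctTest {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("count-distinct-test"))
    val sqlContext = new SQLContext(sc)
    import sqlContext.createSchemaRDD

    val records = sc.parallelize(Seq(Record("a", "x"), Record("b", "y"), Record("a", "x")))
    records.registerTempTable("records")

    // The slow phase in question: COUNT(DISTINCT ...) pulls the distinct
    // values together, which in older Spark SQL versions ran on one executor.
    sqlContext.sql("SELECT COUNT(DISTINCT f1) FROM records").collect().foreach(println)
    sc.stop()
  }
}
```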
I'm testing the beta driver from Databricks for Tableau.
And unfortunately I've encountered some issues.
While a beeline connection works without problems, Tableau can't connect to the
Spark thrift server.
Error from the driver (Tableau):
Unable to connect to the ODBC Data Source. Check that the necessary drivers
I use beta driver SQL ODBC from Databricks.
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Tableau-tp17720p17727.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.