Multiple Thrift servers on one Spark cluster

2015-08-06 Thread Bojan Kostic
Hi, is there a way to instantiate multiple Thrift servers on one Spark cluster? -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Multiple-Thrift-servers-on-one-Spark-cluster-tp24148.html Sent from the Apache Spark User List mailing list archive at

Re: Add row IDs column to data frame

2015-04-09 Thread Bojan Kostic
Hi, I just checked and I can see that there is a method called withColumn: def withColumn(colName: String, col: Column): DataFrame (see http://spark.apache.org/docs/latest/api/scala/org/apache/spark/sql/Column.html and http://spark.apache.org/docs/latest/api/scala/org/apache/spark/sql/DataFrame.html)
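
A minimal sketch of calling withColumn with the signature quoted above. The helper name and the column names "value" and "doubled" are assumptions for illustration and do not come from the thread:

```scala
import org.apache.spark.sql.DataFrame

object WithColumnSketch {
  // Hypothetical helper: appends a column named "doubled" computed from an
  // existing numeric column named "value". Both column names are assumed.
  def withDoubledValue(df: DataFrame): DataFrame =
    df.withColumn("doubled", df("value") * 2)
}
```

The second argument is a Column expression, so the new column here is derived from a column the DataFrame already has.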

Re: SQL can't not create Hive database

2015-04-09 Thread Bojan Kostic
I think it uses a local directory; an HDFS directory path starts with hdfs://. Check the permissions on the folders, and also check the logs. There should be more information about the exception. Best, Bojan

Re: Caching and Actions

2015-04-09 Thread Bojan Kostic
You can use toDebugString to see all the steps in the job. Best, Bojan -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Caching-and-Actions-tp22418p22433.html
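
A minimal sketch of the toDebugString suggestion, assuming a local master and a made-up pipeline; the object name, application name, and transformations are illustrative only:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object LineageSketch {
  def main(args: Array[String]): Unit = {
    // Local two-thread master, assumed here only so the sketch is self-contained.
    val conf = new SparkConf().setAppName("lineage-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // An arbitrary chain of transformations, marked for caching.
      val multiples = sc.parallelize(1 to 1000).map(_ * 2).filter(_ % 4 == 0).cache()

      // toDebugString describes the RDD and its dependencies; it does not run a job.
      println(multiples.toDebugString)

      // An action computes the RDD and fills the cache.
      multiples.count()

      // Printing the lineage again lets the two descriptions be compared.
      println(multiples.toDebugString)
    } finally {
      sc.stop()
    }
  }
}
```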

Re: Add row IDs column to data frame

2015-04-08 Thread Bojan Kostic
You could convert the DataFrame to an RDD, add the new column in a map phase or in a join, and then convert it back to a DataFrame. I know this is not an elegant solution, and maybe it is not a solution at all. :) But this is the first thing that popped into my mind. I am also new to the DataFrame API. Best, Bojan On Apr 9, 2015 00:37,
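
The round trip described above could be sketched roughly as follows. This is one possible reading, not the author's code: it swaps in zipWithIndex for the "map phase", and the helper name and the default column name "row_id" are assumptions:

```scala
import org.apache.spark.sql.{DataFrame, Row, SQLContext}
import org.apache.spark.sql.types.{LongType, StructField, StructType}

object RowIdSketch {
  // Hypothetical helper: DataFrame -> RDD[Row] -> DataFrame with one extra Long column.
  def withRowIds(sqlContext: SQLContext, df: DataFrame, idColumn: String = "row_id"): DataFrame = {
    // Pair every row with its index, then append the index as a trailing field.
    val rowsWithIds = df.rdd.zipWithIndex().map {
      case (row, id) => Row.fromSeq(row.toSeq :+ id)
    }
    // Extend the original schema with the new column so it matches the new rows.
    val schema = StructType(df.schema.fields :+ StructField(idColumn, LongType, nullable = false))
    sqlContext.createDataFrame(rowsWithIds, schema)
  }
}
```

Usage would be along the lines of `RowIdSketch.withRowIds(sqlContext, df)`, where `sqlContext` and `df` already exist in the calling code.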

Re: Spark 1.3 build with hive support fails

2015-03-30 Thread Bojan Kostic
Try building with Scala 2.10. Best, Bojan On Mar 31, 2015 01:51, nightwolf [via Apache Spark User List] wrote: I am having the same problems. Did you find a fix?

Re: Nested Case Classes (Found and Required Same)

2015-03-04 Thread Bojan Kostic
Did you find any other way around this issue? I just found out that I have a data set with 22 columns... And now I am searching for the best solution. Has anyone else experienced this problem? Best, Bojan

Re: Spark + Tableau

2014-11-11 Thread Bojan Kostic
I finally solved the issue with the Spark-Tableau connection. Thanks to Denny Lee for the blog post: https://www.concur.com/blog/en-us/connect-tableau-to-sparksql The solution was to use the authentication type Username, and then use the username for the metastore. Best regards, Bojan

Re: SQL COUNT DISTINCT

2014-11-05 Thread Bojan Kostic
Here is the link to the JIRA issue: https://issues.apache.org/jira/browse/SPARK-4243 -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/SQL-COUNT-DISTINCT-tp17818p18166.html

Re: SQL COUNT DISTINCT

2014-11-03 Thread Bojan Kostic
Hi Michael, thanks for the response. I tested the query that you sent me, and it is much faster. Old query stats by phase: 3.2 min, 17 s. Your query stats by phase: 0.3 s, 16 s, 20 s. But will this improvement also apply when you want to count distinct on 2 or more fields: SELECT COUNT(f1),

SQL COUNT DISTINCT

2014-10-31 Thread Bojan Kostic
While testing Spark SQL, I noticed that COUNT DISTINCT is really slow. The map partitions phase finishes fast, but the collect phase is slow. It runs on only a single executor. Is this expected? Here is the simple code that I use for testing: val sqlContext = new

Spark + Tableau

2014-10-30 Thread Bojan Kostic
I'm testing the beta driver from Databricks for Tableau, and unfortunately I have run into some issues. While the beeline connection works without problems, Tableau can't connect to the Spark Thrift server. Error from driver (Tableau): Unable to connect to the ODBC Data Source. Check that the necessary drivers

Re: Spark + Tableau

2014-10-30 Thread Bojan Kostic
I use the beta SQL ODBC driver from Databricks. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Tableau-tp17720p17727.html

Re: Spark + Tableau

2014-10-30 Thread Bojan Kostic
? On Thu, Oct 30, 2014 at 8:00 AM, Bojan Kostic blood9ra...@gmail.com wrote: I use beta driver SQL ODBC from Databricks.