This is probably because the current thrift-server implementation has `SparkContext` inside (see: https://github.com/apache/spark/blob/master/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLEnv.scala#L34 ). To support yarn-cluster, we would need to add a lot of functionality to deploy the thrift-server itself in a cluster, and it seems to me there are many technical issues around this.
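As a side note, the per-instance isolation Jeff suggests below (a separate SPARK_CONF_DIR, SPARK_LOG_DIR, SPARK_PID_DIR, and HIVE_SERVER2_THRIFT_PORT for each instance) could be sketched roughly like this. The instance names, directory paths, and port numbers are illustrative placeholders, not Spark defaults, and the launch command is echoed rather than executed so the sketch can be read without a Spark installation:

```shell
# Sketch: run two Spark Thrift Server instances on one machine by giving
# each one its own conf/log/pid directories and a distinct thrift port.
# All paths and ports below are made-up examples.

start_instance() {
  name="$1"   # instance name, e.g. instance1
  port="$2"   # distinct HIVE_SERVER2_THRIFT_PORT per instance

  # Echo the launch command instead of running it, so this stays a sketch.
  echo "SPARK_CONF_DIR=/etc/spark/${name}/conf" \
       "SPARK_LOG_DIR=/var/log/spark/${name}" \
       "SPARK_PID_DIR=/var/run/spark/${name}" \
       "HIVE_SERVER2_THRIFT_PORT=${port}" \
       '$SPARK_HOME/sbin/start-thriftserver.sh --master yarn'
}

start_instance instance1 10001
start_instance instance2 10002
```

The key point is that the pid file (which caused the "Stop it first" error), the logs, and the listening port must not collide between the two instances; separate SPARK_CONF_DIR values also let each instance carry its own spark-env.sh and spark-defaults.conf.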
// maropu

On Fri, Jul 1, 2016 at 1:38 PM, Egor Pahomov <pahomov.e...@gmail.com> wrote:

> What about yarn-cluster mode?
>
> 2016-07-01 11:24 GMT-07:00 Egor Pahomov <pahomov.e...@gmail.com>:
>
>> To separate bad users with bad queries from good users with good queries.
>> Spark does not provide any scope separation out of the box.
>>
>> 2016-07-01 11:12 GMT-07:00 Jeff Zhang <zjf...@gmail.com>:
>>
>>> I think so. Any reason you want to deploy multiple thrift servers on
>>> one machine?
>>>
>>> On Fri, Jul 1, 2016 at 10:59 AM, Egor Pahomov <pahomov.e...@gmail.com> wrote:
>>>
>>>> Takeshi, of course I used a different HIVE_SERVER2_THRIFT_PORT.
>>>> Jeff, thanks, I will try, but from your answer I'm getting the feeling
>>>> that I'm trying a very rare case?
>>>>
>>>> 2016-07-01 10:54 GMT-07:00 Jeff Zhang <zjf...@gmail.com>:
>>>>
>>>>> This is not a bug, because these 2 processes use the same
>>>>> SPARK_PID_DIR, which is /tmp by default. Although you can resolve this
>>>>> by using a different SPARK_PID_DIR, I suspect you would still have
>>>>> other issues like port conflicts. I would suggest you deploy one Spark
>>>>> Thrift Server per machine for now. If you stick to deploying multiple
>>>>> Spark Thrift Servers on one machine, then define a different
>>>>> SPARK_CONF_DIR, SPARK_LOG_DIR and SPARK_PID_DIR for your 2 instances.
>>>>> I'm not sure if there are other conflicts, but please try first.
>>>>>
>>>>> On Fri, Jul 1, 2016 at 10:47 AM, Egor Pahomov <pahomov.e...@gmail.com> wrote:
>>>>>
>>>>>> I get
>>>>>>
>>>>>> "org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 running as
>>>>>> process 28989. Stop it first."
>>>>>>
>>>>>> Is it a bug?
>>>>>>
>>>>>> 2016-07-01 10:10 GMT-07:00 Jeff Zhang <zjf...@gmail.com>:
>>>>>>
>>>>>>> I don't think the one-instance-per-machine limitation is true. As
>>>>>>> long as you resolve the conflicts such as the port, pid file, log
>>>>>>> file, etc., you can run multiple instances of the Spark Thrift
>>>>>>> Server.
>>>>>>>
>>>>>>> On Fri, Jul 1, 2016 at 9:32 AM, Egor Pahomov <pahomov.e...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi, I'm using the Spark Thrift JDBC server, and 2 limitations
>>>>>>>> really bother me:
>>>>>>>>
>>>>>>>> 1) One instance per machine
>>>>>>>> 2) Yarn-client only (not yarn-cluster)
>>>>>>>>
>>>>>>>> Are there any architectural reasons for these limitations? About
>>>>>>>> yarn-client I might understand in theory: the master is the same
>>>>>>>> process as the server, so it makes some sense, but it's really
>>>>>>>> inconvenient, since I need a lot of memory on my driver machine.
>>>>>>>> The reason for one instance per machine I do not understand.
>>>>>>>>
>>>>>>>> --
>>>>>>>> *Sincerely yours, Egor Pakhomov*
>>>>>>>
>>>>>>> --
>>>>>>> Best Regards
>>>>>>>
>>>>>>> Jeff Zhang

--
---
Takeshi Yamamuro