[ https://issues.apache.org/jira/browse/SPARK-9686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16217201#comment-16217201 ]

Andriy Kushnir edited comment on SPARK-9686 at 10/26/17 10:18 AM:
------------------------------------------------------------------

[~rxin], I did a little research into this error.
To invoke {{run()}} → {{runInternal()}} on any {{org.apache.hive.service.cli.operation.Operation}} (for example, {{GetSchemasOperation}}), we need an {{IMetaStoreClient}}. Currently it is obtained from the {{HiveSession}} instance:
{code:java}
public class GetSchemasOperation extends MetadataOperation {
    @Override
    public void runInternal() throws HiveSQLException {
        IMetaStoreClient metastoreClient = getParentSession().getMetaStoreClient();
    }
}
{code}

All opened {{HiveSession}} s are handled by the {{org.apache.hive.service.cli.session.SessionManager}} instance.
{{SessionManager}}, among others, implements the {{org.apache.hive.service.Service}} interface, and all {{Service}} s are initialized with the same Hive configuration:
{code:java}
public interface Service { 
    void init(HiveConf conf);
}
{code}
When {{org.apache.spark.sql.hive.thriftserver.HiveThriftServer2}} initializes, all {{org.apache.hive.service.CompositeService}} s receive the same {{HiveConf}}:

{code:java}
private[hive] class HiveThriftServer2(sqlContext: SQLContext)
  extends HiveServer2 with ReflectedCompositeService {
    override def init(hiveConf: HiveConf) {
        initCompositeService(hiveConf)
    }
}

object HiveThriftServer2 extends Logging {
    @DeveloperApi
    def startWithContext(sqlContext: SQLContext): Unit = {
        val server = new HiveThriftServer2(sqlContext)

        val executionHive = HiveUtils.newClientForExecution(
          sqlContext.sparkContext.conf,
          sqlContext.sessionState.newHadoopConf())

        server.init(executionHive.conf)
    }
}

{code}

So, {{HiveUtils#newClientForExecution()}} returns an implementation of {{IMetaStoreClient}} which *ALWAYS* points to the Derby metastore (see the docstrings and comments in {{org.apache.spark.sql.hive.HiveUtils#newTemporaryConfiguration()}}).
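
Roughly speaking, that temporary configuration pins the metastore connection to a throwaway local Derby database. A paraphrased sketch of the relevant settings (illustrative only, not the exact Spark source):
{code:java}
// Paraphrased sketch of the key properties produced by
// HiveUtils#newTemporaryConfiguration (illustrative, not the exact source):
// the JDO connection URL is hard-wired to a local, throwaway Derby database,
// so any metastore client built from this configuration never reaches the
// real metastore.
import java.nio.file.Files

val localMetastore = Files.createTempDirectory("spark").resolve("metastore")
val tempConf = Map(
  "hive.metastore.uris" -> "",  // no remote metastore
  "javax.jdo.option.ConnectionURL" ->
    s"jdbc:derby:;databaseName=$localMetastore;create=true")
{code}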

IMHO, to get correct metadata we need to additionally create another {{IMetaStoreClient}} with {{newClientForMetadata()}}, and pass its {{HiveConf}} to the underlying {{Service}} s.
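
For illustration, here's a rough, hypothetical sketch of that idea (untested; it assumes the metadata client exposes its {{HiveConf}} the same way the execution client does, which may not match the actual {{HiveClient}} API):
{code:java}
// Hypothetical sketch of the proposed fix, NOT the current Spark code:
// build a metastore-backed client in addition to the execution client and
// initialize the composite services with *its* HiveConf, so that metadata
// operations such as GetSchemasOperation see the real metastore.
object HiveThriftServer2 extends Logging {
    @DeveloperApi
    def startWithContext(sqlContext: SQLContext): Unit = {
        val server = new HiveThriftServer2(sqlContext)

        // Assumption: newClientForMetadata exposes a HiveConf that can be
        // passed to init(); the exact accessor may differ in practice.
        val metadataHive = HiveUtils.newClientForMetadata(
          sqlContext.sparkContext.conf,
          sqlContext.sessionState.newHadoopConf())

        server.init(metadataHive.conf)
    }
}
{code}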


> Spark Thrift server doesn't return correct JDBC metadata 
> ---------------------------------------------------------
>
>                 Key: SPARK-9686
>                 URL: https://issues.apache.org/jira/browse/SPARK-9686
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.4.0, 1.4.1, 1.5.0, 1.5.1, 1.5.2
>            Reporter: pin_zhang
>            Priority: Critical
>         Attachments: SPARK-9686.1.patch.txt
>
>
> 1. Start start-thriftserver.sh
> 2. Connect with beeline
> 3. Create a table
> 4. Run "show tables"; the newly created table is returned
> 5. Run the following JDBC code:
>       Class.forName("org.apache.hive.jdbc.HiveDriver");
>       String URL = "jdbc:hive2://localhost:10000/default";
>       Properties info = new Properties();
>       Connection conn = DriverManager.getConnection(URL, info);
>       ResultSet tables = conn.getMetaData().getTables(conn.getCatalog(),
>               null, null, null);
> Problem:
>       No tables are returned by this API; this worked in Spark 1.3.


