[
https://issues.apache.org/jira/browse/SPARK-9686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16217201#comment-16217201
]
Andriy Kushnir edited comment on SPARK-9686 at 10/26/17 10:18 AM:
------------------------------------------------------------------
[~rxin], I did a little research into this error.
To invoke {{run()}} → {{runInternal()}} on any
{{org.apache.hive.service.cli.operation.Operation}} (for example,
{{GetSchemasOperation}}), we need an {{IMetaStoreClient}}. Currently it is
taken from the {{HiveSession}} instance:
{code:java}
public class GetSchemasOperation extends MetadataOperation {
  @Override
  public void runInternal() throws HiveSQLException {
    // The client comes from the parent HiveSession, whose conf is
    // whatever the SessionManager was initialized with.
    IMetaStoreClient metastoreClient =
        getParentSession().getMetaStoreClient();
  }
}
{code}
All opened {{HiveSession}}s are managed by the
{{org.apache.hive.service.cli.session.SessionManager}} instance.
{{SessionManager}}, among others, implements the
{{org.apache.hive.service.Service}} interface, and all {{Service}}s are
initialized with the same Hive configuration:
{code:java}
public interface Service {
void init(HiveConf conf);
}
{code}
When {{org.apache.spark.sql.hive.thriftserver.HiveThriftServer2}} initializes,
all {{org.apache.hive.service.CompositeService}}s receive the same {{HiveConf}}:
{code:java}
private[hive] class HiveThriftServer2(sqlContext: SQLContext)
  extends HiveServer2
  with ReflectedCompositeService {

  override def init(hiveConf: HiveConf) {
    initCompositeService(hiveConf)
  }
}

object HiveThriftServer2 extends Logging {
  @DeveloperApi
  def startWithContext(sqlContext: SQLContext): Unit = {
    val server = new HiveThriftServer2(sqlContext)
    val executionHive = HiveUtils.newClientForExecution(
      sqlContext.sparkContext.conf,
      sqlContext.sessionState.newHadoopConf())
    server.init(executionHive.conf)
  }
}
{code}
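The conf propagation above can be shown with a toy example (hypothetical names, not Hive's actual classes): a composite service hands the single conf it was initialized with to every child, so whatever conf the server starts with is the one every session-level service ends up holding.

{code:java}
import java.util.ArrayList;
import java.util.List;

public class CompositeInitDemo {
    interface Service { void init(String conf); }

    static class ChildService implements Service {
        String conf;
        public void init(String conf) { this.conf = conf; }
    }

    static class CompositeService implements Service {
        final List<ChildService> children = new ArrayList<>();
        public void init(String conf) {
            // Every child receives the *same* conf; if it points at the
            // local Derby metastore, so will every session created later.
            for (ChildService c : children) c.init(conf);
        }
    }

    public static void main(String[] args) {
        CompositeService server = new CompositeService();
        ChildService sessionManager = new ChildService();
        server.children.add(sessionManager);
        server.init("derby-backed-conf");
        System.out.println(sessionManager.conf); // prints derby-backed-conf
    }
}
{code}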
So, {{HiveUtils#newClientForExecution()}} returns an implementation of
{{IMetaStoreClient}} which *ALWAYS* points to a local Derby metastore (see the
docstrings and comments in
{{org.apache.spark.sql.hive.HiveUtils#newTemporaryConfiguration()}}).
IMHO, to get correct metadata we additionally need to create another
{{IMetaStoreClient}} with {{newClientForMetadata()}} and pass its
{{HiveConf}} to the underlying {{Service}}s.
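A rough, untested sketch of that proposal (assuming {{HiveUtils.newClientForMetadata()}} keeps its current signature and that the metadata client exposes its {{conf}} the same way the execution client does; this is an illustration, not a verified patch):

{code:java}
object HiveThriftServer2 extends Logging {
  @DeveloperApi
  def startWithContext(sqlContext: SQLContext): Unit = {
    val server = new HiveThriftServer2(sqlContext)
    // Hypothetical change: build a metadata client, whose HiveConf points
    // at the configured (possibly remote) metastore rather than the local
    // Derby one, and initialize the composite services with that conf.
    val metadataHive = HiveUtils.newClientForMetadata(
      sqlContext.sparkContext.conf,
      sqlContext.sessionState.newHadoopConf())
    server.init(metadataHive.conf)
  }
}
{code}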
> Spark Thrift server doesn't return correct JDBC metadata
> ---------------------------------------------------------
>
> Key: SPARK-9686
> URL: https://issues.apache.org/jira/browse/SPARK-9686
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 1.4.0, 1.4.1, 1.5.0, 1.5.1, 1.5.2
> Reporter: pin_zhang
> Priority: Critical
> Attachments: SPARK-9686.1.patch.txt
>
>
> 1. Start start-thriftserver.sh
> 2. Connect with beeline
> 3. Create a table
> 4. Show tables; the newly created table is returned
> 5. Fetch the table list through JDBC metadata:
> {code:java}
> Class.forName("org.apache.hive.jdbc.HiveDriver");
> String URL = "jdbc:hive2://localhost:10000/default";
> Properties info = new Properties();
> Connection conn = DriverManager.getConnection(URL, info);
> ResultSet tables = conn.getMetaData().getTables(conn.getCatalog(),
>     null, null, null);
> {code}
> Problem:
> No tables are returned by this API; the same code worked in Spark 1.3.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)