Looking a bit deeper, I think this is because listTables actually retrieves all the metadata about each table, not just the table names, and metadata can be a fair amount of data.
Querying just the table names via SQL seems reasonably fast:

    spark.sql("show tables like 'pattern'")

since it returns only the table name and isTemporary.

On Wed, Jan 4, 2017 at 11:26 AM, Everett Anderson <ever...@nuna.com> wrote:

> Hi,
>
> In Spark 1.6.2, we were able to very quickly -- nearly instantly -- search
> through the list of (many) table names in our Hive metastore with
>
> sqlContext.tableNames().filter(_.matches("some regex")).foreach { println }
>
> In Spark 2.0.2, however, this takes forever. Similarly, queries with
> Catalog that should return a Dataset like
>
> spark.catalog.listTables("default")
>
> take forever. Setting the log level to DEBUG in the spark-shell, I can
> see the above command is scrolling through every table name in the
> metastore.
>
> Does anyone have a better way to quickly search through the metastore for
> table names matching a regexp in Spark 2?
>
> Thanks!
>
> - Everett
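
For reference, here is a rough sketch of the SQL-based approach combined with the regex filter from the original 1.6 snippet. It assumes a SparkSession is passed in (`spark` in the spark-shell) and that the SHOW TABLES result exposes a `tableName` column; `TableNameSearch`, `filterNames` and `tableNamesViaSql` are hypothetical names used only for illustration. Filtering is done on the driver with a plain Java regex, since the LIKE pattern of SHOW TABLES may not accept full regular expressions.

```scala
import org.apache.spark.sql.SparkSession

object TableNameSearch {
  // Pure helper: keep only the names that fully match the given Java regex,
  // mirroring the `_.matches(...)` filter used with sqlContext.tableNames().
  def filterNames(names: Seq[String], regex: String): Seq[String] =
    names.filter(_.matches(regex))

  // Fetch table names through SQL instead of spark.catalog.listTables.
  // Assumption: the SHOW TABLES result has a column called `tableName`.
  def tableNamesViaSql(spark: SparkSession, database: String): Seq[String] =
    spark.sql(s"SHOW TABLES IN $database")
      .select("tableName")
      .collect()
      .map(_.getString(0))
      .toSeq
}
```

In the spark-shell this could be used as:

    TableNameSearch.filterNames(
      TableNameSearch.tableNamesViaSql(spark, "default"), "some regex"
    ).foreach(println)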