Hi Younes.
When you have multiple users connected to Hive, or multiple applications trying 
to access shared memory, my recommendation would be to store it off-heap rather 
than on disk. Check out the RDD Persistence section of this link: 
http://spark.apache.org/docs/latest/programming-guide.html. You could also go 
for a disk-only option and store to HDFS, but that would cost you extra IO to 
read the data back for each computation. Using the thrift-server to cache is 
the same as using HDFS to store the cache, since Hive would also use HDFS to 
store it. Thanks
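For what it's worth, here is a minimal sketch of the off-heap option. It assumes Spark 1.6 with a running SparkContext `sc` and an external block store (Tachyon) configured, which OFF_HEAP requires in that release; the input path is hypothetical:

```scala
import org.apache.spark.storage.StorageLevel

// Hypothetical input; replace with your actual table's files.
val rdd = sc.textFile("hdfs:///path/to/data")

// OFF_HEAP keeps the cached blocks outside the JVM heap
// (backed by Tachyon in 1.6) instead of spilling to local disk.
rdd.persist(StorageLevel.OFF_HEAP)

// The first action materializes the cache; later actions reuse it.
rdd.count()
```

With DISK_ONLY you would pay a local-disk read on every reuse, and going back to plain HDFS adds a full remote read each time, which is the IO cost mentioned above.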
> On Feb 3, 2016, at 1:17 PM, Younes Naguib <younes.nag...@tritondigital.com> 
> wrote:
> 
> Hi all,
>  
> Since 1.6.0, low-latency queries are much slower.
> This seems to be connected to multi-user support in the thrift-server.
> On any newly created session, jobs are added to fill the session cache 
> with information about the tables it queries.
> Here are the details for this job:
> load at LocalCache.java:3599
> org.apache.spark.sql.hive.HiveMetastoreCatalog$$anon$1.load(HiveMetastoreCatalog.scala:124)
> org.spark-project.guava.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3599)
> org.spark-project.guava.cache.LocalCache$Segment.loadSync(LocalCache.java:2379)
> org.spark-project.guava.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2342)
> org.spark-project.guava.cache.LocalCache$Segment.get(LocalCache.java:2257)
> org.spark-project.guava.cache.LocalCache.get(LocalCache.java:4000)
> org.spark-project.guava.cache.LocalCache.getOrLoad(LocalCache.java:4004)
> org.spark-project.guava.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4874)
> org.spark-project.guava.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4880)
> org.spark-project.guava.cache.LocalCache$LocalLoadingCache.apply(LocalCache.java:4898)
> org.apache.spark.sql.hive.HiveMetastoreCatalog.lookupRelation(HiveMetastoreCatalog.scala:387)
> org.apache.spark.sql.hive.HiveContext$$anon$2.org$apache$spark$sql$catalyst$analysis$OverrideCatalog$$super$lookupRelation(HiveContext.scala:457)
> org.apache.spark.sql.catalyst.analysis.OverrideCatalog$class.lookupRelation(Catalog.scala:161)
> org.apache.spark.sql.hive.HiveContext$$anon$2.lookupRelation(HiveContext.scala:457)
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.getTable(Analyzer.scala:303)
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$9.applyOrElse(Analyzer.scala:315)
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$9.applyOrElse(Analyzer.scala:310)
> org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$resolveOperators$1.apply(LogicalPlan.scala:57)
> org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$resolveOperators$1.apply(LogicalPlan.scala:57)
> org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:53)
>  
> <image001.png>
>  
> Any way to cache this at the thrift-server instead, so it's reusable by all 
> sessions? Other than going back to single-user, of course :)
>  
> Thanks,
> Younes
