Hi all: I wonder whether there is a way to save a table so that joins against it can be optimized in a later job.
For example, suppose I do something like:

    val df = anotherDF.repartition($"id")  // anotherDF is some data frame
    df.registerTempTable("tableAlias")
    hiveContext.sql("INSERT INTO whse.someTable SELECT * FROM tableAlias")

Will the partition information (the "id" column) be stored in whse.someTable, so that when a second Spark job queries that table, the information can be used to optimize joins, for example? If this approach does not work, can you suggest one that does?

Thanks

-- Cesar Flores
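P.S. For concreteness, here is a minimal self-contained sketch of the full flow I have in mind (assuming the Spark 1.6-era HiveContext API; whse.sourceTable and the "id" column are just placeholders standing in for my real data):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    object RepartitionThenSave {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("RepartitionThenSave"))
        val hiveContext = new HiveContext(sc)
        import hiveContext.implicits._

        // Hypothetical source table standing in for "anotherDF".
        val anotherDF = hiveContext.table("whse.sourceTable")

        // Shuffle so that all rows sharing the same "id" land in one partition.
        val df = anotherDF.repartition($"id")

        // Persist the repartitioned rows into the target Hive table.
        df.registerTempTable("tableAlias")
        hiveContext.sql("INSERT INTO whse.someTable SELECT * FROM tableAlias")
      }
    }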