Thanks Michael. My bad regarding Hive table primary keys.
I have one big 140 GB HDFS file with an external Hive table defined on it. The
table is not partitioned. When I read the external Hive table using
sqlContext.sql, how does Spark decide the number of partitions to create for
that DataFrame?
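For a non-partitioned table backed by a single HDFS file, the number of input partitions roughly follows the number of HDFS block splits. A back-of-the-envelope sketch in plain Python (assuming the default 128 MB HDFS block size; check dfs.blocksize on your cluster, as older clusters often use 64 MB):

```python
import math

# Rough estimate of the input splits (hence DataFrame partitions)
# Spark would create when scanning one large HDFS file.
# Assumption: 128 MB HDFS block size (dfs.blocksize default).
file_size_gb = 140
block_size_mb = 128

num_splits = math.ceil(file_size_gb * 1024 / block_size_mb)
print(num_splits)  # 1120 splits for a 140 GB file
```

The actual count can differ slightly depending on the input format and on settings such as the minimum split size, but this gives the right order of magnitude.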
There is no such thing as primary keys in the Hive metastore, but Spark SQL
does support partitioned hive tables:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-PartitionedTables
DataFrameWriter also has a partitionBy method.
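For illustration, a minimal sketch of partitionBy on write (the output path and the "year" column are hypothetical placeholders; assumes a sqlContext is already available):

```python
# Sketch only: write a DataFrame out partitioned by a column.
# Each distinct value of "year" becomes its own output subdirectory,
# so later reads that filter on "year" can prune partitions.
df = sqlContext.sql("select * from hivetable1")
(df.write
   .partitionBy("year")
   .mode("overwrite")
   .parquet("/tmp/hivetable1_partitioned"))
```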
On Thu, Aug 20, 2015 at 7:29 AM,
Hi
I have a question regarding data frame partition. I read a hive table from
spark and following spark api converts it as DF.
test_df = sqlContext.sql("select * from hivetable1")
How does Spark decide the partitioning of test_df? Is there a way to partition
test_df based on some column while reading?
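As far as I know, sqlContext.sql offers no way to choose the partitioning at read time; the usual approach is to repartition after reading. A sketch, assuming a Spark version whose repartition accepts columns (the column name "id" is hypothetical):

```python
# Sketch: hash-partition the DataFrame on a column after reading.
test_df = sqlContext.sql("select * from hivetable1")

# 200 partitions, co-locating rows with the same (hypothetical) "id".
# Calling repartition with only columns instead falls back to the
# spark.sql.shuffle.partitions setting for the partition count.
repartitioned = test_df.repartition(200, "id")
print(repartitioned.rdd.getNumPartitions())
```

Note that this triggers a full shuffle of the 140 GB of data, so it only pays off if downstream operations (joins, aggregations on that column) reuse the partitioning.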