it is a good to be in control :) Dr Mich Talebzadeh
LinkedIn * https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>* http://talebzadehmich.wordpress.com On 14 June 2016 at 13:06, Patrick Duin <patd...@gmail.com> wrote: > Thanks, yes I have something similar working as "the alternative > solution". :) > > I was hoping to get away with not having to specify my schema so the > sqlContext.createExternalTable seemed like a nice clean approach. > > 2016-06-14 13:59 GMT+02:00 Mich Talebzadeh <mich.talebza...@gmail.com>: > >> Try this this will work >> >> sql("use test") >> sql("drop table if exists test.orctype") >> var sqltext: String = "" >> sqltext = """ >> CREATE EXTERNAL TABLE test.orctype( >> prod_id bigint, >> cust_id bigint, >> time_id timestamp, >> channel_id bigint, >> promo_id bigint, >> quantity_sold decimal(10,0), >> amount_sold decimal(10,0)) >> PARTITIONED BY ( >> year int, >> month int) >> CLUSTERED BY ( >> prod_id, >> cust_id, >> time_id, >> channel_id, >> promo_id) >> INTO 256 BUCKETS >> STORED AS ORC >> LOCATION '/tmp' >> TBLPROPERTIES ( "orc.compress"="SNAPPY", >> "orc.create.index"="true", >> >> "orc.bloom.filter.columns"="PROD_ID,CUST_ID,TIME_ID,CHANNEL_ID,PROMO_ID", >> "orc.bloom.filter.fpp"="0.05", >> "orc.stripe.size"="268435456", >> "orc.row.index.stride"="10000" ) >> """ >> sql(sqltext) >> sql("select count(1) from test.orctype").show >> >> res2: org.apache.spark.sql.DataFrame = [result: string] >> +---+ >> |_c0| >> +---+ >> | 0| >> +---+ >> >> HTH >> >> >> Dr Mich Talebzadeh >> >> >> >> LinkedIn * >> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw >> <https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>* >> >> >> >> http://talebzadehmich.wordpress.com >> >> >> >> On 14 June 2016 at 11:39, Patrick Duin <patd...@gmail.com> wrote: >> >>> Hi, >>> >>> I'm trying to use sqlContext.createExternalTable("my_table", >>> "/tmp/location/", "orc") to create tables. This is working fine for >>> non-partitioned tables. I'd like to create a partitioned table though, how >>> do I do that? >>> >>> Can I add some information in the options: Map[String, String] >>> parameter? >>> >>> Thanks, >>> Patrick >>> >> >> >