Thanks for coming back with the solution!
Sorry my suggestion did not help.
Daniel
On Wed, 20 Jun 2018, 21:46 mattl156 wrote:
> Alright so I figured it out.
>
> When reading from and writing to Hive metastore Parquet tables, Spark SQL
> will try to use its own Parquet support instead of Hive SerDe for better
> performance.
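
For reference, the behaviour quoted above is controlled by the
spark.sql.hive.convertMetastoreParquet setting; a minimal sketch of toggling
it (the session setup here is illustrative, not from the quoted message):

import org.apache.spark.sql.SparkSession

// Hive support is required for metastore-backed tables.
val spark = SparkSession.builder()
  .appName("hive-parquet")
  .enableHiveSupport()
  .getOrCreate()

// Default is true: Spark uses its built-in Parquet reader/writer for
// Hive metastore Parquet tables. Set to false to fall back to the Hive SerDe.
spark.conf.set("spark.sql.hive.convertMetastoreParquet", "false")
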
Hi Matt,
What I tend to do is partition by date in the following way:
s3://data-lake/pipeline1/extract_year=2018/extract_month=06/extract_day=20/file1.json
Note the key=value pattern: each directory level is a physical partition.
When you read that like this:
spark.read.json("s3://data-lake/pipeline1/")
it will automatically discover the partitions and add extract_year,
extract_month, and extract_day as columns to the resulting DataFrame.
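
A minimal sketch of that read, assuming the layout above (the bucket path and
column names are taken from the example):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("partitioned-read").getOrCreate()
import spark.implicits._

// Partition discovery turns the key=value directories into columns.
val df = spark.read.json("s3://data-lake/pipeline1/")
df.printSchema() // includes extract_year, extract_month, extract_day

// Filters on partition columns prune directories instead of scanning all files.
df.filter($"extract_year" === 2018 && $"extract_day" === 20).show()
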
Hi everyone,
I am trying to understand the behaviour of .as[SomeClass] (Dataset API):
Say I have a file with Users:
case class User(id: Int, name: String, address: String, date_add: java.sql.Date)
val users = sc.parallelize(Stream.fill(100)(User(0, "test", "Test Street",
  new java.sql.Date(0, 0, 1)))) // date values are illustrative
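
A sketch of what .as[User] does with data like this (the Parquet path and
session setup are assumptions for illustration, not from the original):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("dataset-as").getOrCreate()
import spark.implicits._

// Round-trip through Parquet: write as a DataFrame, read back untyped.
users.toDF().write.mode("overwrite").parquet("/tmp/users")

// .as[User] attaches an encoder on top of the DataFrame: columns are matched
// to the case class fields by name, and type compatibility is checked lazily
// at analysis time, not at the point where .as is called.
val ds = spark.read.parquet("/tmp/users").as[User]
ds.show(5)
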