Hi,
I have a program that loads a single avro file using spark SQL, queries it,
transforms it and then outputs the data. The file is loaded with:
val records = sqlContext.avroFile(filePath)
records.registerTempTable("data")
...
Now I want to run it over tens of thousands of Avro files,
like this:
person.map(r => (r.getInt(2), r)).take(4)
Is there any way to specify the column name ("user_id") instead of needing
to know/calculate the offset?
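One way to sidestep hard-coding the offset is to resolve it from the schema's field names once and reuse it. This is only a sketch: `fieldNames` below is a hypothetical stand-in for the field names you would read from the loaded Avro schema (or the SchemaRDD's schema), and the column layout is invented for illustration:

```scala
// Hypothetical field names standing in for the real Avro schema;
// with Spark you would obtain these from the loaded SchemaRDD's schema.
val fieldNames = Seq("name", "email", "user_id", "created_at")

// Resolve the column's offset by name instead of hard-coding getInt(2).
val userIdIdx = fieldNames.indexOf("user_id")
require(userIdIdx >= 0, s"user_id not found in schema: $fieldNames")

// With the real RDD this would become:
//   person.map(r => (r.getInt(userIdIdx), r)).take(4)
println(userIdIdx) // prints 2 for this invented layout
```

Alternatively, since the table is registered as "data", a plain SQL query selects columns by name, e.g. `sqlContext.sql("SELECT user_id FROM data")`.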
Thanks again
On Fri, Nov 21, 2014 at 11:48 AM, thomas j wrote:
Thanks for the pointer Michael.
I've downloaded spark 1.2.0 from
https://people.apache.org/~pwendell/spark-1.2.0-snapshot1/ and cloned and
built the spark-avro repo you linked to.
When I run it against the example avro file linked to in the documentation
it works. However, when I try to load my av